Wikipedia talk:Plagiarism/Archive 4
This is an archive of past discussions about Wikipedia:Plagiarism. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Length of paraphrased passages
We need a statement that makes clear that the length of a closely paraphrased passage has an effect on whether it is plagiarism or not. It is absolutely clear that this is still blatant plagiarism (and copyvio), despite the "paraphrasing". However, a close paraphrase of a single sentence from a source, used to present a succinct thought expressed in that source as part of a much longer article, would not raise concern if properly attributed.
For example, if Smith says, "The Nixon administration's attempt to deport Lennon was motivated by fear that his actions would disrupt the Republican convention", then I can jolly well write, "Smith notes that the Nixon administration tried to deport Lennon to prevent him from disrupting the Republican convention.(Smith, p. 12)" if my article has 50 different sources, and I only quote two or three sentences from Smith. What I can't do is string twenty paraphrased sentences together like this, based on a string of content-wise identical twenty sentences in Smith. Jayen466 16:19, 21 May 2009 (UTC)
- I'm not sure length is really the key determinant here. Your above example is a pretty clear rewrite, and also the wording in the original is relatively innocuous (i.e. it's extremely straightforward). Even very short prose which is more "flowery," so to speak, could however pose more of a problem if we copy too closely. So if the original had said, "Nixon, in his typically paranoid style, sought to deport John Lennon, for whom he had contempt as a peace, love, dope hippie," we could not say, "Nixon, acting typically paranoid, had contempt for John Lennon as a peace, love, dope hippie and sought to deport him." In my view at least that's plagiarism even though it's one sentence, because the original descriptions of Nixon as "typically paranoid" and Lennon as a "peace, love, dope hippie" are creative and unique, and we could not use that exact wording. To me that's more the issue than how long something is, and I definitely don't want to give the impression (and I'm sure you don't either) that plagiarism is okay if it's just a sentence. Usually it's pretty easy to rewrite even a short sentence such that we are far afield from plagiarism. --Bigtimepeace | talk | contribs 19:01, 21 May 2009 (UTC)
- Note that what I gave above is a "close paraphrase" of the sort that is castigated as inappropriate here. JN466 21:46, 22 May 2009 (UTC)
- What you gave above is attributed inline. "Smith notes...." There's nothing like that in the example you linked. It's a different situation altogether. --Moonriddengirl (talk) 01:23, 23 May 2009 (UTC)
- Attribution is not the only issue here. If I write, "Smith says ..." and then produce 5 paragraphs, in practically the same sequence as in Smith's work, and with individual sentences altered roughly to the same degree as in the example sentence above, then I think I would be committing plagiarism. Would you agree? JN466 09:49, 23 May 2009 (UTC)
- Yes, I would agree, but the attribution is still important. It is my experience (by which I mean WP:OR. I have no source to back me; I hope I don't have to find one, because I've got a lot to do today :)) that when you explicitly attribute, you can follow more closely in limited runs than you can if you don't. Plus, in the example in the close paraphrasing essay, you've got runs of identical text with single-word substitution and other issues: "is a small predatory carnivorous species of crepuscular mammal"; "is a small predatory meat-eating species of crepuscular mammal." But in addition to the notes made by User:Bigtimepeace above, there are other concerns about addressing length of material paraphrased. Even if one badly paraphrased sentence from a single source were not a serious concern for plagiarism, a whole slew of badly paraphrased sentences from a whole slew of sources could be. (I see a lot of material at WP:CP where contributors have cherry-picked sentences from multiple sources and sewn them together.). --Moonriddengirl (talk) 13:30, 23 May 2009 (UTC)
(outdent) Do you think the following roughly expresses what we are trying to get at?
- Close paraphrasing
Basing substantial sections of article text on close paraphrases of one or several sources is improper, even if you cite the sources. When writing articles, you should generally use your own words and sentences, without copying your sources' sentence and paragraph structure, diction and style. The knack is to describe the same facts without borrowing the creative expression of the source you are citing. Note however that close paraphrases may be appropriate in some limited contexts. A sentence stating that composer Smith was mainly influenced by the work of Jones, Brown, and White, cannot move far away from the original diction without losing precision. It will always bear a substantial similarity to the source wording. Another example is where an article summarises comments made by a critic or scholar who is explicitly named in the article. Here it makes sense to stay close to the commentator's actual words, so as to avoid the risk of misrepresentation. In fact, consider whether the reader would not be better served by a direct quotation, properly cited and enclosed in quotation marks. Even if you decide against a direct quotation, you should still mark any creative or original expressions that you want to reuse in quotation marks. And whether you use a direct quotation or a close paraphrase, be sure to attribute the material, using an in-line citation.
Thoughts? JN466 21:16, 24 May 2009 (UTC)
- I think any text suggesting that "substantial similarity" is acceptable is a real issue, since substantial similarity is the legal term used to define the point at which resemblance to another source crosses the line into copyright infringement. :D On the whole I like this, with some minor misgivings, but my question, really, is to what degree close paraphrasing needs to be explained here. It's a difficult concept to cover, and it's important not to inadvertently mislead people into thinking that we can closely paraphrase text just because we fear losing precision. We can't, if that text is highly creative. You say "limited contexts", but I promise you that the text plagiarism evaluators are going to have quoted back to them is "cannot move far away from the original diction without losing precision." (I hear that already, even in some pretty cut and dried cases.) I would think, since some people think that this is already too long, that sentence we have ("It can also be useful to do a direct comparison between cited sources and text within the article, to see if text has been plagiarized, including too-close paraphrasing of the original.") is probably sufficient, with further development at Wikipedia:Close paraphrasing, which can and (imho) should be cited in the "see also" just like the Wikipedia:Quotations essay (or even linked in the body). We have a lot more room at the close paraphrasing essay to give examples and explain more completely. --Moonriddengirl (talk) 21:33, 24 May 2009 (UTC)
Text-Based Copyright summary
I added a section about copyright, but this guideline is already very long. I'm actually thinking of proposing that summary as an official policy. I've been reading these copyright-related pages for quite some time and finding a summary of textual copyright policy was not easy. There are additional considerations that could be added to such a policy, see User:Moonriddengirl#Text based copyright concerns. For now, I'm just going to move the summary material to Wikipedia:Copyrights/Text, and let you decide what to do with it. This is quite bold, I realize, so I apologize if this is inappropriate, but I will just borrow layout from other proposals, please fix anything I do wrong. CopoCop (talk) 23:21, 25 May 2009 (UTC)
Plagiarism problems
Don't we need to set up a review mechanism, a wikipedia:Plagiarism problems page for reporting and addressing plagiarism problems, as is done for copyright issues at wikipedia:Copyright problems? It would investigate reported problems and could systematically look at contributions of selected users. I think we should start this now. This would build up a case history, and develop practices for dealing with this difficult issue and users who have long-held beliefs that are inconsistent with this new guideline. Of course, the administration of this review/correction process must be sympathetic and constructive until the guideline is much better known and understood and accepted. I think there will be plenty of business for this service. And, I think it has to be separate from the copyright problems service, which requires different expertise and addresses different issues. doncram (talk) 15:06, 22 May 2009 (UTC)
- I think it is extremely important, if we do so, that we make very clear that copyright infringement is a different matter...and a more imperative one. Plagiarism can be repaired slowly; probable copyright infringement must be repaired immediately or blanked. I've seen the term "plagiarism" used often when the actual issue is "copyright." I'll warn you, though, that you may have trouble recruiting enough eyes to check for plagiarism across full contribs. We've been trying to figure out how to do that at WP:COPYCLEAN for copyright violations. If we had a good process for that, some of this recent turmoil could have been avoided. --Moonriddengirl (talk) 15:14, 22 May 2009 (UTC)
- Sure, on most of what you say. Except I don't know what recent turmoil you refer to as having been avoidable. Even if we had an extensive, prompt, perfect process on dealing with copyright violations, there would still be a need for a separate service on addressing plagiarism problems, which is different (for example when there is extensive verbatim copying of public domain or GFDL text). Copyright violation enforcement has to stay focused on the legality of it and would not touch such situations. In other situations there will be both plagiarism and copyvio issues, and some coordination/referral/transfer process between the two services will need to be worked out. doncram (talk) 15:31, 22 May 2009 (UTC)
- Sorry. I wasn't trying to be obscure, just discreet. In the interest of that, I'll just drop it. I don't disagree that a board to examine plagiarism is a good idea. I just hope that it can be implemented in such a way that more imperative issues do not remain in publication while listed for examination because contributors do not recognize that potential copyright violation, rather than plagiarism, is what they're dealing with. Maybe it would be a good idea to specify that a plagiarism board is for free text? --Moonriddengirl (talk) 17:20, 22 May 2009 (UTC)
Hello, I suggest you call these things copy-paste problems. I think plagiarism is a very harsh word. I still think it is serious, but I would prefer not labeling it such. I'm impressed that there are people here that take this seriously. 217.189.255.152 (talk) 16:22, 22 May 2009 (UTC)
- Not all plagiarism problems are simply matters of copy-pasting, however. I don't think we want to set up two processes, one for obvious copy-paste problems (which would overlap with copyright issues, too), and one for less obvious plagiarism. I appreciate the thrust of your suggestion to avoid unduly antagonistic language, but I would try to take the extreme sting out of assertions that certain edits constitute plagiarism. The working definition of plagiarism that I prefer is: plagiarism is when less credit is given than is reasonably expected to be due. There's lots of gray, about how much credit is given, about how much is due, about what is reasonably expected by most readers, etc. In fact it's almost all gray, not black-and-white. I happen to think the academic world does a disservice to students and everyone else, by often making plagiarism out to be all black-and-white, and overly stigmatizing anyone who is caught up in accusations about it. Accusations are clubs to beat up on students or on other academics, and the accusations are usually debatable and/or hypocritical, from my experience. So I would rather just call it Plagiarism problems and be sensitive, however, in discussing the issues in each case that comes up. doncram (talk) 17:48, 22 May 2009 (UTC)
- Yes, you are right. That is a very sensible definition and probably the right approach. 217.189.255.152 (talk) 18:46, 22 May 2009 (UTC)
I think the idea of a noticeboard is a good one, as it indicates very clearly "this is the place to bring your concerns". That lets us separate the practical ongoing concerns and resolutions from the more philosophical issue of what this guideline should actually say. MRG has two very good points though: where will the eyeballs come from, and what about the overlap with the copyright noticeboard? On the first, I'll volunteer to at least kind of squint, I've already been trying to do that when concerns are brought here. Doncram and MRG are presumably on-board, though Moonie at least is already busy enough with copyright. Presumably too, we could recruit a few more people through publicity. On the second issue, copyvio is copyvio and has to go right away, no matter what board it's brought to. We already successfully piled on an editor just a few hours ago when concerns were brought here. Some of that was just straight copyvio, and it's being dealt with. More complex cases can be transferred over to the copyboard. There is still a risk that an individual problem will be dealt with as such and an editor with multiple violations will be missed, and that's a concern.
If there is to be a noticeboard, I suppose it will need to be worded as a proposal and raised at WP:VPR and maybe WP:CENT. Doncram, are you up for that? The exact name can be worked out at the proposal stage. An alternative would be to just extend the function of the copyright noticeboard to include plagio, though that runs the risk of diluting what the copyvio NB is all about. Franamax (talk) 01:36, 23 May 2009 (UTC)
- I can see advantages and disadvantages to combining. Advantages: there is often overlap between the two and cleaning them may sometimes take complementary skillsets. Putting them in one place eliminates accidental mislistings of one for the other and keeps us from dividing a small labor pool. Disadvantage: copyright issues must be prioritized, since these are a legal concern. When I first got to CP sometime last summer (I think), it had a multiple week backlog because there wasn't enough attention to it. Adding to the workload without adding to the workers is likely to create big problems. Plus, copyright infringement at this point is meant to be resolved by an administrator. Plagiarism can be resolved by anyone. Where I would think overlap would be helpful is in contribution checks. The CP board definitely doesn't have the resources to do extensive looking into other contributions by potential problem editors. Currently, we are listing these at WT:COPYCLEAN, but we have been trying to come up with a better process. It would be great if, in addition to article listings, there were a way to request contribution checks for plagiarism and/or copyvio concerns. Plagiarized articles could be listed for cleaning at the plagiarism board, while copyvio concerns could be blanked as usual with the {{copyvio}} template. --Moonriddengirl (talk) 02:16, 23 May 2009 (UTC)
- Yes i would be interested in participating. I would want to keep the scope small: to run a pilot project board for a while, perhaps for just a limited number of test cases, perhaps picking and choosing representative cases to deal with, rather than take on anything too overwhelming or boring (such as addressing thousands of past contributions by just one user). Like, it could strategically choose to address one or a few articles in, say, the historic sites area which i am familiar with, and hope/expect to bring on board members of relevant wikiprojects (wp:NRHP and wp:HSITES in that area) to extend the impact and to extend the discussion and learning as it applies to that subject area. And likewise take on cases in other subject areas in order to work with other groups of editors. I think that taking on a few cases would prove valuable in clarifying about what needs to be in the general guideline here. It will really test matters, when you are trying to explain to some editor that a certain practice is not acceptable, and when you find that the guideline we have in place doesn't really address that yet, or is ambiguous about it, and/or that we who are trying to run a board have disagreements among ourselves. Also it will identify gaps in subject-area or project-specific guidelines which local editors can then address. Also I think/hope that having a separate plagiarism problems board would in the end help ease the workload at the copyright problems board, by sometimes allowing referral/removal of matters best handled by the other, to the other. By the way, a seemingly toothless board would still have power that we bring to it, simply to attempt to persuade and to use existing tools like AfD and project-specific Talk areas. If a proposal needs to be made at VPP or elsewhere, please someone else go ahead and make it, and then I will participate in the discussion. doncram (talk) 05:51, 23 May 2009 (UTC)
We need to get this into perspective. Wikipedia is not an academic publication. An academic publication has named authors who are explicitly asserting that the publication is their original work and expecting credit from it. Academic publications in general are meant to contain original research. Wikipedia articles are completely different: they are prepared as a service to readers by an arbitrarily large number of editors, and OR is explicitly prohibited. What exactly is the problem that this anti-"plagiarism" campaign is meant to be solving? NBeale (talk) 07:06, 23 May 2009 (UTC)
- The exact problem is addressed above in the RfC which promoted this to guideline. The questions you raise about encyclopedic versus academic standards and about OR were also raised there. It has not yet archived, so those conversations are still visible at the top of the page if you are interested in how others felt and/or responded to such concerns. --Moonriddengirl (talk) 13:45, 23 May 2009 (UTC)
- Durova referred to this in her post above, under #Promotion to guideline, explaining why she felt it was time for this to have guideline status. JN466 15:56, 23 May 2009 (UTC)
(outdent) I'm starting up a draft "Plagiarism problems" page, for now at Wikipedia:Plagiarism/Plagiarism problems, modelling on the "Copyright problems" page. I suggest discussion of this page draft should be at its Talk page, Wikipedia talk:Plagiarism/Plagiarism problems. There are templates and instructions and lots else mechanical to set up before this could be considered ready for launch on a trial basis. I suggest that when it is mechanically ready, that discussion of launch should be held here at wt:Plagiarism with cross-posting to Village pump, to Talk of the Copyrights problems page, and elsewhere. Am certainly open to input on how this should be done, and am certainly aware that in any problem discussion huge amounts of gray area need to be acknowledged. But I do think that getting started is important, and the first few problems taken on will be valuable for clarifying our thinking in this plagiarism guideline and in supporting essays and so on. doncram (talk) 01:25, 12 June 2009 (UTC)
Readers being aware of who wrote what??
The sentence "Wikipedia needs proper attribution of those sources to ensure that readers are aware of who actually authored the material they see and read." is fundamentally misconceived, as I have noted above:
- Readers of Wikipedia are perfectly aware that the concept of "who actually authored the material" does not really apply to Wikipedia articles. All they know, or care about, is that at least one editor thought that the words they are reading should be inserted and either no editors thought they should be deleted or that if they did others disagreed and the current version has the words in.
- Most edits are done anonymously so in most cases readers have no idea who actually authored the material they see and read.
What readers do need is some way of seeing what the "reliable source" on which a statement is based actually is, and preferably (if it is on the web) some way of telling whether the source actually supports what is in the article or not. This, of course, has nothing to do with plagiarism per se.
If there is a plagiarism issue that is worth addressing in Wikipedia it is whether the author of the original words would reasonably be aggrieved if they knew they were quoted without specific attribution. This is an issue of judgement. Certainly quoting someone without giving the source is bad practice, but most authors are happy to have as many pointers to their work from other sources as possible. The key question seems to me to be whether the contribution of the author in question was sufficiently novel, striking and important for their name to be attached specifically to it. So if someone compiles a list of facts about someone for an obituary (say) that is probably not problematic, but if they formulated a striking new scientific theory then it would be wrong not to mention them. With many shades of grey in between. NBeale (talk) 14:31, 23 May 2009 (UTC)
- So I'd suggest we change that sentence to read "Wikipedia needs proper sourcing and attribution that is adequate in the circumstances". NBeale (talk) 14:34, 23 May 2009 (UTC)
- I think your suggested phrasing would not be helpful because it is unclear, perhaps meaningless. What does "adequate in the circumstances" mean? So I prefer the current phrasing. About your assertion that "Readers of Wikipedia are perfectly aware that the concept of 'who actually authored the material' does not really apply to Wikipedia articles," I don't think that is right. I expect that most readers who come to wikipedia expect that material at the wikipedia website is worded by wikipedia editors, except where explicit quotes indicate that wording is taken from some other explicit source. This is a basic minimum of what English language readers should expect of any source, whether it be a textbook, a newspaper article, or whatever. doncram (talk) 10:39, 24 May 2009 (UTC)
- I agree with Doncram. --Moonriddengirl (talk) 11:01, 24 May 2009 (UTC)
- Hi Doncram. Do you have any evidence that this is true of users of Wikipedia? After all, Wikipedia rubbed along until this month without a guideline on plagiarism. Do you think that users expected that your guideline would be followed even before it had been promulgated? Also, to say "the material is worded by wikipedia editors" is playing with words. In most cases the user has no idea of who these Wikipedia editors are (I am a fairly rare exception). Furthermore, in many cases there will be sentences in a wikipedia article that have not been written by any one editor. Wikipedia articles are not a textbook, a newspaper article, an academic work, or indeed an original work of any sort. What problem do you and the 26 editors who think this should be guideline believe you are trying to solve? NBeale (talk) 20:56, 24 May 2009 (UTC)
- Some comments: It's not playing with words to say that wikipedia is written/edited by its editors. I think it is helpful to consider this a collective of editors. Also, the group of editors present is a lot more representative of the core of active writers in wikipedia than you wish to believe. By the active core, I mean the ones who have developed 2500 featured articles, many good-rated articles and very many other good articles all with care to attribute credit where due, appropriately. Writing is what most writers in wikipedia do, and they do it for the most part consistently with general guidelines for writing anywhere. Historically, there certainly have been big campaigns to develop the wikipedia by pasting in material, such as for the 1911 Encyclopedia Britannica material, but since then there have been big campaigns to clean that material up. The GA and FA review processes are part of that. It has gradually become better understood that pasting in material with inadequate attribution is very costly of later editors' time and patience, and wikipedia's formal policies need to evolve. This plagiarism guideline is part of that, seeking to formalize policy that most have been operating by, already, and also to guide how raw material is added into wikipedia going forward. Certainly dealing with long-ago pasted in material is to be addressed differently than new paste-ins. doncram (talk) 21:53, 24 May 2009 (UTC)
- See [1]. This is both copyvio and plagiarism; articles should not be written this way. That is not to say that this guideline is perfect. JN466 21:10, 24 May 2009 (UTC)
- Except, of course, insofar as the whole "blockable offense" thing at WP:CP is concerned. When Fox News views plagiarism on Wikipedia as worthy of reporting, we have pretty good evidence that the public expects Wikipedia not to plagiarize. So does Jimbo Wales, according to what he told the press in 2006: that when we find a contributor who has plagiarized, we check every contribution that this contributor has made. To quote that Associated Press article, "Wales said plagiarism is always possible in a site that offers “wide-open editing ... but in general we take a very strong anti-plagiarism stance.” Any time plagiarism is brought to the site's attention, he said, Wikipedia administrators review all postings made by that author."[2]. That plagiarism is a concern to Wikipedia, one to be taken seriously and on which we should have a strong stance is by no means a new concept. Even before this particular scandal, Wales had a strong stance on plagiarism, saying almost a full year prior that "There is no need nor intention to be vindictive, but at the same time, we can not tolerate plagiarism. Let me say quite firmly that for me, the legal issues are important, but far far far more important are the moral issues. We want to be able, all of us, to point at Wikipedia and say: we made it ourselves, fair and square."(ANI, 12/2005) and "We need to deal with such activities with absolute harshness, no mercy, because this kind of plagiarism is 100% at odds with all of our core principles. All admins are invited to block any and all similar users on sight. Be bold. If someone takes you to ArbCom over it, have no fear. We must not tolerate plagiarism in the least." (AN, 12/2005). See also [3] for some of the other discussions on the two primary administrators' noticeboards. --Moonriddengirl (talk) 21:46, 24 May 2009 (UTC)
Enormous respect to Jimmy Wales, but this was nearly 4 years ago, long before this plagiarism guideline was developed, and it seems that what was being discussed then was copyright violation. NBeale (talk) 12:53, 30 May 2009 (UTC)
- You shall have to take that up with Jimmy Wales. Plagiarism and copyright infringement often go hand in hand, and many people do use the terms interchangeably. But whatever terminology Wales adopts, his language pretty clearly represents a different view than that you espouse with "Readers of Wikipedia are perfectly aware that the concept of 'who actually authored the material' does not really apply to Wikipedia articles." Wales, at least, wants to be able to say that we made it ourselves. In any event, you asked, "What problem do you and the 26 editors who think this should be guideline believe you are trying to solve?" There you go. That problem. It isn't new, and even if this guideline is only about a year old (if recently promoted), plagiarism has been addressed in Wikipedia policy for several years at WP:CP and in Wikipedia practice for many. --Moonriddengirl (talk) 13:15, 30 May 2009 (UTC)
- There is a legitimate point within what NBeale is saying. It's hard for me to pick that out, because NBeale has been throwing up so many irrelevant, invalid arguments as well (invalid in my view, e.g. it seems to me that NBeale incorrectly asserts IP editors' opinions about plagiarism are to be dismissed, that non-lawyers are to be dismissed, that only 26 persons want any guideline at all to express wikipedia practices formally, that his RL credentials matter, that there has never been any thought about plagiarism before in wikipedia, that "everyone" believes wikipedia is all pasted-in rubbish anyhow). The legitimate point that I see is that our understanding of plagiarism has evolved through discussion (mine certainly has), and the substance of what the plagiarism guideline says has been pushing along a bit further than what some would recognize. There is wide consensus that there should be a policy, some policy. But, for example, there is tension between many editors' views in previous discussions (the RfC type discussions at wp:Cite, cited previously here, which are themselves here and here) about the suitability of paste-in plus generic disclaimer template. There are extreme pro-paste-in views that are just wrong, out of synch with Jimmy Wales and the broad consensus of editors. There are centralist views, which Franamax here has represented, that paste-in is okay if proper attribution is given (me: but what is proper attribution?). There are more extreme academic-like views that paste-ins which are not put into quotes or blockquotes are morally heinous, and/or which completely conflate copyvio and plagiarism. (My current view is that the paste-ins of PD or GFDL text without using quotes are not terribly wrong morally, but they are hugely unhelpful, practically, to later editors who have to try to sort out what is the source for what, when cleaning up and developing the articles further.)
I think my own occasional edits in the plagiarism guideline have pushed towards "outlawing" the use of paste-ins plus attribution templates (or at least allowing or promoting more specific attribution practices), and that is the only correct way to go overall.
- However, I think this does get beyond what Jimmy Wales was talking about. And, it does tend towards saying that previously acceptable/honored practices in wikipedia are now to be outlawed. I think we should formulate some new RfC and publicise it widely, about a yet-to-be-written proposal that actually changes attribution / paste-in guidelines. I think NBeale's specific arguments and some of his specific practices would not be covered by this, but NBeale's general perception that there has been some movement is not wrong. doncram (talk) 22:22, 30 May 2009 (UTC)
- This was a very sensible and well thought out analysis of the situation, so kudos for that. Still, I have to disagree when you say that "I think my own occasional edits in the plagiarism guideline have pushed towards "outlawing" the use of paste-ins plus attribution templates (or at least allowing or promoting more specific attribution practices), and that is the only correct way to go overall." While your edits have certainly done that, I cannot agree that it is the only way to go. I personally believe that an attribution template together with an edit summary note and proper citation (to fulfill our verifiability policy) are more than enough to take care of any possible plagiarism concerns. Also, I don’t share your concerns about problems when further developing articles, to me that’s really a non issue. 189.105.3.53 (talk) 23:05, 30 May 2009 (UTC)
- Doncram - thanks for all this, but FWIW I do not hold the views you attributed to me. I don't think anyone's opinion is to be dismissed, nor that Wikipedia is pasted in rubbish, nor that there has never been any thought about X. I do think that a 26/6 vote is insufficient to assert that a guideline is "generally accepted" and that we cannot and should not act as if Wikipedia articles were academic papers. I don't think "ownership" and "claiming credit for" Wikipedia articles are sensible concepts, and I think we need to apply our policies with common sense. NBeale (talk) 06:34, 31 May 2009 (UTC)
Plagiarism by not citing sources
I think the following sentence disrupts the flow of the lead paragraph: "Plagiarism includes not citing sources for material beyond what is considered common knowledge within a specific field." I think it should be moved to a less prominent position in this guideline for two reasons:
- This is a non-issue on Wikipedia, because the verifiability policy goes far beyond normal academic standards in demanding citations.
- The idea is slightly more subtle: you are expected to cite the right source. If you discuss a certain concept, you cannot only cite your own recent fabulous treatment of it, but you have to cite the definite reference as well.
I suggest moving it into the definitions of plagiarism section. CopoCop (talk) 00:42, 25 May 2009 (UTC)
- Quick reply to myself. Actually, it doesn't really disrupt the flow; I just don't like it. For the above reasons. CopoCop (talk) 00:47, 25 May 2009 (UTC)
I've removed
Plagiarism occurs when anything other than "common knowledge within a specific field"[1] is used without citing the specific source. Plagiarized content can include facts, wording, ideas, and structure.
because:
- WP:V is already more demanding about citations, as it requires sources for anything that is not common knowledge irrespective of field, i.e. that is not known to non-specialists. WP:V is a policy and therefore overrides any guideline such as Wikipedia:Plagiarism.
- Such duplication of instructions is likely to cause disputes over interpretation because of slight differences in the wording, and subsequent editing may introduce actual inconsistencies.
- "Plagiarized content can include facts, wording, ideas, and structure" is - well, crazy:
- We cannot include facts unless they are presented in good sources, per WP:V. "Plagiarized content can include facts ..." would imply that we also cannot include facts that are presented in good sources. If this is accepted, we can't present anything at all, and WP should shut down.
- We cannot include ideas unless they are presented in good sources, per WP:V. "Plagiarized content can include ... ideas ..." would imply that we also cannot include ideas that are presented in good sources. If this is accepted, we would have only two options: exclude all explanations and implications of facts presented, and all theories, hypotheses, and conjectures, and all statements that are considered facts by specialists but which are not directly visible to the senses; or give up and walk away from WP. Excluding all statements except facts that are directly visible to the senses would eliminate most topics in science, history, literature and a host of other subjects. --Philcha (talk) 07:41, 31 May 2009 (UTC)
Some confusion and its implications for this guideline
Hi there all. A recent experience has me wondering whether this guideline perhaps needs to be clarified. There seems to be a fair amount of confusion on what "proper attribution" means; some editors seem to believe (wrongly, IMO) that this constitutes only citations, and not quotation marks. I know this is already discussed in the lead, but I think it might be useful to expand on this issue further in the body as a separate section. I'd advocate specifically calling out and expanding upon:
- Plagiarism – ideas Includes not properly citing sources for ideas
- Plagiarism – wording A subset of the above. Includes not properly attributing wording by not using quotation marks. These can be copyright violations if the sources are copyrighted. Of course then there's the issue of copyright violations that are not plagiarism. I.e. you can paste in an entire New York Times article, cite it, and put quotation marks around it. This wouldn't be plagiarism, but it would almost certainly be a copyright violation.
Just some food for thought. I'm unclear how expanding on the above would fit into the logical flow of this guideline though. BuddingJournalist 06:51, 31 May 2009 (UTC)
- I've just penned an essay at User:TwilligToves/On Plagiarism fleshing out my thoughts on this subject. Comments and criticisms welcome. The proposed section I described above would almost certainly have to be a more succinct version than what I laid out in the essay though. TwilligToves (talk [formerly BuddingJournalist]) 09:31, 31 May 2009 (UTC)
Remove guideline status until content agreed
The RfC on this topic explicitly said "The results of this RFC should be considered support for a Plagiarism guideline in general, not for a particular version of this page during or at the end of the RFC". In other words there is no community consensus that any version of Wikipedia:Plagiarism between 24 April 2009 and the present is the desired guideline.
The current content of Wikipedia:Plagiarism has some severe defects (even after I removed two sentences that would, if taken literally, be disastrous):
- lack of clarity about what meaning of "plagiarism" is intended. This affects how high the bar is set, and IMO the bar is currently set far too high:
- In academic literature, the main aim is to avoid priority wars. The focus is on discoveries, theories, hypotheses, conjectures, interpretations, etc. - i.e. content - rather than phrasing. However this is not applicable to WP, as academics are meant to do original research, which is forbidden on WP.
- In work submitted by students, the main aim is to make sure each student thinks for himself / herself, that grades accurately reflect the ability and industry of the student. There the bar is intentionally set very high, so that it includes expression as well as content, and forbids straight copying from the bright kid next door as well as from authoritative sources. Even with such well-defined objectives, there is no agreed definition of what constitutes plagiarism, not even in law schools, where one would expect rigorous definition to be most likely - see Can law students get away with plagiarism? (Times Online; October 15, 2008), which cites Plagiarism in UK law schools: is there a postcode lottery? (Bermingham et al, Assessment & Evaluation in Higher Education, Jan 2009); note especially "Staff had difficulty in defining plagiarism and the difference between what counted as a major and a minor offence" and the following paras. The problem is by no means new - see for example Perspectives on plagiarism and intellectual property in a postmodern world, ch "Confusion and Conflict about Plagiarism in Law Schools and Law Practice", pp. 195ff (1999). The lack of progress in the last 9-10 years indicates that plagiarism by students is a very difficult issue. Grading of coursework is not an objective of WP, so whatever standard is agreed, if ever, for students' assignments is irrelevant to WP.
- Legal, in order to reduce the risk of lawsuits against WP. This is very complex, see for example Julie Hilden's A Legal Remedy for Plagiarism? Rethinking The Ambrose And Goodwin Plagiarism Scandals. Issues considered included:
- Monetary theft. The article suggests that scooping an unpublished or little-known work may do monetary damage to the original author, but the impact on established sources is complex and ambiguous, for example "plagiarized usage that is nevertheless properly footnoted may even increase the value of the prior work".
- The issue of deceiving the reader is even less clear. Hilden comments on ghost-writing and says, "any reader who believes in the first place that every word he or she is reading came from his or her beloved author's pen is naive". Since WP openly says it is collaborative, no sensible reader will believe that the content of any WP article is the work of any specific person or persons.
- Elitism. The most conspicuous single instance is "Avoiding plagiarism requires mastery of citation and paraphrasing." The charge of elitism is no mere fancy on my part. Rebecca Howard discusses the point at length in Perspectives on plagiarism and intellectual property in a postmodern world, ch "The New Abolitionism comes to Plagiarism" pp. 87ff, basing her comments on Susan Miller's insistence that "the function of composition pedagogy is to perpetuate an intellectual class system". Such an approach may be at least partially justified in academia, where grading of students' work is important. However it has no place in WP, as it is likely to drive away the next generation of editors, causing WP to stagnate and die.
- The "how to detect plagiarism" advice is questionable. For example "Plagiarized text usually demonstrates a sudden change of style and tone from a writer's usual style; it is often more advanced in grammar and vocabulary" has two defects:
- WP is collaborative, and a sudden change is much more likely to reflect the contribution of a different editor. And, believe it or not, some editors can consciously use more than one style. For example in Howard Staunton I deliberately changed style at the end of a sentence, for ironic effect ("The London Chess Club, which had fallen out with Staunton and his colleagues, organized a tournament that was played a month later and had a multi-national set of players (many of whom had competed in Staunton's tournament), and the result was the same - Anderssen won").
- Who is to decide whether a passage is more advanced in grammar and vocabulary? By what criteria?
- Following Wikipedia:Make technical articles accessible, WP:NOT PAPERS and Wikipedia:Explain jargon will often cause changes in grammar and vocabulary, as the article moves from explanation of fundamental concepts in simple language to use of these concepts in more concise language.
- For something that purports to be a guideline, the current content of Wikipedia:Plagiarism offers a depressing catalogue of "thou shalt nots" but hardly any helpful advice, and its overall effect is to deter inexperienced or unconfident editors. Compare this with, for example, WP:RS and WP:MOS, which, for all their many faults, offer plenty of advice that can be understood and acted on by inexperienced or unconfident editors. In short, Wikipedia:Plagiarism currently fails to perform the functions of a guideline.
I could go on, and on, and on, ... Instead I'll cut to the chase:
- The RfC established consensus for the principle of a guideline on plagiarism, but not for any version of its content.
- The tone of the guideline is elitist and unhelpful and its content fails to perform the functions of a guideline.
- It is grossly under-researched and poorly thought out.
I have therefore removed the "guideline banner" from the top of Wikipedia:Plagiarism.
To make it a usable guideline, I suggest:
- Define "plagiarism" in the sense that WP editors should be concerned about. It may help to clarify this by including examples of what might be considered plagiarism in other contexts but not for WP's purposes.
- Do some serious research, discuss the relevance and implications of sources in this Talk page, and cite sources in the text of any version that is proposed as a working guideline. This is a complex subject, and any guideline on it needs to meet the standards of at least a good Good Article as well as demonstrate consensus.
- Think hard about the structure and presentation. For example it may be useful to start with an overview section that provides simply-expressed advice that covers over 90% of cases, and then get into the details.
- Start another RfC on the content, to ensure that the WP community agrees with the content and finds it intelligible and helpful. --Philcha (talk) 09:42, 31 May 2009 (UTC)
- Agree with removal of guidelines status for now. This warrants further discussion and wider input. JN466 10:16, 31 May 2009 (UTC)
- I do not agree. Start another RfC on the content if you please, but the RfC was properly publicized, the guideline was tagged as proposed for quite a long time, and I believe the same consensus process that was used to promote would be necessary to demote it. In other words, if demotion is necessary, I think that needs an RfC of its own. (I have restored the banner.) --Moonriddengirl (talk) 11:03, 31 May 2009 (UTC)
- +1 to Moonriddengirl's point. I am not satisfied with the current version of the guideline, and if an RfC were to be opened on the current version, I would probably oppose it as a guideline. However, the current version was promoted to a guideline by consensus; to remove this status necessitates gaining consensus as well, and thus another RfC. TwilligToves (talk) 11:12, 31 May 2009 (UTC)
- The RfC on this topic explicitly said, "The results of this RFC should be considered support for a Plagiarism guideline in general, not for a particular version of this page during or at the end of the RFC" --Philcha (talk) 11:25, 31 May 2009 (UTC)
- The RfC question (what people were !voting on) was "Promote this to guideline?" Consensus was "yes." The note you are quoting was added nearly a month later. --Moonriddengirl (talk) 11:32, 31 May 2009 (UTC)
- (e.c.) Huh. Well in that case, I think it might be good if Centrx were to comment on what s/he meant, and whether the guideline tag was actually mistakenly applied here. The text you're pointing out was added a month after the opening of the RFC. Did the users who !voted understand what they were actually supporting/opposing? Seems like most people took the RfC to mean promotion of a version similar to the one they read when they !voted, and either "supported" promotion, or "opposed" (many in this camp were open to some sort of guideline though, just not this one). And looking over the arguments and the !votes, it seems like consensus was with the supports. But I could be wrong. TwilligToves (talk) 11:39, 31 May 2009 (UTC)
- The RfC on this topic explicitly said, "The results of this RFC should be considered support for a Plagiarism guideline in general, not for a particular version of this page during or at the end of the RFC" --Philcha (talk) 11:25, 31 May 2009 (UTC)
- The "guideline" banner template includes the words "It is a generally accepted standard that editors should follow", i.e. claims its content reflects consensus. The RfC did not endorse the content. Jayen466 and I agree that Wikipedia:Plagiarism is not yet ready for guideline status. TwilligToves said, "if an RfC were to be opened on the current version, I would probably oppose it as a guideline". At present what little consensus there is in this discussion indicates that Wikipedia:Plagiarism does not have the necessary support to be regarded as a guideline.
- I am therefore adding {{disputedtag}} to the top of Wikipedia:Plagiarism per Wikipedia:Guideline#Proposing_change_to_guideline_or_policy_status --Philcha (talk) 11:41, 31 May 2009 (UTC)
- I have no problems with the disputed tag. The contents can use work. The contents of WP:CSD are revised all the time, and that's policy. More consensus and time is necessary to demote a guideline that has gone through proper procedures for implementation. However, I'm not sure why you duplicate your note about what the comment "explicitly said." That comment, as pointed out above, was added later and by another editor, not the one who closed the RfC. --Moonriddengirl (talk) 11:49, 31 May 2009 (UTC)
- The comment "The results of this RFC should be considered support for a Plagiarism guideline in general, not for a particular version of this page during or at the end of the RFC" was not opposed in any way. IMO it is a fair summary of the discussion, since a significant number of the "support" votes included comments like "Should Wikipedia have a guideline on plagiarism? Yes. Should this be it? No." According to one "support" voter, Franamax "took the lead on this in June 2008" - but in the RfC Franamax wrote "Strike my lack of objection, since I'm not happy with the extant text. I'm no longer convinced that this can be furthered with more success as an existing guideline, better to improve as a proposal and resubmit" (03:55, 29 April 2009). --Philcha (talk) 12:24, 31 May 2009 (UTC)
- My confusion was why you duplicated the comment (at 11:25 today; again at 11:41). I'm not sure if I should also duplicate my response to it. That one support voter changed his mind doesn't alter the fact that the majority didn't. --Moonriddengirl (talk) 12:28, 31 May 2009 (UTC)
- Noo, I support the promotion to guideline. [4]. The text has gotten a lot better since my oppose remark, and we need this as a reference for the people who think there's no such thing as plagiarism Franamax (talk) 16:24, 31 May 2009 (UTC)
- Missed that, as it was not actually part of the RfC discussion. I think the wording of your diff is significant - "Despite my misgivings about the wording, I agree that it's time to give this some weight and work out the details over time." To me that looks like "let's publish a guideline that's not fully baked." Editors who have been around for a while can handle that, if necessary by pointing to the difference between policies and guidelines. However new editors will just be totally confused, and a lot of the discussion on this Talk page has centred round inexperienced editors. Wikipedia:Plagiarism currently does not function as a guideline. --Philcha (talk) 16:51, 31 May 2009 (UTC)
I don't think that we should allow minor squabbles over fine points to distract us from the fact that this policy (and it should be a policy, not a guideline) is badly needed. --Piotr Konieczny aka Prokonsul Piotrus| talk 17:18, 6 June 2009 (UTC)
- Amen to that ;) EyeSerenetalk 14:23, 10 June 2009 (UTC)
- agreed — Ched : ? 14:30, 10 June 2009 (UTC)
- I'd imagine I could expand a bit on that. I think the "disputed" tag should be removed. I think this should be a guideline, if not outright policy. I think any less only confuses matters, and makes it more difficult for newer editors to understand that plagiarism is something that should be avoided and removed. Policy and guideline pages have discussions and tweaks added to them all the time, look at any talk page of almost any guideline or policy and you will see ongoing discussion of the wording. That doesn't mean the guideline is in dispute, only that people continually work to improve wording, and clarify the meaning of said guideline. — Ched : ? 14:41, 10 June 2009 (UTC)
- Then I suggest those in favour should develop a version that gets consensus here, then from a wider audience at an RfC. --Philcha (talk) 14:47, 10 June 2009 (UTC)
- We've already gotten consensus for the "guideline" status at RfC. At this point, the burden for consensus is on those wishing to remove it. If further discussion does not occur on the dispute, then removal of the "disputed" tag is fully appropriate. --Moonriddengirl (talk) 15:41, 10 June 2009 (UTC)
- It's unfortunate that the RfC was set up so badly. The structure should have been: (A) whether we need a guideline at all; (A 1) needed (numbered list for ease of counting); (A 2) not needed (numbered list); (A 3) neutral (numbered list); (A 4) comments (bullet list); (B) whether the then current version should be a guideline; (B 1) yes (numbered list); (B 2) no (numbered list); (B 3) neutral (numbered list); (B 4) comments (bullet list). Apart from the mechanics of counting, the flaws in the RfC are: uncertainty about whether supports or opposes are for only the principle or for both principle and content; I see at least 2 items labelled "comment" whose content I interpret as "support principle, oppose current version" (Kaldari, 18:27, 29 April 2009; Franamax 03:55, 29 April 2009). In addition doncram's position appears to have swung during the RfC from "both principle and content" to "support principle, oppose / neutral about current version" (09:51, 29 April 2009).
- The only sensible course is to discuss all criticisms offered both here and at the RfC, improve the text and then start a new RfC with a structure that makes the propositions and "votes" clear. --Philcha (talk) 16:26, 10 June 2009 (UTC)
- If you want to propose changes, reach consensus for them, and launch a second RfC to confirm, that's great (not that an RfC is necessary to change a guideline; even policies are frequently revised without them when consensus is clear). But the first one is closed, and pending clear consensus to demote, this is a guideline. The challenge itself doesn't demote it, and the label will not remain perpetually lacking any active, ongoing discussion. Otherwise, any editor could effectively demote any guideline or policy in this place simply by challenging it. --Moonriddengirl (talk) 16:52, 10 June 2009 (UTC)
- Seconded. Philcha, with all due respect, what you're saying largely adds up to "why didn't someone tell me about this before now?" Unfortunately, that's not how a wiki tends to work. People work in the areas which catch their interest and I don't mean to be sarcastic in saying I'm sorry that you didn't look at this guideline until now. Lots of other editors have, there are pages of archived discussions, the proposed guideline has been brought up at WP:ANI quite a few times over the last ten months and really, the overwhelming consensus is that we need this guideline.
- If you have specific issues, please do raise them here. I have a few quibbles myself, but those are subjects for further discussion, not asploding the entire page. Can we find a way to move forward on this? Franamax (talk) 05:58, 11 June 2009 (UTC)
Editing stuff in quotation marks
I have a question about the sentence "work marked as a quotation or paraphrase of another source (which can be edited as long as the original sense is not lost)". My understanding was that if something was a quote one could add {{sic}} but not correct it unless the typo was clearly a translation error. ϢereSpielChequers 15:24, 23 May 2009 (UTC)
- Thanks for pointing that out. I agree that it is confusing and/or not written clearly. I think it should be stated clearly what is acceptable to do within quotations, which I think should include adding {{sic}} type corrections and also adding wikilinks. Also it is acceptable to reword material that was covered by a quotation and then remove the quotation (while retaining reference to source for the ideas). I took a stab just now at revising that passage. doncram (talk) 10:32, 24 May 2009 (UTC)
- We also need to allow for useful omissions, insertions, and substitutions to be made if they are clearly marked, such as in the quotes at J. R. R. Tolkien#Retirement and old age, Health risks of professional dance#Injuries, and Gnosticism#The_main_features_of_gnosticism (2nd quote box). The new text suggested these are not acceptable. I've tried to address this, but I think going into all the details here will be overkill. -- Avenue (talk) 16:00, 24 May 2009 (UTC)
- I think it is fine to rephrase a quotation and remove the quotation marks, if it makes sense to do so. For example, a quote from the 1911 Britannica may be in too antiquated language. JN466 21:12, 24 May 2009 (UTC)
- Agreed that it's fine to rephrase a straight quote in certain situations. Also there are conventions for noting typos in originals and for inserting missing context. That gets into manual of style issues, though. DurovaCharge! 07:35, 25 May 2009 (UTC)
- And another example would be when the 1911EB information is just plain wrong, as proved by a newer source (and the same applies to copyright text in quotes also). The quoted text then should be either immediately contradicted in plaintext, or the quotes removed and the passage rewritten to combine the two sources. This is discussed a bit at Requirement to rewrite.
- Also I agree with Avenue on "useful" O,I&S's to be made, provided they are clearly marked and truly useful. Quoting text that said "This is clearly not a situation where an accusation of racism could be made." as "This is clearly [...] racism" would be a silly example of non-useful use of ellipsis. As Durova notes though, this is more of a style (and in my silly example, honesty) issue. Surely we already have some verbiage on how to properly incorporate text within quote marks? Indeed, we have WP:MOSQUOTE. Franamax (talk) 23:56, 25 May 2009 (UTC)
- If a statement from 1911EB is in quotes it should not be altered and should be word for word the same as 1911EB. If it is not within quotes (eg it forms part of the narrative voice of the article) then errors and archaic language should be fixed (with another cited source if the correction to the "error" is likely to be challenged). --PBS (talk) 16:20, 17 June 2009 (UTC)
Wikipedia guidance for paste-in at new article creation
When creating or editing an article earlier, i noticed that the message above the edit window gave a warning about pasting in material, something to the effect that you are welcome to copy in material if it is not copyright restricted. To me, the message is too broadly welcoming for material to be pasted in without proper attribution.
Trying to get that message now by starting to create a new MYTEST article, however, I don't get that message. Instead now i get:
* Before creating an article, please read Wikipedia:Your first article, or search for an existing article to which you can redirect this title.
* To experiment, please use the sandbox.
* When creating an article, provide references to reliable published sources. An article without references may quickly be deleted.
* You can also start your new article at Special:MyPage/MYTEST. You can develop the article, with less risk of deletion, ask other editors to help work on it, and move it into "article space" when it is ready.
Perhaps the message about paste-in is mixed in with others put up on a rotating basis, or perhaps it is context-specific somehow.
Can anyone else identify what is the message on paste-in that is given? Where, technically, are those messages handled? And, then I think we should work on a new guideline on paste-ins, probably to be located somewhere else (as not strictly a plagiarism issue), and a substitute message. doncram (talk) 20:05, 28 May 2009 (UTC)
- Yeah, definitely, there used to be something about copying in text, like one day ago when I was fooling around with page creation related to this VPT thread. You can find all (or most) of the system messages at Special:AllMessages (warning - it's a huge page). The new article notice is at MediaWiki:Newarticletext. All (or most of) the actual message texts are in the Mediawiki: space. What happened to that copy-warning is a bit of a mystery. We'll have to track down the particular system message and check its history to see if something changed recently. Franamax (talk) 22:51, 28 May 2009 (UTC)
- There's a message during any edit of any existing article, right below the Save button: Do not copy text from other websites without a GFDL-compatible license. It will be deleted. It always sounded ungrammatical to me; I am not sure if text is a singular or plural noun, or whether the license would be deleted, or what. But, I think the message should be modified to link to some policy on copy-pasting, not to the GFDL license. Also, it just seems incorrect to say you can copy GFDL text while implying that copying of public domain text is unacceptable. I think a specific short guideline about how to do copy-pasting is needed. doncram (talk) 15:38, 7 June 2009 (UTC)
- I'm inclined to suggest that the message is too detailed. While not strictly correct, the message probably ought to say, "Do not copy text from other websites. It will be deleted." Period. In general, copy-pasting is done incorrectly and should be avoided except by editors who understand very clearly what they are doing.
- The vast bulk of copy-pastes into Wikipedia are out-and-out copyright violations. Even in the (relatively rare) cases where a copy-paste comes from a GFDL-licensed source, clear identification of the authorship is the exception and not the norm; failing to comply with the terms of the GFDL takes us right back into copyvio territory. What's left as non-copyvio copying is often plagiarism of public domain sources, along with de minimis misappropriation of short passages of copyrighted works.
- The few editors who know how to incorporate external material into a published work – using no more material than necessary, with quotation marks and footnotes – are also sufficiently qualified to know when to ignore the 'do not copy text' provision. That said, if a suitable guideline for copy & paste is forthcoming, it would be a reasonable reference to link to in the future. TenOfAllTrades(talk) 16:19, 7 June 2009 (UTC)
- I have no idea if it was influenced by our discussion here, but the message below the edit window has now changed to the following (although links may go different places than where I am marking wikilinks):
Content that violates any copyrights will be deleted. Encyclopedic content must be verifiable. By saving, you agree to irrevocably release your contribution under the Creative Commons Attribution/Share-Alike License 3.0 and the GNU Free Documentation License. Re-users will be required to credit you in any medium, at minimum, through a hyperlink or URL to the article you are contributing to. See Terms of Use for details.
Remove disputed tag from article page
Should the "Disputed" tag be removed from the article page Wikipedia:Plagiarism?
Comments and discussion
- I believe that, with the overwhelming number of policies, guidelines, and essay pages that a new user is exposed to, the disputed tag furthers the confusion for an inexperienced Wikipedian. I find it possible if not likely that some new editors may view the disputed tag as an idea that would permit plagiarism, and I think it should be removed. That's not to say that the wording can't be improved or tweaked - that's exactly what these talk pages are for. — Ched : ? 05:39, 11 June 2009 (UTC)
Support removal of disputed tag
(I've numbered this for ease of use --Philcha (talk) 07:31, 11 June 2009 (UTC))
- Strong support per my statement above. — Ched : ? 05:37, 11 June 2009 (UTC)
- Since you failed to provide a "polls are evil" section, I shall drive my pig into this unworthy stall. MRG is correct though in her assertion that the promotion was legitimate and unless cogent and succinct arguments are put forth in support of the dispute and sustained, the "disputed" tag should perforce be removed. Franamax (talk) 06:03, 11 June 2009 (UTC)
- Support. The "disputed" tag reads "This page's designation as a policy or guideline is disputed or under discussion. Please see the relevant talk page discussion for further information." That message seems unhelpful. I think there is no valid dispute about whether the guideline is now a guideline (it is a guideline), and I think it is unhelpful to direct general readers to this Talk page, which provides no clear statement of any dispute. Philcha has some legitimate points about ways to improve the wording of the guideline, but disputing whether it is a guideline or not is distraction from stating and working on those. doncram (talk) 11:24, 11 June 2009 (UTC)
- Support. Unless there is active, ongoing discussion and efforts to resolve, then the tag is inappropriate. Every policy and guideline has flaws and every policy and guideline has those who oppose it. While its initial placement was appropriate, there must be some active effort to move forward with that view by the individual or individuals who dispute it. It doesn't hang around perpetually. --Moonriddengirl (talk) 11:41, 11 June 2009 (UTC)
- Well, sure. Any quibbles I might have with the extant document deal more with matters of style than with content. Given the previous endorsement of guideline status at RfC (whatever confusion there may be about it), and that demands to reject guideline status are coming from a vocal but ultimately very small minority, there's no need for the tag right now. TenOfAllTrades(talk) 13:27, 11 June 2009 (UTC)
- Support per above. 'Disputed' is for use in article space regarding factual information. DurovaCharge! 15:43, 11 June 2009 (UTC)
- Support. Nobody is saying this policy is perfect as it is, but then, all of our policies are slowly changing and evolving. This is good enough to join them. --Piotr Konieczny aka Prokonsul Piotrus| talk 17:13, 15 June 2009 (UTC)
- Support. If you want to contest the guideline status, take it to RfC, don't template war. MLauba (talk) 22:00, 19 June 2009 (UTC)
Oppose removal of disputed tag
- The April 2009 RfC was so badly designed that it's hard to see what, if anything, got consensus: some "votes" appear to be "support the principle but not the current content"; the content was significantly modified part-way through the RfC. In addition the current version of this Talk page shows that the current version of Wikipedia:Plagiarism has many flaws and ambiguities. No answer has been given to criticisms that Wikipedia:Plagiarism fails to state its objectives, fails to define "plagiarism" (a fuzzy concept in the real world), fails to take account of the difficulties encountered by real-world lawyers when dealing with alleged plagiarism, is elitist, gives plenty of "thou shalt not"s but hardly any constructive guidance (especially on WP:NOR's insistence that we represent sources as accurately as possible); and its advice on "how to detect plagiarism" is questionable. If a national parliament introduced a law that was so half-baked, there would be chaos. --Philcha (talk) 07:30, 11 June 2009 (UTC)
- I believe I noted above that it used to have a definition of plagiarism. Now it does again: the one that it had when the guideline was proposed. As to your other problems, I find "fails to take account of the difficulties encountered by real-world lawyers when dealing with alleged plagiarism" a bit vague, but perhaps you can be more specific. --Moonriddengirl (talk) 11:46, 11 June 2009 (UTC)
- Check the sources in the previous thread. Lawyers have not successfully defined plagiarism, both in commerce and for the unforgiving standards imposed on law-students - and it was the same 10 years ago. So it would be pure arrogance to think we can produce a fine-grained definition, and even a fairly successful result would be unintelligible to a non-lawyer. I've no objection to rejecting blatant plagiarism - by a curious coincidence I've recently objected to such blatant plagiarism while doing a GA review. But I think it's a big mistake to try to clamp down on less glaring cases. If you want me to outline a development programme I can do so in a separate thread.--Philcha (talk) 12:26, 11 June 2009 (UTC)
- I think, this being Wikipedia, we can define plagiarism to mean pretty much what we want it to mean. Most of our policies are fuzzy and open to interpretation anyway (take WP:CIVIL as the obvious example). As long as we come up with something that makes it clear that passing off someone else's work as your own is unacceptable, I don't see what the problem is. EyeSerenetalk 14:28, 11 June 2009 (UTC)
- I agree with User:EyeSerene. I don't think we need to worry about the definition of plagiarism applied to law students here any more than we necessarily need to worry about the styleguide requirements for students of psychology. We are Wikipedia, and we may define our own policies and guidelines. It would be pure arrogance for us to attempt to impose a universal standard. But it is no more arrogant for us to create our own definition than it is for the American Historical Association to adopt a standard of its own. This is "in house", as it were. I'm not sure why it would be a big mistake to clamp down on less glaring cases of plagiarism, but I'm also not sure what you mean by that. --Moonriddengirl (talk) 15:20, 11 June 2009 (UTC)
- OK, the current "guideline" does not define its objectives. What are they? --Philcha (talk) 15:53, 11 June 2009 (UTC)
- What do you think they are? :) (Note: it's not my intention to sound snarky. I am puzzled by the question. It seems to be pretty obvious to me from the title of the guideline and in the sentence, "This guideline addresses how to avoid plagiarism on Wikipedia and how to address it when it is encountered." There you go. Avoid plagiarism. Address it when it's encountered.) --Moonriddengirl (talk) 16:03, 11 June 2009 (UTC)
- "plagiarism" has multiple meanings, see my comments in prev thread. It might help if you gave some examples around the borderline and why you consider each to be a problem or not. --Philcha (talk) 17:25, 11 June 2009 (UTC)
- Plagiarism's multiple meanings don't signify if Wikipedia has adopted its own. There's already been a lot of protest that the guideline was bloated and much was removed, so I'm not sure that adding examples will meet with consensus. I believe it is generally considered best to keep it relatively tight. --Moonriddengirl (talk) 17:29, 11 June 2009 (UTC)
- Then please explain "When paraphrasing, they need to know how much they can and should retain without following too closely on source text." At present WP:PLAGIARISM gives no guidance whatever on "how much they can and should retain". This is the core of the alleged guideline, and it gives no guidance. That is why I asked for discussion of borderline cases. --Philcha (talk)
- Some examples are supplied at Wikipedia:Close paraphrasing. There was some discussion about incorporating material on that here, but, again, the general perception seems to be that the guideline part needs to be kept trim. --Moonriddengirl (talk) 23:13, 11 June 2009 (UTC)
- I agree that new users are forced to battle through far too many pages of policies and guidelines. This is actually a major complaint from interested users who would like to become editors but are put off by the bureaucracy. The simple solution for this page is to delete it, as it serves no purpose. Nor even does it have the consensus support that it pretends. That way we could actually get back to writing an encyclopedia. Physchim62 (talk) 20:43, 11 June 2009 (UTC)
- A refreshing blast. But brand new users will not likely wade through this before they start an article, and they do not need to. But when they copy in material and other editors object that attribution is inadequate (my preferred definition of plagiarism: inadequate attribution), it will go much better to be able to point them to this guideline to guide refinement of their work. Rather than have their work deleted outright by indignant editors who can and will argue from first principles about what the wikipedia should include or not. It gets repetitive and boring to argue out the basics and the occasional nuances time and time again, that is what a guideline is for. doncram (talk) 21:16, 11 June 2009 (UTC)
- See my objection above - the alleged guideline gives no guidance, and does not help to define a boundary between what is and what is not acceptable, nor any principles by which editors can identify the boundary for themselves. --Philcha (talk) 22:34, 11 June 2009 (UTC)
- That's a partially valid criticism. I say "partially" because indeed, as you've noted, there is a massive grey area. WP:Close paraphrasing gives some examples and I believe is already linked. Similarly, a few of the university links have some good examples of good and un-good paraphrasing (but they may be overwhelmed by other sources). And of course, we walk the fine line between fidelity to the source, plagiarism and the WP:OR stricture. As noted above, just like in the case of WP:CIV, this is always going to be a judgement call. Our aim should always be to encourage our editors to completely avoid the whole issue, by either completely rewriting text; quoting it directly; or properly attributing direct copies of PD-text. I started a little thing on that at User:Franamax/Test essay - maybe you can help flesh it out (but also keep it short short short!) ? I agree that we need to give maximum guidance, especially to newer editors - where I disagree is whether we can pack every aspect of plagiarism into this guideline. Of necessity, we will need explanatory essays as well, it's just too complex a topic. Philcha, do you have concrete suggestions for improvement? What should come out of the guideline, what should go in? Will the end result be anywhere close to what an average editor is actually going to read? Franamax (talk) 23:20, 11 June 2009 (UTC)
- Franamax, your comments raise some issues:
- If "this is always going to be a judgement call", who appoints and recalls the judges? --Philcha (talk) 00:24, 12 June 2009 (UTC)
- At the end of April 2009 Wikipedia:Plagiarism did not link to WP:Close paraphrasing, but now it does. Wikipedia:Plagiarism has always failed to give straightforward guidance on where to draw the line and now relies on an essay that was not disclosed at the RfC. That looks more like bait and switch rather than seeking consensus. --Philcha (talk) 00:24, 12 June 2009 (UTC)
- Yes, I could make some concrete suggestions but you would not like them. --Philcha (talk) 00:24, 12 June 2009 (UTC)
- "...who appoints and recalls the judges?" The same people who appoint (and sit as) judges for enforcing just about any other Wikipedia policy, rule, guideline, or principle: Wikipedia editors (admins or not). Very few of our policies can be enforced mechanically, without requiring a judgement call. The appropriate interpretation of WP:NPOV, WP:POINT, WP:NOR, and WP:RS (to take just a few examples) are regular subjects of lively debate and their application requires both judgement and the acknowledgement of gray areas — but no one argues that we should cease being guided by these principles. Even WP:3RR and WP:NLT – which are about as close as we get to bright-line electric-fence rules – nevertheless prompt periodic, heated discussions about their use.
- Like it or not, deciding what is or is not plagiarism does take some interpretation. It isn't even possible to write a guideline which would draw a meaningful sharp line between acceptable and unacceptable conduct. Despite that, plagiarism has never been an acceptable way to contribute to Wikipedia, and judgements about it – including punishments for engaging in it – are regularly handed down. Some cases are clear-cut, and some lead to extensive discussion. Forbidding plagiarism is something that is already enforced; the purpose of this guideline is to set down in writing some of the things that 'everyone' already 'knows'.
- Please, if you have concrete suggestions, share them. 'Delete this guideline' won't cut it, though. TenOfAllTrades(talk) 00:59, 12 June 2009 (UTC)
- This discussion is getting off-topic for this section. To discuss who is to judge, and how to run some new problem resolution process to centralize judging and enforcement which now happens elsewhere, please extend #Plagiarism problems discussion section above or start a new section. I also welcome other concrete suggestions, but perhaps best in adding to other discussion threads above or in new sections. doncram (talk) 01:12, 12 June 2009 (UTC)
- Fair point doncram. Nevertheless, I've just added a paragraph attempting to reflect a little of the above. [5] Probably in the wrong place and probably with the wrong words, but that's my regular modus operandi anyway. :) Hmm, am I using a plural Latin phrase there? Franamax (talk) 01:57, 12 June 2009 (UTC)
- No, you're not. "Modus" is singular, and "operandi" is the genitive of the gerund verbal noun. --JN466 18:09, 14 June 2009 (UTC)
- Oppose This is still not good enough to be a full guideline as it stands. [6] JN466 17:42, 14 June 2009 (UTC)
- Oppose. This is not good enough to be a full guideline as it stands. --PBS (talk) 13:32, 18 June 2009 (UTC) PBS (talk) 09:27, 1 July 2009 (UTC)
- Oppose. This guideline has too many problems currently. Kaldari (talk) 15:52, 19 June 2009 (UTC)
- Oppose until the lede makes it clear that copying with attribution and without quotes is acceptable. -Arch dude (talk) 23:19, 19 June 2009 (UTC)
Exemptions needed
Principles
Outside WP, some unattributed uses of phrases coined by someone else are not regarded as plagiarism. The most obvious are the many contributions of Shakespeare and various translations of the Bible. WP:PLAGIARISM totally fails to recognise these. I'll also present other cases where I think exemption is needed, with reasons. WP:PLAGIARISM should mention these exemptions prominently and high up the page, so that editors can find them as quickly as possible - this will save a lot of disputes. However the actual list of objections may be placed in a sub-page, since it may grow long as the "case law" develops - in that case the link to the sub-page must appear as a "main" item at the top of the section. --Philcha (talk) 14:57, 13 June 2009 (UTC)
- Exemptions are slightly addressed under Wikipedia:Plagiarism#What is not plagiarism, but I think you're right that common phrases is a major oversight (math is already addressed). I'll add something and the specific language can be ironed out as we go. (ETA: thinking of specific examples that might be used on Wikipedia is difficult.) --Moonriddengirl (talk) 12:23, 15 June 2009 (UTC)
- I've edited some in.
- There's nothing difficult about thinking of specific examples that might be used on Wikipedia - I do a fair bit of work on science articles, and am always on the look out for vivid phrases from sources to lighten up our articles - and thought carefully about where attribution was needed, long before WP:PLAGIARISM was born. If you look at the next sub-section you'll see some good examples from paleontology and evolutionary biology. --Philcha (talk) 16:07, 19 June 2009 (UTC)
- Moonriddengirl, I think I understand the reasons for your recent edit, but I think it's in the wrong place. What would you think if the early part of the guideline were structured like this?
- Lead - this guideline explains what plagiarism is and how it differs from copyright violation. It also points out very briefly cases where other WP policies and guidelines would require you to cite and possibly attribute, even if there is no concern about plagiarism. ...
- Definition and why plagiarism is bad for WP.
- Cases where plagiarism is not a concern (but other policies and guidelines might be)
- etc --Philcha (talk) 17:01, 19 June 2009 (UTC)
Proposed exemptions
- Phrases that have entered the language, including the dialects used by sub-cultures, including but not limited to academic publications.
- Criterion for exemption: unattributed use 2 years ago in a publication from an organisation that is likely to have watchful editors & lawyers, e.g. academic, large newspapers & mags, books, and web pages controlled by such organisations. For example "dead clade walking" was coined in 2002 and now appears widely in paleontology literature. It appears to have been used with attribution around 2004, but appears without attribution by 2007, e.g. in P. D. Taylor (ed), Extinctions in the History of Life. Similar vivid phrases widely used in this field are "Lazarus taxon" (no idea when coined; used w/o attribution in 1998 article Neoguadalupia oregonensis New Species: Reappearance of a Permian Sponge Genus in the Upper Triassic Wallowa Terrane, Oregon) and "Elvis taxon" (coined by Erwin and Droser in 1993; used in 2002 Devonian history of diversity of the rugosan Cyathaxonia fauna and 2005 Recovery of gastropods in the Early Triassic).
- Reasonable ground for objection to exemption: reliable evidence of retraction of / apology for the unattributed use cited as grounds for exemption, or legal judgement given against the unattributed use. Note that publication of a mere claim of plagiarism by a third party should not be the basis of an objection, as it might have proved groundless. The burden of proof would lie on the objector.
- Phrases that are the simplest and most obvious way to present information. The obvious analogy is with patents, where an application may be refused monopoly protection if its content is considered obvious to a competent practitioner.
- Criterion for exemption: any proposed rephrasing must not increase the reading difficulty of the passage by any of the measures used by Dispenser's Readability Analyser, nor introduce any ambiguity nor any awkwardness in the flow between the rephrased passage and any sentences preceding or following it. The burden of proof would lie on the objector.
- Definitions of terms, theories, hypotheses, etc.
- In such cases I think accuracy and avoidance of WP:OR take priority - the most obvious cases are mathematical and legal terms, where absolute precision is required. However the exemption should point out that attribution may be advisable for other reasons, for example new definitions, hypotheses etc. may not yet be consensus in the relevant fields, and unattributed use often implies that quoted or paraphrased content is the mainstream view (see WP:UNDUE, WP:NPOV).
Please add other proposed exemptions as you find instances while editing or researching. --Philcha (talk) 14:57, 13 June 2009 (UTC)
Legal arguments given in a newspaper: Assume the newspaper report says
"Smith's defence lawyer asserted that Smith himself had been nowhere near the building and that he had not induced Jones and Reed to go there."
If we put any part of this in quotation marks, the reader can't tell if the verbatim is from the lawyer or the newspaper. Anything but a very close paraphrase will require interpretation and compromise precision. Adding "according to the XY Times" impairs the flow. In my view it is not plagiarism if we write, with an in-line citation to the XY Times,
"According to Smith's lawyer, Smith had not been anywhere near the building and had not induced Jones and Reed to go there."
I would consider it pointless and possibly dangerous verbal gymnastics to try to change these words. JN466 18:55, 14 June 2009 (UTC)
- I disagree. We are not paraphrasing Smith's lawyer, but the XY Times. Impairing the flow is in the eye of the beholder. --Moonriddengirl (talk) 12:41, 15 June 2009 (UTC)
Suggest we lose this sentence
"Definitions of plagiarism differ. A very basic, plain-spoken definition is offered by Ann Lathrop and Kathleen Foss in their 2000 guide Student Cheating and Plagiarism in the Internet Era: A Wake-up Call: "If you didn't think of it and write it all on your own, and you didn't cite (or write down) the sources where you found the ideas or words, it's probably plagiarism."[2]" This definition is profoundly inappropriate for Wikipedia. We do not want editors to "think of it and write it all on their own". We need to think more deeply about what plagiarism means in Wikipedia, given that no claim to produce original work is implied in our environment, and produce appropriate guidance. Definitions of plagiarism used in academia will not help us much. JN466 18:03, 14 June 2009 (UTC)
- Well, I'd counter that it is actually a pretty good definition and useful advice. The problem is the indefinite article, "it". (At least I think it's called that) If "it" refers to the idea, yes, WP:NOR comes into play. If "it" refers to the actual words being used to describe the idea, the definition is correct imo. And for most students, other than those writing their PhD thesis, the idea has already been had by someone else and the student has read it somewhere. For Wikipedia editors, the idea has definitely been had by someone else, what we are asking is that they use their own words to describe the idea.
- My own working definition, or words of advice, is: "which do you use more when you're working on an article, the keyboard or the mouse?" Franamax (talk) 19:34, 14 June 2009 (UTC)
- Ah well, I use Ctrl-C and Ctrl-V, so that wouldn't work for me. ;) But I think most readers would equate "it" to "the article" (or its content); certainly in the quote's original context, "it" would have referred to the entire work delivered, rather than the mere verbal formulation of the ideas expressed in it. Otherwise "it" is the neuter third-person pronoun, singular. The indefinite article is "a" or "an". JN466 19:56, 14 June 2009 (UTC)
- That applies only to Franamax' method of working - I use the k/b as much as possible, incl when copying and pasting. --Philcha (talk) 19:58, 14 June 2009 (UTC)
- It might be a good definition and good advice for a university environment, but you have still to explain how Wikipedia could possibly benefit from a tiny bunch of users setting themselves up as "course professors" on the look out for "plagiarism", rather than using their skills for improving our encyclopedic content. Physchim62 (talk) 20:24, 14 June 2009 (UTC)
- Yeah guys, it's a metaphor, not a hard physical test. :) The essence is whether you are typing in words conjured up from your own reading of the sources, or whether you're just copying someone else's writing and putting your own name on it via the edit history.
- Physchim, sure, if you think that unattributed copying of other people's words is perfectly acceptable, make your case and gain community consensus. I'd suggest WP:VPR as a good place to start. If it's as clear-cut as you seem to imply, you should have no problem getting agreement. Franamax (talk) 21:11, 14 June 2009 (UTC)
- As I said above, I would heartily support an MfD of this entire page as a waste of our precious time and resources. Physchim62 (talk) 21:17, 14 June 2009 (UTC)
- I am not sure I would go so far as to say that – although much of what this page is trying to do is covered by our copyright policies. The use of public domain sources is a notable exception. JN466 21:44, 14 June 2009 (UTC)
- Even the use of public domain resources is largely covered by other guidelines: think of WP:MOS, WP:CITE, WP:BETTER… Physchim62 (talk) 22:30, 14 June 2009 (UTC)
- Hmm. I think a case could be made for including any parts of this guideline that really do help people write better articles, in a practical way, in something like WP:BETTER or WP:CITE. JN466 04:56, 15 June 2009 (UTC)
- In which case, why not start a wikibook or a wikiversity course? The guideline is worse than redundant at Wikipedia. Physchim62 (talk) 15:29, 15 June 2009 (UTC)
- I'm still trying to figure out from your last meltdown concerning this issue whether you really believe that plagiarism isn't a serious ethical problem or if you just get off on visions of shoving ideas down throats (including, of course, your own). DocKino (talk) 06:51, 16 June 2009 (UTC)
Another sentence
What about "paraphrasing sufficiently to avoid copyright violation does not prevent plagiarism" in the lead – does this make sense for what we are trying to do here? The project is trying to make "all the knowledge of humanity" freely available. In a sense, everything in here is plagiarised, nothing is original research. As long as copyrights are respected, what exactly is the problem?
In fact, the whole lead para is odd:
Plagiarism is the incorporation of someone else's work without providing adequate credit. Even if a source is cited, plagiarism also occurs when text is directly copied without proper attribution.[1] When the source is under copyright and not under a free license, the copy may represent a copyright violation, but paraphrasing sufficiently to avoid copyright violation does not prevent plagiarism.
The second sentence seems contradictory. What "proper attribution", in addition to citing the source, would make directly copied text "not plagiarism"? Is it just me? The sentence is opaque. JN466 21:44, 14 June 2009 (UTC)
- Given that avoiding plagiarism means properly acknowledging sources and Wikipedia is rather big on acknowledging sources, how can avoiding plagiarism not make sense for what we are trying to do here? Nothing says, "Don't use it." It says, "Use it properly."
- Proper attribution means quotation marks, I would imagine. I didn't write that passage, but that's how it was formerly characterized at WP:NFC. --Moonriddengirl (talk) 12:20, 15 June 2009 (UTC)
- I guessed it meant that too, but a guideline should not be written in such a way that people sit around it and guess what it might mean. As for the first point, are you saying that the statement "paraphrasing sufficiently to avoid copyright violation does not prevent plagiarism" applies only if no source is credited? Or does it also apply if a source is cited? The text does not make this clear. I thought it also applied if a source is cited, because the prior sentence starts "Even if a source is cited, ..." so I assumed that was a given. JN466 12:29, 15 June 2009 (UTC)
- No. Proper acknowledgment of sources includes indicating when the language is duplicated from the original. If language is to be revised, it should be adequately revised with not a few words switched around here and there. Better to quote than improperly paraphrase. --Moonriddengirl (talk) 12:44, 15 June 2009 (UTC)
- I'm sorry, I don't understand. Are you saying, "paraphrasing a cited source sufficiently to avoid copyright violation does not prevent plagiarism"? --JN466 13:53, 15 June 2009 (UTC)
- The short story: legally, yes. The long story: Wikipedia's copyright policies are written to a stricter standard than US copyright law, in that we do not recognize "fair use" in the same way that US law does and that we do not recognize "de minimis" exclusions. We require that all copied copyrightable material be denoted as set out at WP:NFC unless it is compatibly licensed. We do not say, "Oh, yes, you copied 95% of that sentence from that source, but a single sentence that is not key to either doesn't rise to substantial similarity, so no copyright violation has occurred." With copyrighted text, we put it in quotation marks, or we don't use it. This guideline has been poked at so much that I don't know when that entered into the language (maybe even before I arrived?), but legally, technically, it is correct. I am not saying that "'paraphrasing sufficiently to avoid copyright violation does not prevent plagiarism' applies only if no source is credited"; I am saying proper attribution is necessary, and that includes the use of quotation marks for duplicative runs of distinctive text or some other Wikipedia-accepted notation of duplication. --Moonriddengirl (talk) 14:10, 15 June 2009 (UTC)
- So you are saying, even if I have reformulated an – explicitly cited – source enough so that no one can accuse WP of copyright violation, I am still guilty of plagiarism if I do not mark in quotation marks any duplicative runs of words in my reformulated sentence. Is that correct? JN466 14:55, 15 June 2009 (UTC)
- Yes, by the legal definition of "copyright" in the United States (which governs Wikipedia) and by the prevailing definition of plagiarism as used in the sources cited here and elsewhere (at least the ones with which I'm familiar), that is correct. If you, for example, properly paraphrase three paragraphs out of an article in the New York Times, but include two sentences nearly verbatim, you are unlikely to have risen to the level of "copyright violation" (even though you are in violation of WP:C), but you will still be guilty of plagiarism for not having noted that your source supplied the language as well as the facts. :) --Moonriddengirl (talk) 15:10, 15 June 2009 (UTC)
This said, are we agreed that the above paragraph is unlikely to help editors understand what we want them to do when they write articles? If so, can we work on fixing the lead? JN466 15:03, 15 June 2009 (UTC)
Plagiarism is the incorporation of someone else's work without providing adequate credit. Even if you have cited a source, make sure that your wording does not duplicate that of the source. And even if you have reformulated a source enough to avoid copyright infringement, you should still enclose in quotation marks any duplicative runs of words that also occurred in the source, to avoid plagiarising the source, or better still, reformulate to such an extent that all your wording is original.
Is this what this paragraph is asking people to do? JN466 15:07, 15 June 2009 (UTC)
- I like that for the most part, but I have my own rhythm to drum, so to speak, so I myself would word that differently. Since Wikipedia's copyright policies don't allow for de minimis exceptions, I'm for removing anything that creates confusion. I myself would prefer:
Plagiarism is the incorporation of someone else's work without providing adequate credit. Even if you have cited a source, make sure that your wording does not duplicate that of the source unless you note duplication by quotation marks or other acceptable method (such as block quotations), even if your source is not copyrighted.
- One problem that needs to be addressed (and that creates the vaguery in my note) in my language or yours is that currently Wikipedia's accepted method includes simply pasting in chunks of text with an attribution template or other note that indicates that the text has been pasted in: "In addition to the edit summary note, be sure to attribute the material either by using blockquotes or quotation marks, by using an attribution template, using an inline citation and/or adding your own note in the reference section of the article to indicate that language has been used verbatim." How to succinctly indicate that in the lead, I don't know. --Moonriddengirl (talk) 15:17, 15 June 2009 (UTC)
Okay. That is, just from a language point of view, a big improvement on what we have, and we can work on the part about attribution templates etc. Another thing though: If WP:C is already stricter than copyright law, and requires us to use quotation marks for distinctive text used verbatim and so forth, do our instructions here actually add anything that is not already covered in WP:C and Wikipedia:FAQ/Copyright#Can I add_something to Wikipedia that I got from somewhere else?? If so, what is the unique requirement being added here? JN466 16:49, 15 June 2009 (UTC)
- WP:C (and the copyright FAQ) doesn't apply to public domain text or text licensed compatibly with Wikipedia. WP:Plagiarism does. :) --Moonriddengirl (talk) 16:53, 15 June 2009 (UTC)
- If that means that, apart from PD sources and sources with WP-compatible licences, everything covered here is already covered by WP:C and its FAQ, would it not be best to restrict the topic of this guideline to public domain sources and sources with WP-compatible licences? JN466 17:12, 15 June 2009 (UTC)
- Everything in the guideline is not covered by WP:C, though some of it is. For example, as long as you paraphrase it properly, you can use those three paragraphs from the NYT without violating copyright even if you deliberately do not cite your source. The law doesn't care if you give credit for your information. But it's still plagiarism. Plagiarism & copyright are closely intertwined, but separate at some points. It's possible to have one without the other; it's also possible to have both at once. There is always going to be some overlap between policies & guidelines where the two intersect, just as WP:V and WP:OR overlap. Theoretically, we could bundle WP:V, WP:OR, WP:C and WP:Plagiarism all into one mega policy called something like "Use your sources properly." We could probably even fit WP:BLP and WP:NPOV in there if we were adroit enough. :) --Moonriddengirl (talk) 17:18, 15 June 2009 (UTC)
- Well, I'd still say that we should minimise the overlap, as it is apt to cause confusion. And we have to think practically. We have to know what specific user behaviours we want to encourage, and what text is likely to get them to see the point in such a way that they will adjust their behaviour.
- So the two main points seem to be: (1) Always acknowledge your sources, whether free or non-free (2) Don't duplicate any (free or non-free) source's wording without acknowledging and making it explicit to the reader that you have duplicated their wording.
- If this is correct, then I'd suggest that we jettison all the theory that does not directly support those points, or is more likely to leave editors tying knots in their brains thinking how plagiarism is both the same and different from copyright. And that we do not repeat anything which editors are already required to observe as per WP:C policy. We can just point to that policy where appropriate. Perhaps a little paragraph about what this guideline adds that is not already addressed in WP:C policy would help too. JN466 17:35, 15 June 2009 (UTC)
- I'd agree those are the main points. We have worked hard on avoiding the bits where plagiarism overlaps with copyvio, by trimming to just mention "may also be". It's been difficult, because often when people discuss plagiarism, they use words more properly associated with copyright law. Unfortunately, some minimal amount of theory is needed here simply because of the overlap between the two different concepts.
- Jayen, is there anything in particular you think should be taken out, beyond your last clarified wording? For instance, I would think that "Definitions of plagiarism" is an important section to explain exactly what "don't duplicate" actually means (or more precisely, to explain a little about its non-exactness). Franamax (talk) 08:04, 16 June 2009 (UTC)
- Followup, I read the entire thing once more and made a few tweakies. [7] Some of these were to pull out minor overlaps with copyvio and of course review on any changes is welcome. The last two sections are still overlappy, "Repairing plagiarism" and "Copyright violations". Since MRG has so well expressed the nuances above (and plainly has almost nothing to do around en:wiki ;), I'll give her the first crack at restructuring those, or yourself Jayen. Both those sections read in a tone which seems to unnecessarily conflate the copyvio and plagio policy/guidelines. It is indeed difficult to remember anymore who added what portions when. Franamax (talk) 09:08, 16 June 2009 (UTC)
- I've trimmed the copyright violations section a tiny bit. I think we need to keep something there, because I have many times seen people accusing others of plagiarism when what they're actually doing is copyright violation. It may need to be truncated further. Or perhaps what we need to do is shorten with a clearly declarative "If you see this, then you're dealing with copyright violation, which trumps plagiarism. Go to policy X and guideline Y." Hmm. Meanwhile, I'll go look at repairing plagiarism. --Moonriddengirl (talk) 11:31, 16 June 2009 (UTC)
←Actually, as I was looking at it, I decided abbreviating it with a declarative was better. I also moved it up, because it really is "step 1": if you find a problem, take it there if it's copyvio. What do you think? --Moonriddengirl (talk) 11:38, 16 June 2009 (UTC)
- I'll go on record as loving the change to the section, it's distilled to the essence. As far as the placement, nehh, not so much. Possibly as the first sub-section in How to respond? Wouldn't that be the best place for Step 1? Franamax (talk) 12:31, 16 June 2009 (UTC)
- That's fine by me. :) --Moonriddengirl (talk) 12:38, 16 June 2009 (UTC)
- Sorry for the lack of response, Franamax; hope to come back to this tonight. JN466 15:31, 17 June 2009 (UTC)
Plagiarism after the fact ?
Is it plagiarism if:
- We use a source X to add material A (properly paraphrased and cited) to a wikipedia article, and then
- Days/months/years later we remove the source X (perhaps because we consider it to be non-RS), without removing the material A ?
This is not exactly a hypothetical question: I wondered about it in responding to a recent query at RSN (diff). If this is indeed a form of plagiarism (as I suspect), should we mention somewhere that "editors removing citations from an article should make a good faith effort to remove material directly sourced from it (if any) or provide alternate references" ? Abecedare (talk) 03:21, 15 June 2009 (UTC)
- Easy to determine from the edit history. If that scenario is what it's claimed to be then the problem isn't plagiarism; it's blanking vandalism. DurovaCharge! 03:25, 15 June 2009 (UTC)
- I am not sure I understand what you mean about "blanking vandalism". Neither of the two actions I mention above are vandalism, but their end result is that we have incorporated material from a source without giving it credit. My question is if that is plagiarism. (Let me know if my question or scenario itself is unclear.) Abecedare (talk) 03:36, 15 June 2009 (UTC)
- To clarify: In the case above the material A is perfectly acceptable for the article. Abecedare (talk) 03:38, 15 June 2009 (UTC)
- From the context, i think by "blanking vandalism" Durova means deletion of material (in this case the citation) destructively, amounting to vandalism. And by the definition of plagiarism as having less attribution than is reasonably expected, then yes, the deletion of the attribution has brought the article into the status of being plagiarized. doncram (talk) 04:20, 15 June 2009 (UTC)
- I agree it would amount to the same thing as not attributing the material in the first place. JN466 04:52, 15 June 2009 (UTC)
- Actually Jayen466's and my view on this is perhaps not universally held. In several discussions, others have argued that attribution can be established by edit summary upon original paste-in of copied material, or perhaps by other approaches which could leave some record in the edit history of an article. However, to detect such hidden, previous attribution would not be reasonable to expect of the general wikipedia reader. Just as we do not want to be required to learn the intricacies of some other GFDL website to determine the sourcing of its material before we quote from it, others should not have to scour the past edit history. I think it is only reasonable to figure that adequate attribution has to be visible in the current article to be counted. In this case, there could be past attribution in the history, but i personally think that is not relevant to saying whether the current article is in a state of plagiarization. Note, there could be multiple, conflicting attributions in the previous history. Only the current article shows what the wikipedia editor collective is putting forward as reflecting the editors' consensus decision on what is fair attribution. doncram (talk) 01:05, 16 June 2009 (UTC)
- Let's see whether there's any misunderstanding.
- Editor 1 adds cited material, properly paraphrased and referenced.
- Editor 2 removes reference but leaves material.
- Unexplained blanking of a proper citation is blanking vandalism. WP:VANDAL is down the hall and to the right. DurovaCharge! 03:52, 16 June 2009 (UTC)
- But the catch is that while the material Editor 1 added is ok, the citation is non-RS. So editor #2 is just removing a non-reliable citation in good faith, and that certainly is not vandalism.
- Let me lay down the particular circumstances that raised the question in my mind:
- http://www.kirjasto.sci.fi/kalendar.htm is a website where Petri Liukkonen, a librarian from Finland, has posted biographical essays on 100s of authors since 1999. Before wikipedia, it was perhaps the single most convenient online source for such information, and even now some of its articles are better written than their wikipedia counterparts.
- The website is used in over 300 articles on wikipedia as a reference and/or external link. In fact, several wikipedia articles were started from material paraphrased from the website, at least as far back as 2002 (example)
- Recently a question was raised at RSN about the reliability of the website and I (as a responder) opined that it does not qualify as a "reliable, published sources with a reputation for fact-checking and accuracy", mainly because Petri is not a published expert in the area, and there is seemingly no editorial oversight.
- Assuming that my judgment about the website is correct, can an editor go around and blindly remove all the citation to this source ? Or, does he need to ensure that any material still directly traceable to the website is also removed or alternately sourced in order to avoid plagiarism ?
- Note that I am not asking for opinion about www.kirjasto.sci.fi per se, just the policy issue. The same scenario could have been presented with regards to the "Trivia" section on www.imdb.com, which again is widely used as a source on wikipedia and has been repeatedly pronounced to be non-RS at RSN (some other parts of imdb are considered reliable though, but that's not the issue here).
- Hope that makes the query clearer. Abecedare (talk) 04:43, 16 June 2009 (UTC)
- I think your hypothetical not-so-reliable source is the same as any other source:
- If the WP article had a direct quote or near-quote from it without attribution, that would be plagiarism.
- After the source was considered not reliable, any material based only on it could be removed per WP:V.
- Or the inline citation could be replaced with a "needs citation" tag.
- If the citation and the attribution "X says, ..." are removed but the direct quote or near-quote remains, that's plagiarism - unless an exemption like those discussed above applies. --Philcha (talk) 06:36, 16 June 2009 (UTC)
- I'd agree with the first three points there, but not necessarily with the last. I'd call the last scenario incredibly sloppy and perhaps disruptive editing. Removing a cite without addressing the content on which it relies is just plain bad practice, regardless of whether the source has been deemed unreliable. We're about the product, not the noticeboards, right?
- A vaguely similar scenario was raised by Carcharoth just recently but didn't get much discussion. I wouldn't term either case "plagiarism", since there is no original intent, there is no original accident, and there is no verifiable conspiracy to mislead. There is just "oh shit, oopsie!" - the result of editing on-the-fly, not paying close attention.
- Philcha, on your proposed exemptions, it sounds to me as though Abecedare's scenario involves some copying of "distinct and original wording", so I'd be interested in your examples of how the exemptions might apply. Franamax (talk) 09:32, 16 June 2009 (UTC)
- Franamax, I disagree with "Removing a cite without addressing the content on which it relies is just plain bad practice". Your statement is too sweeping, and I can think of cases where I'd use a "needs citation" tag in the short term, so the editor(s) have time to look for alternatives - a bot will date the tag, and if it's not remedied in a few months' time then the statement(s) can be removed. --Philcha (talk) 10:00, 16 June 2009 (UTC)
- That's precisely why I agreed with your third point. Adding {{cn}} does qualify as "addressing the content". Well, it does in a way - when I find myself adding the tag without having spent at least an hour trying to resolve the whole thing, I usually feel like I need a wiki-decontamination shower. My comment was on the notion of removing only the citation and doing nothing else. I'd call that pretty sloppy editing. Franamax (talk) 11:43, 16 June 2009 (UTC)
- Re the exemptions, it would depend on the situation, especially the words at issue. If it's a vivid phrase, e.g. (hypothetically) "arthropods are living Lego systems", the exemption for a phrase that's become common might apply. If it's just plain English with no fancy phrases, the "simple, obvious phrasing" exemption might apply, and in that case I'd remove any quotes round it. But in the case Abecedare presented, that's all moot as the content fails WP:V+WP:RS. --Philcha (talk) 10:08, 16 June 2009 (UTC)
- So what would a concrete example of a vivid phrase in common usage be that would be acceptable? Commercial phrases generally will not do: "aspirin" may be generic, but "Aspirin saved my love life!" is probably TM of Bayer Corp. Vivid phrases within writing beyond Shakespeare and Wilde, I'd like to see the justification. Have any of Churchill's many distinctive phrasings been placed into the domain where they need not be attributed? As far as science writing goes, what would be the justification for an editor to copy the "arthropods are living Lego systems" without explicit quote marks when the editor could equally easily state that "arthropods are living and historical examples of the power of the homeobox and gene synteny" and then go on to explicate genetics with several hundred citations? Or better yet, quote the distinctive phrase, then explain it in detail.
- And I can't myself dismiss Abecedare's case, what with how incredibly much of the content of our millions of articles fails V+RS. We do have to deal with reality, and that's what it is. Franamax (talk) 12:12, 16 June 2009 (UTC)
- Addendum: Plagiarism automatically satisfies the verifiability policy, n'est-ce pas? Franamax (talk) 12:21, 16 June 2009 (UTC)
- Verifiability requires a reliable source. There is already a discussion at the Reliable Sources noticeboard. Although there were few participants it seems clear that this source does not meet the criteria for a reliable source since it is self published. Abecedare brought up the point about plagiarism if the reference is removed which to me overrides the problem with reliable sources. But this could set a precedent to allow other non-reliable sources to be kept if it is too difficult to remove the reference without running into copyright problems. -Crunchy Numbers (talk) 21:22, 18 June 2009 (UTC)
- Franamax, your request for an example has already been met (by me) in another section of this Talk page, with refs. Look for "Dead Clade Walking", "Lazarus taxon" and "Elvis taxon", which are common in paleontology (sorry, these jumped at me because I've worked a lot on paleontology articles). In less academic areas, "gold standard" meaning "best of its genre" must have been copied by someone, on the analogy of monetary systems, and is now common usage; so are "stymied" (from golf) and "stalemate" (from chess, but incorrectly adapted); in computing, "second system effect" was AFAIK coined by Fred Brooks in The Mythical Man Month and is now common; ... I'd better stop there. --Philcha (talk) 23:47, 19 June 2009 (UTC)
Restructuring "How to Respond to Plagiarism"
Per conversation several threads above, I have rewritten this section. Since my changes are rather sweeping, I'm placing them here for transparency. I'm turning the headers into bold, because I don't want to make the TOC on this page completely wonky.
This is what it used to say
How to respond to plagiarism
Failure to properly attribute text may be intentional, but it is often inadvertent. Avoiding plagiarism requires mastery of citation and paraphrasing. Contributors need to know when and how to cite sources. When paraphrasing, they need to know how much they can and should retain without following too closely on source text. They also need to remember when and where they saw something first, both in active research, while note taking, and during composition, to avoid unconscious plagiarism.[2]
Contact the editor involved
An accusation of plagiarism is a serious charge. Please use care to frame concerns in an appropriate way. Even in blatant, conspicuous cases, it is important to remain civil. Given that attribution errors may be inadvertent, intentional plagiarism should not be presumed in the absence of strong evidence. Remember to start with the assumption of good faith. While it is essential that plagiarism problems be resolved, we also aim – wherever possible – to educate our editors, so that they may become better contributors to the encyclopedia. Many editors are unaware that they have violated any guidelines when copying or closely paraphrasing published material. The best approach is to contact the editor and make sure they understand the requirements for attribution at English Wikipedia. Invite the editor to identify and repair any and all instances of plagiarism. Remember that they may not be familiar with the concept of plagiarism, and may be defensive when their prior edits are being challenged. In the case of outright copyright violations, the editor should also be made aware of the applicable policies and law which forbid these. Simple plagiarism can be approached more gently.
Seek administrator assistance
If you find that an editor persists in plagiarising others' work after being notified of this guideline, report him or her at the administrators' noticeboard so that an administrator can respond to the issue. Be sure to include diffs which show both the plagiarism and warnings which were given and ignored.
Repairing plagiarism
Sometimes material from a copyrighted work is copied into Wikipedia with minimal rewriting. This may still be a violation of copyright as a derivative work, and the same concerns about plagiarism would apply if the phrases, concepts and ideas in the copied material are not attributed to the original author. If the text follows closely enough on the original in structure, presentation, and phrasing to raise copyright concerns, handle it as a copyright violation. If it does not, address it as plagiarism. Plagiarism doesn't have to be immediately removed, unlike copyright violations. It does need to be properly attributed to its source. If you find an example of plagiarism, where an editor has copied text, media or figures into Wikipedia without proper attribution, contact the editor responsible, point them to this guideline page and ask them to provide the proper attribution. It may also be helpful to politely refer them to Wikipedia:Verifiability, Wikipedia:Citing sources, and/or Help:Citations quick reference. Editors who have difficulties or questions about this guidance can be referred to the Help Desk or media copyright questions. You can also change the copied material or provide the attribution or source on your own. Material that is plagiarized but which does not violate copyright does not need to be removed from Wikipedia if it can be properly sourced. Add appropriate source information to the article or file page wherever possible. With text, you might move unsourced material to an article's talk page until sources can be found.
This is what it says now
How to respond to plagiarism
If you find an example of plagiarism, where an editor has copied text, media or figures into Wikipedia without proper attribution, contact the editor responsible, point them politely to this guideline page and ask them to provide the proper attribution. Please use care to frame concerns in an appropriate way, as an accusation of plagiarism is a serious charge. Even in blatant, conspicuous cases, it is important to remain civil. Given that attribution errors may be inadvertent, intentional plagiarism should not be presumed in the absence of strong evidence.[6] Remember that contributors may not be familiar with the concept of plagiarism or that their definition may differ from that adopted by Wikipedia. Remember to start with the assumption of good faith. It may also be helpful to politely refer them to Wikipedia:Verifiability, Wikipedia:Citing sources, and/or Help:Citations quick reference. Editors who have difficulties or questions about this guidance can be referred to the Help Desk or media copyright questions. In addition to requesting repair of the first instance, you may wish to invite the editor to identify and repair any other instances of plagiarism they may have placed prior to becoming familiar with our guideline. If you find that an editor persists in plagiarising others' work after being notified of this guideline, report him or her at the administrators' noticeboard so that an administrator can respond to the issue. Be sure to include diffs which show both the plagiarism and warnings which were given and ignored.
Repairing plagiarism
It may not always be feasible to contact the contributor. For example, an IP editor who placed text three years ago and has not edited since is unlikely to be available to respond to your concerns. Whether you are able to contact the contributor or not, you can also change the copied material or provide the attribution or source on your own. Material that is plagiarized but which does not violate copyright does not need to be removed from Wikipedia if it can be repaired. Add appropriate source information to the article or file page wherever possible. With text, you might move unsourced material to an article's talk page until sources can be found.
I've moved a few sentences into a footnote, which I'm not expanding here, but it's currently visible on the guideline page.
If there are major problems, feel free to revert while we discuss. If there are minor problems, please, just go ahead and fix them. If there are iffy things, we can discuss. :) --Moonriddengirl (talk) 11:58, 16 June 2009 (UTC)
- My first comments are:
- The preamble portion of "How to respond to plagiarism" seems to be missing. This seemed to me as an important bit, as in "don't freak out and start making accusations". Was it your intent to omit it?
- And the piece on "Seek administrator assistance" seems to be gone. I'm quite agnostic on this - throughout the history of this page, I've never been quite clear on whether continued plagiarism is a matter for admin attention, or whether it's "hell yes, everybody does it". Each archive page of this talk will likely show those viewpoints. Are there eventual sanctions?
- And I may have found some minor wordings I could work on, but likely best to consider those points first imoo. (IMOO - hey I just made a new acronym "In my opinion only") Franamax (talk) 12:51, 16 June 2009 (UTC)
- Preamble: yes, it was redundant to material in contacting the contributor. I have incorporated it there, some in footnote. Seeking administrator assistance has also been incorporated there, since it also concerns addressing the contributor. (Maybe the subsection header should be changed. (ETA: as I have just done, for now.) And I like the bovine associations with your new acronym.) --Moonriddengirl (talk) 13:01, 16 June 2009 (UTC)
- Yes, admins can and will block editors who insist on plagiarising material; I've done it, and I know several others who have as well. A refusal to accept guidance on how to avoid plagiarism is damaging and disruptive to the project, and eminently blockable. I don't think that this point has ever been in dispute at AN/I. TenOfAllTrades(talk) 13:13, 16 June 2009 (UTC)
- If any admin blocks an editor for the sole reason of "plagiarism", I will ask ArbCom to take the case immediately. Physchim62 (talk) 13:28, 16 June 2009 (UTC)
- You're unlikely to find anybody being blocked for the sole reason of plagiarism, I should think, since even as written now the guideline says "persists in plagiarising others' work after being notified of this guideline", which is by definition rejecting community input, which is disruptive editing, which is blockable ("persistently violating other policies or guidelines."). But it'll be interesting to see what happens. --Moonriddengirl (talk) 13:49, 16 June 2009 (UTC)
- (ec) The reason is usually more along the lines of "recurring plagiarism; refuses to stop despite advice and warnings". Standard practice – described correctly in this guideline – is to offer guidance to the editor involved first. If advice and warnings fail to effect a change in behaviour, blocks are a legitimate next step. Even then, a blocked editor who expresses a plausible desire to reform his conduct will usually be granted an unblock.
- I doubt that ArbCom would even accept a case under those circumstances, but YMMV. While this document only has a {guideline} tag on it (per your comment on my talk page), that should not be misunderstood to mean that avoiding plagiarism is optional on Wikipedia, nor to mean that that is a new or novel thing for us. Repeated plagiarism has always been blockable; this guideline is intended as a badly-needed reference for both editors and admins, to try to keep everyone on the same page.
- More important, I don't believe anyone is suggesting that we would block an editor just for not knowing how to (for example) use <ref> tags. (Let's be honest — I don't know how to use <ref> tags.) No editor who makes a good-faith effort to cite his sources should have anything to worry about. We have scores of wikignomes who just love to turn inline links and Harvard references into reams of beautifully-formatted footnotes. The only editors who have ever been blocked for plagiarism are the ones who simply refuse to acknowledge the sources from which they lifted material into Wikipedia. TenOfAllTrades(talk) 14:09, 16 June 2009 (UTC)
- And how many editors do you expect to find? Why the new guideline, indeed, when you accept that your purely hypothetical case could be dealt with under disruptive editing? None of these supposed blocks would fit within WP:BLOCK, as WP:GUIDELINE points out that "guidelines" are only advisory. Physchim62 (talk) 14:26, 16 June 2009 (UTC)
- That would be a problem to take up at WP:BLOCK, since that's what I'm quoting above, and it says, again, "persistently violating other policies or guidelines" is blockable. If you don't think violating guidelines is blockable as they are only advisory, that should probably be dealt with there. --Moonriddengirl (talk) 14:36, 16 June 2009 (UTC)
- For the admin that blocks, it would be a case to take up at ArbCom, I can assure you. I would prefer discussion to continue, but with the lack of justification and disputed support for this "guideline", and the ever-increasing megalomania shown by its proponents, I am minded to take the whole thing to WP:MFD. Can you give me one reason why I shouldn't? Physchim62 (talk) 14:55, 16 June 2009 (UTC)
- Because the majority of responders at the properly publicized RfC stated a preference that it be elevated to guideline? But you're free to do whatever you like. I'm not sure whose megalomania is disturbing to you, but I tend to suspect that such statements aren't helpful. So far, consensus supports a guideline, even if you do not. As far as I know, everyone working to craft one is doing so in a good faith effort to help out, not out of some misguided power struggle. --Moonriddengirl (talk) 15:10, 16 June 2009 (UTC)
I'm sorry, much as I respect both of my interlocutors here, this is going WAY too far. I have nominated the page for deletion in an attempt to put an end to this delirium. Physchim62 (talk) 14:42, 17 June 2009 (UTC)
- I respect your right to object, though I do believe that this is the "will of the community" and all. If I understand your position, I believe that you are fundamentally opposed to any effort to address or define plagiarism in respect to the project, but should consensus weigh against you, I still think that your critical input could help keep the guideline clear and fair, even if it is not a guideline that you personally support. --Moonriddengirl (talk) 15:13, 17 June 2009 (UTC)
Specific language
Though things are quiet on this front ATM, I have moved the AGF matter further up in the revised text to be sure that it isn't missed. Those coming late to this may note that there is also now a section on copyright in the guideline at that point; it was moved from elsewhere as "step 1." Language issues? --Moonriddengirl (talk) 18:10, 16 June 2009 (UTC)
- ^ The American Historical Association terms this "recent or distinctive findings and interpretations, those not yet a part of the common understanding of the profession." American Historical Association (2004-12-09). "Statement on Standards of Professional Conduct". Retrieved 2009-04-29.
- ^ See Perfect, Timothy J.; Stark, Louisa J. (2008). "Tales from the Crypt...omnesia". In John Dunlosky, Robert A. Bjork (ed.). Handbook of Metamemory and Memory. CRC Press. pp. 285–314. ISBN 0805862145.