Wikipedia talk:Wikipedia Signpost/2009-04-13/Dispatches

Discussion elsewhere

I just saw the trainwreck that is this page. Ugh. --jbmurray (talkcontribs) 11:29, 9 March 2009 (UTC)[reply]

There was a big problem on a FAC a few months ago (if you need the example, it would take me some time to find it because I can't remember the article name, also an attempt at serious discussion of the problem at WIAFA never got off the ground, strangely, considering other concerns at WIAFA), but I do hope the article highlights that the issue is big at DYK because of the "reward" potential of quickly expanding an article to earn a DYK. I believe there were some very long threads a few months back at DYK talk. SandyGeorgia (Talk) 13:50, 9 March 2009 (UTC)[reply]
Sandy, were you referring to this? Dabomb87 (talk) 22:53, 9 March 2009 (UTC)[reply]
In the discussion linked in SG's post, I had this in mind. (Look at the final Oppose.) But I have noticed it in other FAs, and it seems to be a particular problem at DYK. Kablammo (talk) 02:16, 16 March 2009 (UTC)[reply]

Another trainwreck of a discussion that shows how few people understand what plagiarism is. --jbmurray (talkcontribs) 06:25, 11 March 2009 (UTC)[reply]

Can this be ready to go by Sunday (March 15), or should it be put off for another week? SandyGeorgia (Talk) 23:33, 12 March 2009 (UTC)[reply]
It would be tough for me alone to do it by Sunday. I'm off tomorrow on a rather mad two-day trip to the UK. --jbmurray (talkcontribs) 23:51, 12 March 2009 (UTC)[reply]
On my way home. This week will also be busy, I'm afraid, as I catch up on what's not been done while I've been away. --jbmurray (talkcontribs) 01:56, 17 March 2009 (UTC)[reply]
It can be put off for another week; better to do it right! Have a good trip home! SandyGeorgia (Talk) 01:57, 17 March 2009 (UTC)[reply]
Sorry! I didn't even see this discussion. I'm on break right now, so I can work on this pretty hard. Awadewit (talk) 01:43, 19 March 2009 (UTC)[reply]

Concerns

There have been a number of discussions on the use of public domain material on Wikipedia. There may be hundreds of articles on ships which are, or originated as, verbatim copies of entries from the Dictionary of American Naval Fighting Ships (DANFS). (USS Franklin (CV-13), mentioned on the Main Page in On this day... for 19 March 2009, is one example.) The use of verbatim PD text for such articles, without quotes or inline cites, but with a note at the bottom of the text or in the reference section, has not been considered to be plagiarism. An attempt last fall to exclude new PD-copied articles from eligibility for DYK went nowhere, in part because of the reliance on DANFS by creators of new articles who want DYK credit. Kablammo (talk) 02:14, 19 March 2009 (UTC)[reply]

I consider this a problem, as PD does need to be enclosed in quotes. Awadewit (talk) 02:18, 19 March 2009 (UTC)[reply]
There will be a lot of resistance to that, as the practice is so ingrained. It is used even by established, competent editors who have created other fine articles without the incorporation of PD text. Kablammo (talk) 02:47, 19 March 2009 (UTC)[reply]
Here are two discussions:
It's beginning to come together, but the article doesn't yet mention the significant problem of plagiarism at DYK, partly due to the "reward" factor of DYKs. Can someone add a paragraph, linking the two discussions above? SandyGeorgia (Talk) 15:14, 3 April 2009 (UTC)[reply]
Discussion of "reward factor" as a red flag to initiate a check for plagiarism is a good idea, but I'm not sure it's going to be necessary or productive to "call out" a particular process or editor (quickly determined from the article link). Although certainly not your intent, it could be misinterpreted as chastising. Эlcobbola talk 15:55, 3 April 2009 (UTC)[reply]
Is this close? I'm attempting not to single anything/anyone out and to maintain the assumption of good faith. Эlcobbola talk 16:06, 3 April 2009 (UTC)[reply]
I'm not too keen on calling out DYK. I think that one reason plagiarism is caught at DYK is because it is one of the few places on Wikipedia where source material is actually checked against the article (we don't even do that at FAC). I'm also not keen on complaining about the "reward factor" - that is a much larger problem which contributes to many kinds of poor writing and sourcing. It is not specific to plagiarism. Awadewit (talk) 02:24, 5 April 2009 (UTC)[reply]
I have come across several clear instances of plagiarism at DYK; I have seen it at FAC. DYK is a high-volume, sometimes last-minute process, where only the referenced hook is checked. There is no checking of source material beyond that, unless a reviewer takes on that task. We don't routinely fact-check FAC, and with reviewers spread so thin it's not likely to occur. And the "reward factor" definitely contributes to the problem. Kablammo (talk) 02:47, 5 April 2009 (UTC) I agree however that it would be best not to specifically "call out" specific examples, projects, or areas-- the issue goes beyond DYK, as my comment indicates. But there is competition for DYK noms and points in a contest; there is also the perception that an FA is helpful at RfA. We also have a long-term tolerance and acceptance of reusing PD material without quotes, including US gov't sources, 1911 EB, Catholic Encyclopedia. Our approach should not be to criticize what has been done-- even encouraged-- in the past, but to establish that Wikipedia should move beyond that. Many of the editors who use PD text without quotes, and only general attribution, are perfectly capable of creating excellent content, and often do so. Kablammo (talk) 17:24, 5 April 2009 (UTC)[reply]

(out indent) Do you think we've struck the right tone in the dispatch or is further refinement required? Awadewit (talk) 19:16, 6 April 2009 (UTC)[reply]

I think so, but I have not studied all of it, and may be too close to parts of it. I hope Tony1 will make a complete pass to see how it coheres. Kablammo (talk) 19:22, 6 April 2009 (UTC)[reply]

Suggestions

One way to minimize the tendency to reuse text is not to copy and paste it onto one's screen as the basis for a working draft. (There have been new articles which obviously were cut and pasted from recent obituaries, and then reworked, often very lightly.) Printing out internet sources, assembling and organizing them, and then writing a draft reduces the temptation to adopt verbatim language from the source (and makes it harder to do so). Kablammo (talk) 02:14, 19 March 2009 (UTC)[reply]

I agree - could you add this into the document? Awadewit (talk) 02:18, 19 March 2009 (UTC)[reply]
Yes, but without pride of authorship-- rework it as you see fit. Kablammo (talk) 02:21, 19 March 2009 (UTC)[reply]
Awadewit, looking at your changes (which I surely invited!), I think we are talking about two different methods. I was suggesting the old method (and I may be revealing something about my age here) of assembling printed materials, segregating them by subject matter, then segregating within each pile, and creating text with appropriate transitions, selective quoting, etc. You are, I believe, speaking of an electronic analogue to that process. I'm concerned the revisions may conflate the two. What we want to avoid is a process where, for example, an on-line biography, obit, or account of an event is copied onto a screen, the editor goes through it and changes punctuation, some words, and perhaps reorders some content, decorates it with a few other tidbits or sources, and submits it. To me it is easier to avoid that if one does not work from an electronic text to start with, but I recognize that is not the only way. Kablammo (talk) 17:32, 5 April 2009 (UTC)[reply]
The process you describe is one I do electronically and it is an excellent way to avoid plagiarism. Could you reword the section of the dispatch so that our recommendation about organizational methods (either in print or online) is made clearer? Awadewit (talk) 17:53, 5 April 2009 (UTC)[reply]
I'll give it a try, but you may have to do the electronic part, while I'll focus on the method used by a dwindling pool of Luddites. Kablammo (talk) 17:55, 5 April 2009 (UTC)[reply]
A-- I took a crack at it and handled it in one place. Refine as appropriate. Kablammo (talk) 18:11, 5 April 2009 (UTC)[reply]
I like what you've done - I've just tweaked it a bit. Awadewit (talk) 19:14, 6 April 2009 (UTC)[reply]

Foreign languages

Would it be helpful to mention that even translation is not sufficient to avoid plagiarism, as it does not resolve the issue of "use[ing] ... ideas of another author and representat[ing] ... them as one's own original work"? Further, and in the same vein, it might be worth mentioning that we can essentially "plagiarize ourselves" by translating articles across projects. If I recall correctly, this was an issue with a 1964 Gabon coup d'état FAC. Эlcobbola talk 00:15, 20 March 2009 (UTC)[reply]

Good idea. (As somebody who's just had to speak to two students who handed in essays with more or less significant instances of plagiarism, I feel a bit up to my ears with this. And the real problem for someone whose students often have to turn in essays in another language, is that "translation plagiarism" is impossible to detect by googling.)
I'm less sure, however, about plagiarism across projects or within Wikipedia, mostly because the notion of authorship is quite different on a Wiki. The problem with 1964 Gabon coup d'état was not plagiarism, but rather that the translators of that article had not sufficiently checked the sources upon which the original relied. --jbmurray (talkcontribs) 02:06, 20 March 2009 (UTC)[reply]
I gather that attribution and some link to the article history is necessary to comply with section 4 I of the GFDL (so we don't plagiarise other wikis) - you can use {{Translated page}} to easily do that with translated pages (an example is Seraphita). --Malkinann (talk) 15:51, 22 March 2009 (UTC)[reply]

Current example

See Wikipedia:Featured article candidates/Federal Bridge Gross Weight Formula. SandyGeorgia (Talk) 18:20, 21 March 2009 (UTC)[reply]

Thank you! Awadewit (talk) 15:05, 22 March 2009 (UTC)[reply]

Timing ?

Is there any chance of this being completed for Monday, March 30 ? (Otherwise, I'm going to have to pull something out of a hat.) SandyGeorgia (Talk) 12:50, 28 March 2009 (UTC)[reply]

Is anyone else working on this? I can't be done by tomorrow, I'm afraid. This needs much more work. Awadewit (talk) 00:32, 30 March 2009 (UTC)[reply]

Thoughts

Hi. :) SandyGeorgia mentioned this to me, and though I don't do as much with plagiarism (by a long shot) as I do with copyright infringement, I wanted to drop in and see if I could offer input. One thing I noted that could be contentious is this: "Very often plagiarism is accidental or inadvertent—it is still plagiarism." I agree with this, as it fits into my understanding of plagiarism, but I have learned that this is not a universally held opinion. In fact, it's fairly hotly contested here by one individual, and I recently ran into another on (I think!) ANI who strongly voiced similar concerns. I don't know how widespread that debate is, but I bring it up in case it's worth a footnote or expansion. I have no experience whatsoever in the writing of the FCDW. :)

Are you aware that there is a Wikipedia essay on close paraphrasing? It's a rather young essay, but might prove helpful. --Moonriddengirl (talk) 15:08, 30 March 2009 (UTC)[reply]

Since I teach writing at the college level, I am quite jaded about plagiarism. I do not believe that it is really done unknowingly by anyone in the US over the age of 14 or so. However, on Wikipedia, we should "assume good faith" because we don't know the educational background of most of the editors. Besides, presenting this dispatch in the best possible terms to plagiarizers out there ("we know you didn't mean it - we'll give you a second chance") is probably politically savvy. :) Awadewit (talk) 02:32, 5 April 2009 (UTC)[reply]

Some overhauling

Hi. Apologies if I've stepped on any toes with my revisions here. I hope that the changes I've got will seem to be in a constructive direction. :) I'm hoping to get some feedback and also to find out if this should address handling problem plagiarists. Is it worth mentioning that repeat offenders may require community intervention? --Moonriddengirl (talk) 23:45, 2 April 2009 (UTC)[reply]

I agree that it is looking very good, thanks mainly to the efforts of you two. Question: given its length, would a short précis or executive summary at the outset be appropriate? Kablammo (talk) 18:16, 5 April 2009 (UTC)[reply]
I am of two minds about a summary: A summary would be good as it would repeat key points, but it might also discourage readers from reading the entire dispatch. I'm not leaning strongly one way or the other. Awadewit (talk) 19:17, 6 April 2009 (UTC)[reply]
I added a summary sentence at the outset, by this edit. Kablammo (talk) 19:26, 6 April 2009 (UTC)[reply]

←I think that may need some work: "Wikipedia editors should create their own articles, not adopt, adapt, or rewrite the work of others." We actually devote a good bit of this essay to explaining precisely how the work of others should be adapted. Telling them not to do that could be confusing. :) --Moonriddengirl (talk) 19:30, 6 April 2009 (UTC)[reply]

I've taken out "adapt and rewrite". If it can otherwise be improved, feel free. Kablammo (talk) 19:35, 6 April 2009 (UTC)[reply]
I'm inclined to think that I might like the opening sentence somewhat better as it was. It's punchy and to the point. I do agree, though, that the lead needs expansion. Perhaps the expansion could be placed after it is still plagiarism? --Moonriddengirl (talk) 19:41, 6 April 2009 (UTC)[reply]
I thought of using the thought to close the dispatch, but it does not really fit there. Can you take a crack at a (modest) expansion of the introduction? Kablammo (talk) 19:50, 6 April 2009 (UTC)[reply]
Okay, I've taken a stab. :) I moved your sentence a bit back and added some sourced "why we shouldn't do this" notes. (For some reason, the editor1-first and editor1-last templates aren't working properly for me in the citebook, though. :/ Can anybody see what I did wrong?) (Oh, by the way, I do believe that it is a common problem that people inadvertently bring material back to the original language; it's an aspect of cryptomnesia—to quote Kellogg on the matter, "inadvertent plagiarism [when]...a writer fails to acknowledge unwittingly an earlier source due to the failure to recognize his or her own thoughts and words as unoriginal."(Kellogg, Ronald Thomas (1999). The Psychology of Writing. Oxford University Press US. p. 85. ISBN 0195129083.) On the "original research" side of things, it has happened to me more than once, and I believe I've observed it directly both in my own college writing students and in professionals I've edited. It's always possible, of course, that at least some of them have lied. :) --Moonriddengirl (talk) 23:47, 6 April 2009 (UTC)[reply]

Disagreement

Less-commonly known facts ... must be cited to avoid plagiarism

This is not necessarily true. That companies often elect to file executive compensation information with the SEC in DEF 14A filings instead of 10-K filings is a "less commonly known fact". It may indeed require citation to be used in Wikipedia (per WP:V, et al), but inclusion of that fact would not be plagiarism. Эlcobbola talk 14:54, 3 April 2009 (UTC)[reply]

Perhaps there's a better way to word it? California State San Marcos refers to ubiquity in determining which facts are common within a discipline and which are not. --Moonriddengirl (talk) 15:25, 3 April 2009 (UTC)[reply]
On a perhaps personal level, I disagree with the notion that any such fact (well-known or otherwise) can be plagiarized. The definition of plagiarism used in this dispatch sets forth “language” and “ideas” as the elements that may not be too closely imitated without acknowledgment of the source. “Language” (in this context - word choice) and “ideas” are creative things. Facts are not; they are supposed to be “raw” statements of truth and are devoid of creativity by nature. Perhaps I can’t argue with a reliable source, but rephrasing to ensure we are addressing the logical consequence of the definition of plagiarism would be beneficial -- id est, it is not the fact that is being plagiarized, but the [unique/original/etc] way in which it is being presented. Эlcobbola talk 15:42, 3 April 2009 (UTC)[reply]
Well, my professional background undoubtedly informs my opinion of plagiarism, but I believe that while copyright law does not recognize "sweat of the brow", journalistic ethics basically do. I know that academic ethics do. The definition of plagiarism adopted by the Council of Writing Program Administrators is specialized to instructional settings, but it says that "plagiarism occurs when a writer deliberately uses someone else’s language, ideas, or other original (not common-knowledge) material without acknowledging its source."([1]) To quote Tim Roberts' Student Plagiarism in an Online World, "The concept of "common vs. plagiarized knowledge" is an issue that needs teasing out to as to clarify the debate around plagiarism.... Common knowledge is that set of shared ideas and background knowledge that defines a demographic group. Plagiarized knowledge is knowledge defined as belonging to another person and used without that person's permission and without acknowledging that other person."(Roberts, Tim S. (2007). Student Plagiarism in an Online World: Problems and Solutions. Idea Group Inc. p. 196. ISBN 1599048019.—and just as a note, there's some interesting material in that book on the history of knowledge as property.) While I agree that the thrust of this dispatch should be on avoiding plagiarizing the creative aspects of sources, I don't think we should completely omit reference to plagiarized knowledge, even though I didn't use that term in the dispatch itself. (It is not, I believe, widely adopted and may be outright owned by Roberts. :)) Emory and the Walter Cronkite School of Journalism and Mass Communication seem to me to support the idea that the journalistic code of ethics considers failure to attribute (non-ubiquitous) information as plagiarism, too, although neither includes specific language in a clear definition. --Moonriddengirl (talk) 16:47, 3 April 2009 (UTC)[reply]
I agree that the choice of examples can be plagiarized - I view this as part of organizational structure. Such things are not necessarily protected by copyright, but, ethically, one is required to say where the examples came from, right? However, the much more pressing matter is plagiarism of creative expression. I think we have struck the right balance here. Thoughts? Awadewit (talk) 19:20, 6 April 2009 (UTC)[reply]

Decision time

Unless editors involved here tell me this is ready to run this week (April 6, Sunday night/Monday morning), I will submit another Dispatch this week, and run this on April 13. Feedback? SandyGeorgia (Talk) 17:09, 4 April 2009 (UTC)[reply]

I think it's down to fine-tuning, and I think it should be ready for this week. --Moonriddengirl (talk) 17:12, 4 April 2009 (UTC)[reply]
I can work some more on it this weekend - it should be ready by Sunday/Monday. Awadewit (talk) 18:24, 4 April 2009 (UTC)[reply]

Rethinking this: since we want this to get widely viewed, and many people may be on vacation or break this week, it might be better to hold off a week ??? SandyGeorgia (Talk) 20:05, 4 April 2009 (UTC)[reply]

Ok. We still need someone to add a section on foreign-language plagiarism. This is not my area of expertise, however. Could Jbmurray or Elcobbola add this? Awadewit (talk) 02:28, 5 April 2009 (UTC)[reply]
Elcobbola has done some writing on this in the past; perhaps we can entice him to add something? I think he already has some wording somewhere in his talk page archives. SandyGeorgia (Talk) 02:32, 5 April 2009 (UTC)[reply]
The only other thing we should probably add is something about habitual plagiarists. I think Moonriddengirl was going to add this. I've done some copyediting, so I think the dispatch is pretty solid right now. Awadewit (talk) 03:45, 5 April 2009 (UTC)[reply]
Hmm. I've been swamped by copyvio issues today (another massive infringer). I'm not sure that I'd have a lot to say about mass-plagiarists, though I do think we should mention. What's the general protocol for plagiarists? ANI? RFC? Copyright infringers require admin action, but plagiarists may be different. --Moonriddengirl (talk) 19:26, 5 April 2009 (UTC)[reply]
I don't think there is a general protocol - that is part of the problem. Awadewit (talk) 19:29, 5 April 2009 (UTC)[reply]
I suggested both. Briefly addressed; open for revision and improvement. :) --Moonriddengirl (talk) 00:15, 6 April 2009 (UTC)[reply]
Do you think we should mention that there isn't a general protocol laid out in an accepted policy yet and note this as a problem? Awadewit (talk) 19:21, 6 April 2009 (UTC)[reply]

Discussion after "publication"

When this piece is "published", will this talk page go with it? If so, it might be a good idea to archive it, so that discussions prompted by the publication can start afresh. (Discussions here of course will still be accessible in the archive.) And should individual processes and projects be notified, and, if so, how? Kablammo (talk) 18:27, 5 April 2009 (UTC)[reply]

Yes, the talk page goes with, and we could archive this page just before publication. Kablammo, since you are (I believe) the editor here who has been most involved with DYK, perhaps it would be helpful if you presented it with an introductory blurb at DYK? Someone else may want to present it at FAC and elsewhere? SandyGeorgia (Talk) 18:47, 5 April 2009 (UTC)[reply]
I shudda kep ma big mouf' shut. I see that Awadewit is pretty active there now-- I'm a recovering DYKer. Kablammo (talk) 18:54, 5 April 2009 (UTC)[reply]
I noticed Mattisse is already linking people to this dispatch over at DYK! :) I'll drop a note over there and at FAC. Who wants to take GAN and PR? Other places? Awadewit (talk) 19:22, 6 April 2009 (UTC)[reply]
eeeek. Can we hold off until it's published, and Tony has gone through? SandyGeorgia (Talk) 21:15, 6 April 2009 (UTC)[reply]

Scheduled to publish at The Signpost on April 13: Wikipedia:Wikipedia Signpost/Newsroom; it would be good to have it finished by Friday the 10th. SandyGeorgia (Talk) 00:18, 7 April 2009 (UTC)[reply]

Personally, I think it's good to go. Of course, something like this can always be micromanaged eternally, but it reads smoothly and it seems to me to cover all major elements with proper focus on each. I've enjoyed collaborating on it. :) --Moonriddengirl (talk) 12:42, 9 April 2009 (UTC)[reply]
A couple of fine points (sorry for the micromanaging):
  • Plagiarism and copyright infringement: "If this was indeed the reaction of Wikipedia editors, they were mistaken." Is there a source we could cite which would directly support the assertion that use of PD sources is plagiarism?
  • What to cite: the "common knowledge" exception: "Accordingly, while text such as 'Dickinson was born on December 10, 1830' can be copied without quotation marks, care must be taken." This seems to be an incomplete thought. Can it be expanded to state the point directly?
  • Addressing plagiarism-- Would it be better to have the template before the first and second paragraphs, and/(or)with an introductory sentence, to avoid confusion?
Thanks, and I have enjoyed the collaboration as well. Kablammo (talk) 13:56, 9 April 2009 (UTC)[reply]
I've added more to the Dickinson sentence. As to the template, I suspect it's there illustratively, and as I am relatively devoid of aesthetic sense, I will leave that to somebody else to address. :) I can see the potential confusion, though. I wonder if we could enclose it in a colored box or something? Take a screen cap and tuck it to the side? With respect to Tony's hidden note about "Atypical elegance", I am inclined to agree and am going to try merging it to the next red flag. If it doesn't work for people, I naturally have no objection if somebody restores what we had while we discuss different approaches. --Moonriddengirl (talk) 14:20, 10 April 2009 (UTC)[reply]
As I sat down to try to actually merge this, it seemed to me that it was probably already implied in "hasty construction" and "inconsistent authorial voice". I did add a specific reference to "atypical elegance" in the latter, but didn't merge any of the material. Thoughts? --Moonriddengirl (talk) 14:24, 10 April 2009 (UTC)[reply]
The template is there illustratively - I don't know how to put it in a box, though. I disagree very strongly about merging "atypical elegance" - that is, in fact, how I catch most plagiarism. The bulk of Wikipedia is very poorly written and I would never make the claim that Wikipedia is "moving towards professionalism". Awadewit (talk) 14:26, 10 April 2009 (UTC)[reply]
All right. I've unmerged it. :) Is there some way that we can tweak this so that we don't create paranoia where more professional writing should be expected, as with heavily polished articles? I would say experienced contributors, but, sadly, I've had cause lately to know all too well that "experienced" doesn't mean "not going to infringe/plagiarize". :/ --Moonriddengirl (talk) 14:30, 10 April 2009 (UTC)[reply]
We could list examples of where excellent writing is expected (such as FAs), but note that this does not always mean that plagiarism doesn't exist. Awadewit (talk) 14:32, 10 April 2009 (UTC)[reply]
No, though it would make the elegance of the writing less of a red flag in itself. Shall I try to come up with some language for that, or would you like to? --Moonriddengirl (talk) 14:35, 10 April 2009 (UTC)[reply]
I have to leave to go teach right now, so I will be gone for several hours. If you could do it, that would be great. Awadewit (talk) 14:50, 10 April 2009 (UTC)[reply]
Have fun. :) (I miss teaching. I haven't been in the classroom in years. :/) --Moonriddengirl (talk) 14:54, 10 April 2009 (UTC)[reply]
Argh! I was asked for a "10 minute" version of Hindus vs. Muslims in India and was rather unprepared. :( Working on that this weekend. Awadewit (talk) 23:35, 10 April 2009 (UTC)[reply]

Attribution

In the past several weeks I have seen an EB1911-based article (John Byng) and a DANFS article (USS Franklin (CV-13)) attract attention because of links from WP:On this day. In each case there was criticism of the tone of the borrowed text (NPOV), and changes to it. Where the original text is not directly cited, but covered only by a template indicating the PD source at the bottom of the article, and uncited changes are made to that original text, does that cause any issues with attribution or license? I recognize this is too fine a detail to deal with in this dispatch, but the answer might affect the discussion. Kablammo (talk) 19:44, 5 April 2009 (UTC)[reply]

Oh, now, there's a whole new kettle of fish. :) It's touched on at the existing plagiarism guideline proposal, here. According to that guideline, the use of this material is a bit contentious anyways. That guideline recommends placing the pd text in one fell sweep so that its attribution is clear in the history of the article, just as the attribution of Wikipedia's own contributors is. This is a kind of specialized problem, obviously, arising from the fact that our articles do not have by-lines of their own. If they did, then all copied text would require quotation marks, just as they do in academics, where it doesn't matter if the source is pd or otherwise. I am personally inclined to think that it may be best to leave this one for resolution to that proposal. --Moonriddengirl (talk) 20:14, 5 April 2009 (UTC)[reply]
Thank you for that clarification. As your links show, PD text is often used as the base for the article, but is modified before publication, so the article which appears is an olio of both the PD source and the editor's work, confusing the attribution. I understand that subsequent changes are reflected in the edit history. Kablammo (talk) 14:00, 9 April 2009 (UTC)[reply]

Adaptation examples

Tony wonders if we can lose one of the examples for length concerns, [2]. I'm inclined to agree. Both are good examples, but cover similar ground. Since example 2 has more analysis, I've cut out example 1, but retained some of the language, which I quite liked. I've also removed the level five header by replacing it with whatever you call that thing that happens when you put ";" before text. :) I've also wikilinked to the article being used as an example. We don't want to embarrass anybody, obviously, but there are copyright issues with reproducing the text here without attribution.

Thoughts? --Moonriddengirl (talk) 11:44, 7 April 2009 (UTC)[reply]

That is fine with me. My biggest worry with linking to the article is that it would highlight one particular editor's plagiarism, but I do think we need specific examples. Awadewit (talk) 18:15, 8 April 2009 (UTC)[reply]
I share your concern about embarrassing any particular contributor. We could substitute an example from a contributor whose plagiarism is already well-known and well-established? I seem to recall that a contributor working on science or perhaps botany related articles was brought to ANI for plagiarism. Obviously, I don't remember many details. :) But if we used an example from the contribution history of such a contributor, there would be no potential further damage to reputation? --Moonriddengirl (talk) 14:05, 9 April 2009 (UTC)[reply]
A repeat offender would be better, yes. Does anyone know where to find this example? Awadewit (talk) 17:06, 9 April 2009 (UTC)[reply]
I've put out some queries, but my recollection (perhaps faulty) is that it wasn't an ArbCom, rather a long discussion at ANI. We could ask at WP:ANI? SandyGeorgia (Talk) 17:08, 9 April 2009 (UTC)[reply]
Is this it? Wikipedia talk:WikiProject Gastropods/Subpage for organizing CopyVio Cleanup Awadewit (talk) 17:12, 9 April 2009 (UTC)[reply]
I believe so ! SandyGeorgia (Talk) 17:17, 9 April 2009 (UTC)[reply]
No, that's not the one I meant. That was a straight-up copyright violation, not a plagiarism issue, and the examples we'd have are straightforward pasting. I'll see what I can find. :) --Moonriddengirl (talk) 17:18, 9 April 2009 (UTC)[reply]
Was it editor Sadi Carnot (sp?) or socks? We could review ArbCom archives if we had a clue ... SandyGeorgia (Talk) 17:23, 9 April 2009 (UTC)[reply]

←I know it's incredibly vague. :/ Sorry. I may not be remembering correctly at all, since it's been a while. I'm thinking it was some science-related field. I'm currently doing a google search of Wikipedia to see what I can find for "ban + plagiarism". Maybe it'll turn up. If not, maybe something else will. :) --Moonriddengirl (talk) 17:26, 9 April 2009 (UTC)[reply]

Just to update, I've found an example where an external source plagiarized on Wikipedia. Some of the duplication is verbatim, but some is not: User:Ydorb/khobar-copyvio, Wikipedia:Wikipedia Signpost/2007-11-19/Khobar plagiarism. If needs must, we could probably craft an example from this. Meanwhile, still looking. :) --Moonriddengirl (talk) 17:45, 9 April 2009 (UTC)[reply]
I think that would confuse matters. We really need an example of contributors plagiarizing. Awadewit (talk) 17:56, 9 April 2009 (UTC)[reply]
Wikipedia:Wikipedia Signpost/2006-10-30/Plagiarism cleanup: this one (history still intact) offers a little revision of the source. It also generates this list, which I'm examining now. --Moonriddengirl (talk) 18:00, 9 April 2009 (UTC)[reply]

← Moonriddengirl: I think this may be the incident of plagiarism that you're thinking of. Please to enjoy. :) MastCell Talk 18:01, 9 April 2009 (UTC)[reply]

Whoot! Yes! That's it! Thank you. :D I'll go see if there's something usable. --Moonriddengirl (talk) 18:05, 9 April 2009 (UTC)[reply]
I have not yet found any examples in her contrib history (deleted or otherwise) that are quite as good as the one we currently have. :/ It seems we either compromise on quality or risk embarrassing the contributor of a sample like the one currently in use, if people trouble to track it down. Thoughts? --Moonriddengirl (talk) 19:05, 9 April 2009 (UTC)[reply]
I think the one we have is particularly good, because it is the kind of adaptation that people might not consider plagiarism. How about we archive the talk page of the article, so that the editors involved are not so obvious? Awadewit (talk) 19:09, 9 April 2009 (UTC)[reply]
The three listed under Current examples (above) were already highlighted at FAC. Archiving the talk page would be good, but is everything covered? For example, foreign language translations? And a citation for no copying of PD text? SandyGeorgia (Talk) 19:12, 9 April 2009 (UTC)[reply]
I'll bring in foreign language translations. Citation for not copying PD text? Do you mean, specifically mentioning that it's still plagiarism to steal the ideas of somebody in public domain? --Moonriddengirl (talk) 19:14, 9 April 2009 (UTC)[reply]
On PD text, I'm referring to this commentary: [3] On foreign languages, there's discussion at User talk:Elcobbola. SandyGeorgia (Talk) 19:16, 9 April 2009 (UTC)[reply]

←I've sourced PD. I've added a brief note on foreign language texts, though I did it in the section on spotting plagiarism. I would like to find a more clear citation, but haven't had luck. The one I've got I'd have to reproduce practically a full paragraph to incorporate the word "plagiarism" in conjunction with the problem. Thoughts? --Moonriddengirl (talk) 19:54, 9 April 2009 (UTC)[reply]

Inlines to be resolved


I'm putting the remaining inline queries here, as they can be hard to follow:

  • If the article has a multi-authored feel but appears to be largely single-authored, there could be reason for concern< !--Why?-->.
  • that are added to the top of suspect section or article and may draw attention to the problem; concerns might be noted at an appropriate forum< !-- which is ? -->

SandyGeorgia (Talk) 14:27, 10 April 2009 (UTC)[reply]

I've answered the first one, I hope, though the language may need tweaking. As to the second: good question. Right now, I would myself go to Wikipedia talk:Plagiarism. It's not in the scope of Wikipedia:WikiProject Copyright Cleanup. But a glance at the talk page of that proposed guideline tells me that a request is unlikely to be noted there. Unless somebody else can think of a good forum, maybe that needs to be dropped. :) --Moonriddengirl (talk) 14:37, 10 April 2009 (UTC)[reply]
Am I understanding correctly that we currently have no forum for plagiarism cleanup, because it's outside of the scope of Copyright Cleanup? Wow. Then I guess we have to drop it. SandyGeorgia (Talk) 14:45, 10 April 2009 (UTC)[reply]
So it seems. :/ We can suggest Copyright Cleanup. It doesn't have a large membership yet, but I imagine that the members there would be happy to look at plagiarism issues even if it's not in the mission statement; there's overlap, after all, since plagiarism and copyright violations often do go hand-in-hand. --Moonriddengirl (talk) 14:47, 10 April 2009 (UTC)[reply]
I had suggested earlier that we explicitly state that one of the problems of not having an accepted guideline is that there are no protocols for plagiarism - another problem is there are no projects. Awadewit (talk) 14:49, 10 April 2009 (UTC)[reply]
Given clear guidelines on plagiarism, it would seem natural to make addressing it a part of Copyright Cleanup, since there often is overlap. I would love to see the plagiarism guideline adopted by the community. --Moonriddengirl (talk) 14:53, 10 April 2009 (UTC)[reply]
So, should we address this directly (state that there is no current forum or policy, but suggest Copyright Cleanup)? Also, {{Close paraphrase}} seems to be the better template to illustrate plagiarism; should we switch the sample template? SandyGeorgia (Talk) 14:56, 10 April 2009 (UTC)[reply]

←I've taken a stab at it. I had switched the template, but switched it back. {{Close paraphrase}} was written in service to copyright issues and specifies that the source is not free. {{Copypaste}} at least covers both free- and non-free text. Unless the Close paraphrase template is modified, along with the essay it links to, we might want to use the one that covers both. I have pasted the code directly rather than adding the template because we don't want this page to be listed for clean-up with the automatically added categories. Unless pages with this prefix are exempt? --Moonriddengirl (talk) 15:19, 10 April 2009 (UTC)[reply]

Looks like these two sections are all wrapped up now; shall I archive them now? SandyGeorgia (Talk) 20:00, 10 April 2009 (UTC)[reply]

If others are happy, I'm happy. :) --Moonriddengirl (talk) 20:01, 10 April 2009 (UTC)[reply]
I'm happy. Awadewit (talk) 23:40, 10 April 2009 (UTC)[reply]

Nice essay


This is a very helpful essay. Perhaps some of it may make its way into the various related "guidelines" like "How to write a great article". I hope the Dispatches section continues to concentrate on content and quality, instead of featuring interviews of people about their work on Wikipedia. One recent Dispatches interview focused on an editor who I consider to be counterproductive to the entire project, a bully in discussions, and who basically only writes stub and start articles.[Oops, sorry, I was thinking of the project report section of Signpost. -- Ss] So, seeing this good essay by this distinguished group of content experts is very, very welcome. Thank you! -- Ssilvers (talk) 17:45, 13 April 2009 (UTC)[reply]

Thanks! Awadewit (talk) 19:00, 13 April 2009 (UTC)[reply]

Thanks from me too. Glad that the proposed guideline (Wikipedia:Plagiarism) was mentioned in passing (thought for a moment it had been left out!). I've added a link here from there, and hopefully what has been written here will be incorporated in some form over there. If anyone wants to have a go at reconciling any differences, that would be even better! Carcharoth (talk) 19:34, 13 April 2009 (UTC)[reply]

Signpost article takes an extreme position here without crediting other points of view


The Signpost article reads: "If an article seems to follow the language and structure of another work too closely, first consider whether it is a matter of copyright infringement or plagiarism. . . If the source is free, steps should be taken to remedy plagiarism. Wikipedia's proposed guideline on plagiarism suggests politely discussing concerns with the contributor. Further steps may need to be taken to address contributors who persist in plagiarism after being made aware of the problem, through Requests for comment or—if the contributor proves disruptive—through a report at the administrator's incidents noticeboard. The plagiarism will also need to be repaired as soon as possible."

Yet even the proposed plagiarism guideline accepts that attributed imports of PD material are not considered plagiarism: "Assuming that some type of public domain material is available and welcome, a good practice to use when copying free content verbatim is to indicate in the edit summary the source of the material. . . If you do choose to use verbatim material from a public domain source, you should attribute it properly."

Asking people to "remedy" these imports and labeling such contributions as "plagiarism" is ridiculous. That the above opinion is published without acknowledging that there is any dissent, or that consensus and long-standing practice at Wikipedia accept such imports, leaves me speechless. Contributions such as this do not make me a plagiarist.

Please correct this article to include the long-standing acceptance of PD imports as detailed by the proposed guideline you quote elsewhere in article.--BirgitteSB 20:50, 13 April 2009 (UTC)[reply]

I'm sorry that there is difference between this document and the proposed guideline on "plagiarism." That language copied from public domain sources is regarded as plagiarism is supported by footnote #4, "To avoid charges of plagiarism, authors of scholarly works ... always give proper credit to the sources of their ideas and facts, as well as any words they borrow. This is so even if the work borrowed from is in the public domain."(Fishman, Stephen (2008). Public domain: how to find & use copyright-free writings, music, art & more (4th, illustrated revised ed.). Nolo. p. 35. ISBN 1413308589.) I would imagine one of the reasons that the guideline on plagiarism has not yet been adopted more than a year after it was created is that the definitions of and handling of plagiarism remain hotly disputed, even among editors who are all motivated by a desire to keep Wikipedia free of it. --Moonriddengirl (talk) 21:05, 13 April 2009 (UTC)[reply]
My issue is not that such a difference exists, although the authors' willingness to cherrypick from it is misleading. My issue is that the Signpost article pretends that the dispute you acknowledge does not exist but rather presents only one point of view on the issue. I understand that there are people who hold the view supported by the article, but to present only that extreme viewpoint followed immediately by such strong language as I quoted is against the spirit of Wikipedia.--BirgitteSB 21:29, 13 April 2009 (UTC)[reply]
As far as I know, this view is quite common. I am one of the authors of this document, and it certainly wasn't my intention to "cherrypick." While searching sources, I promise, I didn't personally pass up anything that said that using language from a public domain source was not plagiarism. (In fact, I have myself been guilty of plagiarism on Wikipedia under this definition, though certainly not through any intention to deceive.) I'm not the sole author of this document, but I would have no problem with acknowledging that there is an alternative point of view on Wikipedia, although I would not be comfortable with giving them equal weight, so to speak, without some reliable source indicating that using such text isn't considered plagiarism amongst academic circles. This is the first Dispatch I've ever been invited to work on, but so far as I understood, it is an essay based on the perspective of its named authors. Since Wikipedia's practices will be judged by a much larger community than our own, I think reflecting the judgment and value system implemented by that larger community is best, whatever reliable sources may document it to be. --Moonriddengirl (talk) 21:40, 13 April 2009 (UTC)[reply]
(ec) "I didn't personally pass up anything that said that using language from a public domain source was not plagiarism." That is not my position. Of course importing PD text without attribution is still plagiarism. The sources you link to here are really geared toward students and authors, so I can understand that they have no comment on the situation of editors or compilers. But I cannot believe that you saw no sources which limited their definition of plagiarism to works where one is claiming some amount of authorship. To put the definition of plagiarism in context I will quote my "Writer's Reference" (also by Diana Hacker, ISBN 0-312-13417-7) with the surrounding context
While I do not doubt that your preferred definition is valid for research papers and anything else which one claims authorship on, a blatant acknowledgment that one is wholesale copying a PD source as an editor is completely different application. And it is uncommon to consider wholesale copying with attribution as plagiarism when the copier is not claiming any authorship for the work. I have come across no sources which describe a situation where an editor compiling attributed sources in a separate work as plagiarism. Yet examples of anthologies compiled by someone credited as an editor are common. Every definition of plagiarism is given within context of authorship. Dumping a PD text verbatim as an article is not authorship.--BirgitteSB 23:39, 13 April 2009 (UTC)[reply]
"failing to enclose borrowed language in quotation marks" seems to be precisely what we're talking about. Perhaps I've misunderstood the nature of your concern? You say, "attributed imports of PD material are not considered plagiarism" at the proposed guideline. This document doesn't regard them as plagiarism, either, as long as borrowed language is enclosed in quotation marks and summaries/paraphrases are properly rewritten. Is there something in this essay that inadvertently suggests otherwise? --Moonriddengirl (talk) 23:45, 13 April 2009 (UTC)[reply]
(ec - to Birgitte). Sure, but what happens if someone else comes along and modifies the imported PD article? Does it matter if they only change it a bit or a lot? At what point does it become joint authorship of a new article and what point does the attribution need to change and how do you separate out the attributions needed? This is the fundamental difference between wikisource and wikipedia. Wikisource preserves the PD text verbatim. Wikipedia edits it mercilessly. Carcharoth (talk) 23:49, 13 April 2009 (UTC)[reply]
Reading this comment makes me suspect that the remark by Birgitte I've failed to follow is the more recent. I gather that the quote was placed not to support an alternative view but simply to verify that in some circumstances plagiarism does require quotation? I can't see how that helps verify the opposite, but at least it leaves me less lost. :) Birgitte, I assure you that I encountered no source anywhere that distinguished definitions of plagiarism or requirements for quotation marks/proper paraphrase/attribution of public domain text with respect to the purpose of editing versus authorship. That doesn't mean they don't exist, only that I haven't seen them. I would also say that Carcharoth has a good point in that Wikisource retains the full integrity of the original and readers know beyond doubt that the attributed author wrote every word, whereas a Wikipedia article does not. --Moonriddengirl (talk) 00:04, 14 April 2009 (UTC)[reply]
No, it is to verify that plagiarism requires claims of authorship. Your sources on the definition are within the context of claiming authorship. What happens afterward does need guidance; however, that is a separate issue. This article reads "If an article seems to follow the language and structure of another work too closely, first consider whether it is a matter of copyright infringement or plagiarism. . . If the source is free, steps should be taken to remedy plagiarism." This suggests that labeled PD imports qualify as plagiarism, which is inaccurate. And my big problem is that the article omits that this is disputed and then follows with strong language such as labeling editors as disruptive.--BirgitteSB 00:17, 14 April 2009 (UTC)[reply]
This dispatch labels no editors disruptive. It says, verbatim, "Further steps may need to be taken to address contributors who persist in plagiarism after being made aware of the problem, through Requests for comment or—if the contributor proves disruptive—through a report at the administrator's incidents noticeboard." Not only does it not label any editors as disruptive, but it explicitly allows that RfC is the forum of first resort. (Editing to add: second resort, actually. First resort is polite discussion.) Citing a definition of plagiarism for circumstances wherein authorship is presumed doesn't verify that plagiarism does not exist where it isn't. Additionally, Wikipedia's contributors are authors. We are not simply anthologizers, but produce original text that is—so long as we do our job properly—granted its own copyright. --Moonriddengirl (talk) 00:28, 14 April 2009 (UTC)[reply]
The steps to be taken, including labeling continued editing of this kind as disruptive, immediately follow the bit about free content. Sorry but I find it hard to understand that even though this article is written only acknowledging the overly inclusive interpretation of plagiarism the bit on remedies is intended to use a different interpretation of plagiarism (an unknown one that the community will hypothetically agree through future RFCs) when referring to it as disruptive. At the very least, you must concede that the article can be read as I interpreted it and could be clarified as you intended otherwise. Honestly I think you believe in your opinion and are trying to scare people straight here. I would even agree with the strong language combined with a more accurate analysis of beliefs on the topic. But you cannot pair that kind of strong language with non-consensus interpretations and not expect to hear objections. Contributors are only authors when they are authors. They are mostly editors though. You know enough about copyright to know where that line is so I will spare you the details. If there were any doubt about that, the "History" of contributions has been definitively acknowledged as not equivalent to "authorship". If the licensing debate has any settled points that is one of them.--BirgitteSB 00:58, 14 April 2009 (UTC)[reply]
Well, I can certainly see why this topic is contentious. Please point me to a reliable source that says public domain material does not have to be cited and quoted the way every other source does. The only place I have ever seen this argument is on Wikipedia. It seems that it was an argument adopted for expedience which has now become entrenched. It goes against everything I have read on plagiarism and writing ethics (I teach writing at the college level). However, if there are other sources out there, I am prepared to listen to them. I don't think we should be debating this issue without them, however. Awadewit (talk) 01:07, 14 April 2009 (UTC)[reply]
What does that request have to do with plagiarism? I quoted a source which showed how plagiarism is defined within the context of claiming authorship. This is the point. Attributed copying without claiming any credit of authorship is not plagiarism. Show me a source that states otherwise.--BirgitteSB 01:29, 14 April 2009 (UTC)[reply]
That is not relevant. You are claiming that Wikipedians can copy text word-for-word and that they have no obligation to put quotation marks around that material - that is a stunning claim. I am asking for a source that explains why PD text should be treated any differently. That is the crux of the issue. Awadewit (talk) 01:33, 14 April 2009 (UTC)[reply]

(unindent) I am claiming the following: Plagiarism is defined within the context of claiming authorship and attributed copying without claiming any credit of authorship is not plagiarism. Nothing more, nothing less. This is not plagiarism, and neither is this. Nor is any similar edit to a privately owned wiki plagiarism. I am not your student claiming it was my work. I am not claiming any sort of authorship at all. Wikipedia is not an academic project. It is not my writing and I am not a plagiarist. You cannot simply expand the interpretation of plagiarism through your own inability to see outside the box. Not everyone is writing here. Historically a lot of not-writing has been a part of the contributions here. And you can't suddenly declare everything associated with Wikipedia as writing and subject to the rules of writing merely because it is what you best relate to.--BirgitteSB 01:55, 14 April 2009 (UTC)[reply]

I'm not sure how widespread the notion is that Wikipedia's contributors are not authors; I set forth the reasons why I believe they are here. Without rereading that lengthy conversation, I only remember one contributor to that conversation asserting that they were not. As to the rest, Carcharoth seemed to have some cogent points above. --Moonriddengirl (talk) 01:59, 14 April 2009 (UTC)[reply]
They certainly can be authors if they write something. But if you are implying that every person listed in the history tab is an author of the attached article; that has been definitively negated as I said above.--BirgitteSB 02:06, 14 April 2009 (UTC)[reply]
No. As I likely said in that linked page, editors who correct typos and formatting are not authors. They are copy-editors. But our articles are not collections of collaboration by copy-editors; they are creative constructs put together by one or more authors. You may not be claiming authorship, but I believe there is a common understanding (as well as a legal one, given the governance of copyright) that an article is an authored piece. --Moonriddengirl (talk) 02:13, 14 April 2009 (UTC)[reply]
Of course an article is authored. But not necessarily by Wikipedians. Some articles are authored by Wikipedians, but by no means are all. The goal has always been to distribute free, neutral, encyclopedic information. Rather than a writing exercise. Sometimes it has become a writing exercise, sometimes it is more an exercise of compilation. Finding and importing what is lacking. If I could write scripts, I would have imported the whole Silvics manual nicely formatted. Imports had been common back then. It has never been "Wikipedia the originally authored encyclopedia". To say Wikipedians wrote it all is revisionist history. I am not saying there is no reason for plagiarism concerns, but you push it too far in treating this like some University project with contributors as students working for credit--BirgitteSB 02:39, 14 April 2009 (UTC)[reply]
Actually, my thought is more along the lines of treating it like a professional encyclopedia, with similar ethical standards. Again, I have myself read no sources to indicate different standards for plagiarism there. This doesn't mean they don't exist. I won't have opportunity to visit my real library any time soon, but my various search efforts at google books since this conversation started haven't been helpful. Since leaving the ivory tower over a decade ago, my professional career has had far more to do with watching for copyright infringement than checking for plagiarism and even when I was in the ivory tower, misuse of public domain text was seldom an issue of concern among my students. In spite of your expressed opinion that I'm trying to scare people into adopting my view, my opinion on verbatim use of public domain text has been largely unformed. As I indicated above, I've used the public domain templates myself. The only references to public domain & plagiarism that I was able to find indicated that verbatim duplication of language from them without quotation marks is plagiarism. If this is the view of scholarly circles, then I imagine this is the view Wikipedia would want to take, if it is to be taken seriously in scholarly circles. If reputable, published encyclopedias are free to copy material from the 1914 Catholic Encyclopedia without drawing scorn, then so be it. I advocate living up to prevailing encyclopedia standards, not forging new ones.
In terms of Wikipedia's purpose, I have absolutely not been here from the beginning. But I rather thought that the goal had always been to write a collaborative encyclopedia, as indicated here. After all, in 2001 Jimbo seemed to indicate as much (according to our article on the history of Wikipedia, anyway, at note 20, when he said, "One is Wikipedia (www. wikipedia. com) which uses clever software to build an encyclopaedia from scratch.... At the Wikipedia, anyone can write about any subject they know about. The idea is that over time, enough experts will offer their knowledge for free and build up the world's ultimate hand-built database of knowledge." If we're simply importing the work of others, we're not building an encyclopedia from scratch or writing about subjects we know about. While WikiSource provides an excellent opportunity to compile previously published information, I liked the idea that Wikipedians might say, "we made it ourselves, fair and square." But that really does seem to be getting into Wikipedia talk:Plagiarism ground. --Moonriddengirl (talk) 12:27, 14 April 2009 (UTC)[reply]
I agree completely with Moonriddengirl here. I'm not quite sure why we are evading the ethical or the practical dimensions of this argument in order to try and pin down some elusive notion of authorship. The ethics of precisely attributing sources and clearly identifying where particular words and phrases come from is not an issue that only affects students and professors. We have a responsibility to our readers to be absolutely clear about where particular ideas and phrases are coming from. Without careful attribution and use of quotation marks, Wikipedia's collaborative editing model will obscure the source of information. Carcharoth has outlined those problems very well. Awadewit (talk) 23:30, 15 April 2009 (UTC)[reply]
(ec) In some sense, those copy-pasting a PD article from elsewhere are not editors or authors. They are just reusers, republishing PD material. You are republishing the PD text. But unlike republishing at wikisource, you are publishing into a jungle (Wikipedia) where the text can be changed at will. So you are initiating a public editing process. Now, some articles may remain largely unchanged (as with the Pignut Hickory example), but some articles change markedly from their PD-text origins. What do you think should happen in such cases? Carcharoth (talk) 02:16, 14 April 2009 (UTC)[reply]
Well there are two different issues really. What problem can happen to the text? (None.) Are future editors in trouble? (Only if they are dishonest.) The text has no problems in the jungle. It is PD and hosted in the US. PD is open to merciless editing. Nothing problematic can happen to this text in even the most merciless jungles full of mother-eating cannibals. Unless maybe you personally edit it from France after it has figuratively devoured its original author and that author has heirs in France, but even then the text has no problems. And contributors who need to worry about moral rights should know who they are. Away from copyright and towards plagiarism. Staying free of plagiarism depends mainly on using clear edit summaries and not claiming you wrote more than you did. Most discussion of plagiarism rests on the following assumption: ". . . and then you present the work as your own in an academic environment," or sometimes ". . . and then you submit to your editor for publication with your byline." Outside of those situations it rarely comes up. But let's imagine. If someone adds a new section to Pignut Hickory on Native American Medicinal Uses, there isn't really a problem unless they hand in the full article as schoolwork or at the very least list Pignut Hickory instead of Pignut Hickory#Native American Medicinal Uses on their userpage under "Articles I Wrote". By default anyone can see what is done in history. If it is a smaller piece of PD text you paraphrase out the intro and maybe the remaining PD text could cover a few sections with a template "The preceding section was incorporated from Foo" while additional sections are originally written. If it is less PD text than would make a complete section, paraphrase and quote it all out as the Signpost article suggests. But if the text is a high quality comprehensive treatment like my example, it is not really going to be superseded.
Frankly, once the text is here it is difficult to overstate one's authorship credit, unless maybe someone removes the attribution to the original source. The diffs reveal all. The only reason there is any question over this situation at all is because there is no really great method of assigning an authorship separate of all contributions. If you followed my earlier link, you know what I think should be done there. But overall plagiarism is about people and their reputations for integrity and it is completely irrelevant to texts. If you are dishonest about what you wrote it will follow you, it won't follow the text. The question is do we care more about building up the hubris of Wikipedia as if it were some sort of academic institution and being able to claim "Wikipedia wrote all this", or rather do we care more about making the most comprehensive and highest quality information free for the world to use and share? That is a completely loaded question, but go read the Wikipedian authored articles on conifers and tell me which method really moves us closer to the goal this project was founded for. How long before all the sections on Genetics are written? --BirgitteSB 03:38, 14 April 2009 (UTC)[reply]

Above thread has been unindented - bit below is in reply to OP at start of section.

Really, a discussion of the example Birgitte gives above would be better at Wikipedia talk:Plagiarism, but my view is that you have to consider what a reader of the article will think. What they see is a nice article on Pignut Hickory, with an impressive list of references and interesting numbers and tables of data. The reader might even be forgiven for thinking "the Wikipedians that edited this article have done a great job tracking down those sources and compiling all the disparate information into an article". But then they might notice the bit at the bottom: "This article incorporates public domain material from the United States Government document (Silvics of North America; volume 2: Hardwoods, United States Department of Agriculture Forest Service Agriculture Handbook 654, 1990)". Hmm. So how much of the credit for the article (and plagiarism disputes are fundamentally disputes about creative credit) is due to Wikipedians, and how much to this public domain source? Looking further, we find that the changes to the articles are fairly minimal. Much of it is still the same article that was imported. Granted that the amount of change of any Wikipedia article that changes this way is hard to quantify, but in this case, it is clear that the credit for assembling the contents of the article should go to Glendon W. Smalley of the US Department of Agriculture. The question then becomes whether Wikipedia giving his name in the edit summary is enough, and whether the mention of the PD source at the bottom of the article is enough? That tag being at the top of the article might make all the difference. My view is that Wikipedia has done no more than republish and dress up the PD content, and until the Wikipedia editing process has sufficiently changed the content, the source attribution should be far more prominent than it is.
The difference here is between republishing PD material and then re-editing it (what is taking place here), and the process of summarising and paraphrasing a source and placing the paraphrased and summarised material in a Wikipedia article (the process for aggregating information from non-PD sources). And consider what happens when two PD sources are merged in an article like Pignut Hickory? You then need to tell the reader which bits came from which source. In a similar way, when only one PD source has been used, you need to say which bits of the wording came from the PD source, and which bits of the wording can be attributed to the Wikipedia process. Nothing to do with copyright (it's all free) and everything to do with creative or editorial credit (who said what). Wikipedians are used to the "If you don't want your writing to be edited mercilessly or redistributed for profit by others, do not submit it" disclaimer, but in my view when you hit "save" on the text of works written by others (even if PD) you are submitting that text for merciless editing without the permission of the original writer. You should be quoting it instead - this preserves and attributes the original wording, and insulates it against change. For examples of people quoting PD material, instead of submitting it for merciless editing, see various articles using quotes from NASA documents. And if you think this was long, start here. Carcharoth (talk) 23:05, 13 April 2009 (UTC)[reply]
Why would you ever assume that the article is not entirely authored from the credited "Silvics Manual" or that outside of formatting and updating data the Silvics Manual could ever be superseded by random people on the internet? Wikipedia is amazing but in the few corners where neutral, encyclopedic, free information written by field experts already exists we are not going to top it. We are simply going to use it. For anyone to seriously suggest rewriting Pignut Hickory in paraphrase and then rearranging the structure of the article merely so they may remove the template crediting the true source is ridiculous. The credit on that article is clearer than any NASA picture you will find in Saturn. You don't even have to click on a link. I see no problem with giving even stronger credit in these sorts of articles, but those templates have been used forever here and edit summary credit is the most strongly embedded credit possible. You cannot claim either of those has been used as a dismissive sort of credit because you now wish to suggest something new. I think it is unrealistic to suggest a reader of Wikipedia would see Pignut Hickory and think "WP has done a great job." While some newbies might, any regular Wikipedia reader would much more likely think "This wasn't written through Wikipedia".--BirgitteSB 00:08, 14 April 2009 (UTC)
P.S. If I could credit authorship how I truly prefer, I outlined a method here. But we are confined by the software.
tl;dr version - "I think the basic problem is that the way Wikipedia is written, it is exceedingly difficult to tell who wrote what - which bits were Wikipedia editors, and which bits are external bits imported in. And I think that distinction is important to maintain for editorial integrity, if nothing else." (and be easy for readers to tell who wrote what, without digging around in page histories). Carcharoth (talk) 23:05, 13 April 2009 (UTC) or to put it another way, resurrecting a quote by Kablammo from the archive: "PD text is often used as the base for the article, but is modified before publication, so the article which appears is an olio of both the PD source and the editor's work, confusing the attribution. I understand that subsequent changes are reflected in the edit history"[reply]
Oh, my. :) (re:length of linked discussion.) I do think this is an issue that will need to be addressed in order to promote the proposed plagiarism guideline. I would love to see the guideline adopted, whatever route consensus goes on verbatim duplication of public domain text without quotation marks. We need some kind of official stance. --Moonriddengirl (talk) 23:22, 13 April 2009 (UTC)[reply]
I suggest: (a) rewriting and improving that proposed guideline; (b) archiving the talk page (what I pointed to was in fact the first archive); (c) summarising the changes on the talk page; (d) advertising the proposal widely (see here); (e) when things have settled down, make the point that some official guidance is needed and see if there is consensus to make it a guideline. The big sticking point will be whether the guideline reflects current practice, and whether current practice should change to reflect the guideline, or whether the guideline should be written to reflect current practice. The latter is the normal process, but assumes that current practice is correct. It might be that a culture change is needed first before the guideline will be accepted. Carcharoth (talk) 23:29, 13 April 2009 (UTC)[reply]

I think the discussion around this topic represents a conflict of values between what Wikipedia has been and what it can be. In order to gain some legitimacy, Wikipedians took public domain text and applied it to articles as filler material. But I am among the growing number of editors who see this project as being able to be much more than it is. Our standards continue to rise with time, and this is one of the issues that should be improved. Any material that was PD text should be tagged and an effort should be made to transform these passages and articles into original writing. Wikipedia should not borrow full passages or articles from other encyclopedias. It can not only rival others in quality, it should be able to surpass them. --Moni3 (talk) 12:21, 14 April 2009 (UTC)[reply]

I think your speculation into the motives behind importation is inaccurate. While I would agree that there is no longer any reason to use out-dated PD material as a starter, not everything in the Public Domain is out-of-date nor of a quality we can easily match. I am not sure what articles you see here, but the "rising standards" here are nothing much to hold up against any other resource. Comprehensiveness and accessibility we rock at, but quality standards? We are good at what we do, but we are not going to make our mark in surpassing all the field expert resources in their own field. I would be happy to support a guideline that restricts imports to a certain level of quality. I do not think everything is useful here. However I think you are very out of touch to suggest people put their effort behind paraphrasing and re-arranging articles merely for the sake of not giving authorship credit to a free content author. Of course, I imagine when you suggest this you mean people other than yourself. Next you will want us to begin re-drawing all the PD maps and re-populating the PD charts. May we at least still use PD images from NASA spacecraft or must we somehow find a way to surpass them in quality images of other planets?--BirgitteSB 16:10, 14 April 2009 (UTC)
Well, I do what I can. Once Everglades was mostly PD from the 1911 Encyclopaedia Britannica. What a horror. It is no longer so, and I made it a priority to remove that 1911 tag. (One of its satellite articles appeared on the main page, which you commented on.) Please don't misunderstand me. I'm not implying that articles here are necessarily better than academic articles, books, and other peer-reviewed sources that are written by experts, but I do feel there are quite a few articles that are better than what corresponds in other encyclopedias, specifically ones that offer their text as PD. Perhaps I am out of touch, but I would venture to say that whoever suggested using inline citations and citing page numbers throughout FAs was also out of touch, but yet we're doing it right now; it's standard procedure. How many articles use PD text? Has that been determined? I am mindful of not being limited by the mindset that PD is all we have for now, so let's leave it as is. Our views on this can change. PD text was a measure to fill in articles where none existed. So now editors, much like in the Article Rescue Squadron, can take PD articles one by one and work to make them original. We can be better. Better than other encyclopedias, and better than what we were. --Moni3 (talk) 16:47, 14 April 2009 (UTC)[reply]
I am afraid I made an error in assuming you followed this whole thread. You are thinking of imports like the 1911 text at Everglades while I am thinking of imports like the Silvics Manual's Pignut Hickory. I just don't see the net benefit in making the latter article "original", even though I can agree with that sentiment in regards to the former. --BirgitteSB 17:53, 14 April 2009 (UTC)

Well Done

Well done essay, very relevant too as it is a problem we have, given the open nature of Wikipedia.  Marlith (Talk)  22:39, 13 April 2009 (UTC)

As one of the contributors, thank you. :) --Moonriddengirl (talk) 23:22, 13 April 2009 (UTC)[reply]
I second that. Very well written. Bawolff (talk) 06:47, 16 April 2009 (UTC)

Podcast?

Perhaps we should have a podcast on this issue! I never thought plagiarism was so fascinating and hotly-debated. I love it. Awadewit (talk) 23:49, 13 April 2009 (UTC)[reply]

Well, it wasn't hotly debated until the last year or so... Carcharoth (talk) 01:37, 14 April 2009 (UTC)[reply]

Let's get serious about writing an encyclopedia

This dispatch misses the point entirely. Most plagiarised material is not encyclopedic, simply because the plagiariser has not made the effort to incorporate it into the existing encyclopedic corpus. If s/he had made that effort it wouldn't be plagiarism, evidently. Yet the dispatch seems to believe that it is the "plagiarism" which is the problem, not the generation of an encyclopedia; and that if things are properly sourced, the problem will go away! The authors have, at the very least, missed an opportunity; at worst, they have started to create yet another barrier against the creation and improvement of articles by those who are not already in the system.

Nobody would buy a CD which only contained featured articles: if they would, such a CD would have been published. This encyclopedia has grown to what it is, one of the Top-Ten websites, by the efforts of many people and through a lot of what would be called "plagiarism" if it were put in a Freshman essay without citation, and through many articles which are not yet perfect. While it is valid to wish to improve the citations on Wikipedia, and the paraphrasing, this is not the end in itself. To place the quality control of Wikipedia entirely in the hands of a small number of people, whether they be academics or winos or both or neither, is pernicious. Physchim62 (talk) 01:36, 14 April 2009 (UTC)[reply]

Respectfully, I think that plagiarism is a serious concern for Wikipedia, and I don't believe that the intention is to place quality control in any particular group's hands—hence the note that plagiarism is something that can be addressed by everyone. Bringing Wikipedia to a higher academic standard seems to me to be among our goals, and educating those whose freshmen essays would not pass about how to best contribute to an scholarly work seems like an equally good goal. Although I have seen a few plagiarists (and more copyright infringers) blocked from editing, it is my personal experience that most people can be taught how to avoid these problems. I don't think that the means become the ends quite yet; monitoring plagiarism is another way of maintaining quality, like verifying that sources are reliable and information is unbiased. One of the first edits I remember making to Wikipedia, as an IP in days of yore, was noting that a song I liked was the perfect example of something. It was properly reverted as WP:OR, and I got a friendly (if canned) note explaining why. I learned not to place OR. But I didn't stop contributing. --Moonriddengirl (talk) 02:09, 14 April 2009 (UTC)[reply]
I think BirgitteSB put it very well above (must be a chilly day in Purgatory if Birgitte and I agree on something!):
A Wikipedia article is not an essay assignment, nor a piece of academic scholarship. If you have the misfortune to come across articles that have been written as academic assignments or as pieces of original research, you will know exactly what I mean. This essay misses the point entirely: it says nothing about improving articles collaboratively. Instead it takes as its axiom that "plagiarism" is a problem. As such, it enters into the pernicious trend of individualism which is gripping some of the centralized processes of Wikipedia. Plagiarism is only a problem if we are handing out brownie points to individual editors, or participating in such sickening hagiographies as this one. However, if we take as an axiom that Wikipedia is a collaborative effort to organize and summarize human knowledge in a way that anyone can use, the problem goes away. Plagiarisms which breach copyright must be removed, as they are not free content. Other "plagiarisms" will be removed in the normal editing process, as they are usually bad text for an encyclopedia, especially one as eclectic as Wikipedia. Physchim62 (talk) 13:39, 14 April 2009 (UTC)[reply]
I think there have been scandals enough regarding plagiarism in professional publications that you and I may just fundamentally disagree on this. While Wikipedia's articles may be collaborative artifacts, the public that receives them does so in a sense as if Wikipedia were itself a fictive entity, an imaginary writer. Hence, the media splash (one of the earlier examples of) related to Brandt's accusations of plagiarism on the project. Now, most of what he was talking about was copyright infringement, and Wikipedia's editors who responded addressed it in that way. But the catchword of the day was "plagiarism"--on Fox, the San Jose Mercury news, a later reference on CBC news, and various websites. Hence, I think a foundational stance that "plagiarism is a problem" is warranted. (Unless I'm misunderstanding you, and you're not arguing the philosophy but the prevalence.) The view it creates beyond our borders is not that "Editor X" plagiarizes, but that "Wikipedia plagiarizes." I think given the nature of what I do on Wikipedia, those (who've noticed what I do :)) won't be surprised to hear that I agree that copyright concerns are imperative. But whatever definition of plagiarism Wikipedia may choose to embrace, I believe strongly that we need to officially embrace some definition to which we can point in the face of such criticism. --Moonriddengirl (talk) 13:56, 14 April 2009 (UTC)[reply]

Am I a plagiarist?

I got called out at Talk:Hulk (comics) for some text I wrote. Here's what was on the talk page.

  • The insecurity and anxieties in Marvel's early 1960s comic books such as The Amazing Spider-Man, The Incredible Hulk, and X-Men ushered in a new type of superhero, very different from the certain and all-powerful superheroes before them, and changed the public's perception of them. (followed by a cite journal, you can see it on the Hulk page if you want)
  1. This may be considered plagiarized. The original reads "In that respect, the self-consciousness and anxieties of such early 60s Marvel characters as Spider-Man, The Hulk, and the X-men can be seen as having ushered in an entirely new breed of superheroes that were quite unlike the virtually infallible and all-powerful superheroes that had come before, a breed of superhero that served to completely shift the superhero paradigm and the manner in which the public perceived and incorporated them."[1]

While I see a lot of similarities between my text and the "Problems in paraphrasing" section, the "Good adaptation practice" has similarities as well. It seems like the good version starts by naming the source, which I don't think has an effect different from an inline citation, and then does stuff that is (maybe) borderline OR, while not being that different. It changes " instantaneous, unblinking, cheap, and, maybe most importantly, easy" to "standard workplace technology", for example. I think there's some push/pull between OR and paraphrasing, I guess. A lot of times we're just paraphrasing a sentence or two. It's a lot easier to move away from plagiarism when summarising a large bunch of text, obviously. The good example kinda changes "X does Y" to "Y was done by X", which is supposed to not matter. In a scholarly situation, your own opinion is valid, but then not when paraphrasing, so maybe more OR is acceptable? I don't know. Anyways, I'd like to hear the opinions of the authors of this article on how I can improve. Thanks. - Peregrine Fisher (talk) (contribs) 04:16, 14 April 2009 (UTC)[reply]

I'm answering this at the user's talk, User talk:Peregrine Fisher, which see if you have interest in disagreeing or offering different opinion. While this is certainly a good place to seek a second opinion, I just thought hashing out specific examples might be off-topic for this page, which is so far very meta. --Moonriddengirl (talk) 12:56, 14 April 2009 (UTC)[reply]
Keeping it meta, I'm responding here. I think Peregrine Fisher's point:
"I think there's some push/pull between OR and paraphrasing, I guess."
is an important issue. You have to paraphrase the source without interpreting it. Clearly a hard thing to do. Mark Hurd (talk) 02:11, 15 April 2009 (UTC)[reply]

The edit history as attribution

Fascinating essay, thanks folks. It has got me thinking about how we attribute authorship on WP using the edit history. If I create an article, there's nothing on the article to credit me. You can find out from the edit history. When someone rewrites the article, but keeps a few of my sentences, they don't put them in quotes. But the diffs on the history can show what has been kept and what was discarded. How is the situation different from Birgitte's upload of Glendon W. Smalley's text compared to say if Smalley had a WP account and wrote that directly into WP? I've read somewhere the edit history is vital for GFDL. Perhaps this modern technology supersedes quotation marks in some way. Could it be that standard references on plagiarism aren't yet adapted to the wiki format of publication and continual change by a mix of authors. Colin°Talk 17:49, 14 April 2009 (UTC)[reply]

That certainly could be. :) Maybe part of the problem lies in the point made by User:Carcharoth above: "Wikipedians are used to the "If you don't want your writing to be edited mercilessly or redistributed for profit by others, do not submit it" disclaimer, but in my view when you hit "save" on the text of works written by others (even if PD) you are submitting that text for merciless editing without the permission of the original writer." In my opinion, the "merciless editing" aspect of Wikipedia can create another serious problem with wholesale incorporation of PD text in that the PD template itself may become misleading. There isn't a stamp on an article saying, "This article incorporates text licensed under GFDL by User:Moonriddengirl." If there were, I might have some pretty serious ownership issues if people altered the text, say, to imply that I held views I personally find repugnant. At the very least, I would want more notices added to indicate that the text may not represent me. I think that the PD disclaimers serve an important function in preserving Wikipedia's reputation if we don't embrace a "quotation marks around quoted text" philosophy, since the transparency they provide can at least address concerns of intentional plagiarism. But if any guideline we wind up adopting as a community determines that quotation marks are not necessary, perhaps the PD templates should be altered to note that the material may have been substantially changed so that we avoid inadvertently misrepresenting the source's ideas. --Moonriddengirl (talk) 18:23, 14 April 2009 (UTC)
I think it would be very useful if the PD template had permalink to original dump for easy reference.--BirgitteSB 18:31, 14 April 2009 (UTC)[reply]
There are two issues: the need to credit the original writer even if PD (which can be done, albeit awkwardly, by reviewing the article history and reading the edit summary) and whether we need "permission" to play around with ("edit mercilessly") PD text. Think of a composer reworking an earlier piece, or art such as L.H.O.O.Q.. Do we need permission if the person has no rights (or is dead)? Colin°Talk 18:53, 14 April 2009 (UTC)[reply]
We don't need permission for modifying Public Domain works (though just because the author is dead does not necessarily mean the work is PD). This is the whole basis of the free content movement: to solidify a framework by which people can modify works without asking the permission of the original author. PD is within that framework; so are GFDL, CC-by-SA, Free Art, and maybe some others I am forgetting. That is what we are doing here, whether everyone is aware of it or not: we are building the encyclopedia for that movement.--BirgitteSB 19:11, 14 April 2009 (UTC)
Re: the first issue which I missed. There is honestly not even a "need to credit the original writer" for PD. But we should do it as best we can, because it is the right thing to do. And the free content licenses I mentioned, all firmly support the "rightness" of attributing authors by requiring this for works they are attached to. But PD, you may technically and legally do whatever you like with it. Especially in the US.--BirgitteSB 19:16, 14 April 2009 (UTC)[reply]
(edit conflict) I believe that whatever policy/guideline we may arrive at, we must be scrupulous to acknowledge that we have utilized text from another source to avoid accusations of plagiarism (such as I described above). I do not believe the edit history is sufficient here, as our common readership (as opposed to contributor pool) is not to be expected to know how to access it. However, it's not unlikely that if they read an article on Wikipedia that they have read in a PD source, they will form a pretty bad impression of our scholarly integrity. For that reason, I believe that, whether we embrace quotation marks or not, attribution must be visible on the page. The "fictive writer" that is the mass of Wikipedia's collaborators didn't author the text, and we can't seem to pretend to. Given this, the question of "permission" is in my opinion really one of ethics—that we do not misrepresent our sources. If consensus should be to dispense with quotation marks, I would far prefer Birgitte's proposal of permalinking the original dump to incorporating the language of others without clearly acknowledging them. But, again, this is probably more territory for Wikipedia talk:Plagiarism, since any guideline that may be embraced by the community is likely to be embraced there. --Moonriddengirl (talk) 19:20, 14 April 2009 (UTC)[reply]
I second the notion that we must not misrepresent our sources. Even though I personally agree with you that current models of crediting authorship through edit history are insufficient (for all authors, not just imported ones), that position is not the conventional wisdom. And without the desire of developers to make significant changes, I don't see how we avoid the fact that the attribution here is primarily available through the edit history. I fully support the templates as secondary reinforcement of that attribution in these cases. I personally like the templates, but there are solid arguments against them. So while I support the templates, the credit must be in the edit summary as well. The templates may fall out of favor in the future; edit summaries are about as indelible as we get around here.--BirgitteSB 19:40, 14 April 2009 (UTC)
Thanks for your responses. I'll leave the "can we edit mercilessly PD text" debate for others, but if we can then I think explicit templates at the bottom will ultimately become pointless as the text deviates and as the format of WP articles slowly shifts away from paper-page layout. As Birgitte says, edit summaries and history logs are all the evidence we were ever here. Colin°Talk 21:37, 14 April 2009 (UTC)[reply]
Moonriddengirl is it your position that we cannot "edit mercilessly PD text"? Or that we can do so, but should not? Or that we can do so, but should do so only within the confines of what is allowed by the French interpretation of moral rights? Or none of the above ;)--BirgitteSB 21:44, 14 April 2009 (UTC)[reply]
Option E: edit mercilessly, but make clear that the text may have been edited mercilessly. :) I think a tweak to the PD templates to indicate that the text may have been altered would be a good idea. --Moonriddengirl (talk) 10:42, 15 April 2009 (UTC)[reply]

Fact ownership

"In terms of plagiarism, if not copyright, the author also "owns" the facts or his or her interpretation of them, unless these are, as mentioned above, common knowledge." Is a bit misleading. The seminal case on facts (which is cited earlier) clearly states: "The first is that facts are not copyrightable". Their selection and arrangement or a specific expression of them yes, but not the facts. Aboutmovies (talk) 21:15, 14 April 2009 (UTC)[reply]

This is one of the differences between plagiarism and copyright infringement. In copyright, facts cannot be owned. In plagiarism, they can be. (For quick reference, see footnote 4.) --Moonriddengirl (talk) 21:22, 14 April 2009 (UTC)[reply]
Yes, I know they cannot be owned, we covered that the first week of Copyrights, and I quoted as much above. The problem is that by including in the article "if not copyright" it misleads the reader to think that facts may also be copyrighted. Which is the problem. And if you want to write about copyright, might want to use Nimmer, as even SCOTUS does. Aboutmovies (talk) 22:16, 14 April 2009 (UTC)[reply]
"In terms of plagiarism, if not copyright" means "In terms of plagiarism, even though not copyright." It's definition #4. It's a fairly standard construction. --Moonriddengirl (talk) 22:53, 14 April 2009 (UTC)[reply]
That's not how I read it or others read it. I read it as "In terms of plagiarism, and maybe copyright". And the problem with your argument is your definition is for IF alone, and not for "if not", so if we go with yours, then the "not" would modify it, and thus reverse it, as is what is covered in #5 of your example where it actually discusses the use of the two together, thus the sentence then reads: "In terms of plagiarism [and perhaps not even] copyright" which, again, means perhaps, or possible, or maybe, and the other synonyms, and not never as is the case with facts and copyright. Aboutmovies (talk) 06:36, 15 April 2009 (UTC) amb[reply]
You read "if not" to mean "and maybe"? Hmm. Well, if others might misread it similarly, then I'll change the words. But if you are suggesting with your examples that "He was smart if not exactly brilliant" is meant to be read, "He was smart and maybe exactly brilliant", I wonder if you should reconsider. The weaker/stronger pair connection suggests the one is true, if not the latter. --Moonriddengirl (talk) 11:03, 15 April 2009 (UTC)
Thank you for the change, that fixes the problem. But yes, in the example I cite, as you phrase it, the "if not" phrase shows that there might be a connection between the before and after (I can't comment completely on your weaker/stronger pair connection part since you decided to include the "if not" phrase in it, and I don't know how you are using it in that context). But, yes it does suggest one is true, but also that the other may be true. If it were meant as only one is true, we would use "or". In the brilliant/smart example, it is saying we know the person is smart, but they might also be brilliant (or, as some define the "if not" phrase, perhaps). Or look at it this way:
In the following are the actual phrase and one with a word removed:
  • "In terms of plagiarism, not copyright"
  • "In terms of plagiarism, if not copyright"
What you are suggesting is that both mean the same thing. Which means, what does the "if" do in the second sentence, the one that was in this dispatch? To me and the sources I've provided it changes the meaning from a distinct no, to a possibility, however remote. Aboutmovies (talk) 21:37, 15 April 2009 (UTC)[reply]
In the "and maybe" interpretation of the brilliant example, "exactly" doesn't seem to really work. "He was smart, though not exactly brilliant" functions; "He was smart and maybe exactly brilliant" does not—not without some atypical usage, anyway. ("He was smart, if not even brilliant" would seem more likely to support an "and maybe" interpretation.) However, I did some reading on this and Garner notes that the interpretation of "if not" is ambiguous, meaning either "though not" or "maybe even", and may be prone to interpretation by the reader according to the usage with which he is more familiar. As a result, he recommends avoiding the phrase altogether. Perhaps there are regional differences in prevalence. "If not" to indicate "even though" comes very naturally to me; hence, I repeated it above without even noticing. Meaning "and maybe", while not completely foreign, is far less common in my experience and does not flow trippingly from my tongue. Which is just as well, since it seems I'd do best to ditch the construction altogether. --Moonriddengirl (talk) 22:46, 15 April 2009 (UTC)[reply]

Really, if it doesn't apply to copyright, why even mention copyright? You already qualified the sentence that you are discussing plagiarism. This is the reason why it is detrimental to attack non-infringing "plagiarism"! A simple fact cannot be copyrighted. I cannot copyright the fact that water boils at 212 °F (at sea level, at least). Who taught me that fact? I certainly didn't know it on the day I was born! Who am I "plagiarising" by sharing that fact with you today?

The authors of this dispatch have tried to invent a new kind of "ownership" of knowledge, which goes contrary to everything that Wikipedia stands for. It will no longer be sufficient to share knowledge; your article will also have to pass a new "Plagiarism Police" which will surely have to make you recall the names of the authors of every book you've ever read and every teacher you ever had since kindergarten if they were to do their job properly. Physchim62 (talk) 02:12, 15 April 2009 (UTC)

Primarily to reassert that it doesn't apply to copyright. If we have attempted to invent a new kind of "ownership" of knowledge, we have been very clever in previously planting the idea in reliably published sources, some of which are cited. Even more cleverly, we must have also planted them in sources we did not use, perhaps for just this very eventuality: [4], [5], & [6]. Bill suggests the idea predates us. Indeed, in my discipline, it's common knowledge. (As is most of what you will have learned from your kindergarten teacher.) --Moonriddengirl (talk) 10:54, 15 April 2009 (UTC)[reply]
Plagiarism certainly isn't a new concept, but there is a reason you are putting ownership in quotes. It is because plagiarism has no legal basis and cannot really use the same terminology as intellectual property without being misleading. The facts are not owned. But there is a social convention in place, where if one does not treat the facts as proprietary they risk being shunned by the society despite the fact that there are neither criminal nor civil charges which can be made against them.--BirgitteSB 13:10, 15 April 2009 (UTC)
p.s. I do not mean to imply that shunning is a less serious outcome than a summons to court. They are just different systems of society. The rule of law and honor system are not really comparable in absolute terms.--BirgitteSB 13:24, 15 April 2009 (UTC)[reply]
Putting it in the language of ownership and belonging may contribute to some confusion, to those who think more immediately of copyright, but I'm not really sure how else to convey the concept that social convention grants creators credit. I suspect that the language used in plagiarism probably informed that used in copyright rather than the other way around. I haven't really studied the history of plagiarism, though, so I could be wrong. But at its linguistic heart, plagiarism refers to theft—the taking of property that does not belong to one. It's based on the Latin "plagium", which was most commonly used for kidnapping. (Some at Guardian, here; more precise in Studies in Roman Law.) --Moonriddengirl (talk) 22:46, 15 April 2009 (UTC)[reply]
I think the way to convey this is to focus on the right of credit rather than the right of ownership. After all the originator cannot deny someone the use of facts as they may with property that they own. Nor are facts treated as property under the control of some owner in any other way. Ownership and theft can be used metaphorically, but are rather inaccurate when we wish to really explore the concept. Or at the very least, when not speaking metaphorically, it should be clear that what is owned or pilfered is the credit for the thing rather than the thing itself. I understand that other people do commonly use this language carelessly, but people also commonly say someone stole their boyfriend. It doesn't mean that situation is actually theft or a boyfriend can be owned.--BirgitteSB 13:53, 16 April 2009 (UTC)
The metaphorical ascription of ownership to information, creativity, culture, etc.—the idea that these are in some ways ownable resources, sometimes even marketable commodities—is, of course, pretty wide-spread; we not only see it in discussions related to copyright and plagiarism (where even in the definition you supplied ownership is implicit in the term "borrowed"), but in other areas such as cultural appropriation. I think using the common metaphor is probably safe, although I agree that clarity is important. I think clarity is provided not only in the use of quotation marks where ownership extends beyond copyright, but in the section "Understanding plagiarism", which says, "The problem with plagiarism is not that it involves the use of other people's ideas, but rather that other people's words or ideas are misrepresented—specifically that they are presented as though they were "an editor's own original work"." Also, right after the introduction of the word "own" (in this context; that is, not as part of "own original work" etc.), the dispatch says, "Revising to avoid plagiarism means completely restructuring a source in word choice and arrangement while giving due credit for the ideas and information taken from it." I don't think this dispatch could be misinterpreted in context as indicating that facts can't be used, but merely (with respect to facts) that unless facts are common knowledge credit must be supplied. (Which, of course, imposes no additional burden to Wikipedia's editors as it's already policy, albeit for different reasons. Even the term "common knowledge" hearkens to the notion of a common-pool resource.) (In yet another parenthetical aside, as a follow-up to my last note, I was curious enough to glance at the history of plagiarism, although I'm still struggling through today's batch at CP prior to doing some of that stuff I'm paid for doing in the real world. According to this, plagiarism has been ruining reputations since Greco-Roman times.) 
--Moonriddengirl (talk) 14:41, 16 April 2009 (UTC)[reply]
I was commenting not on the original dispatch so much as on the above conversation over the specific topic of ownership of facts. I didn't notice anything in the dispatch to complain about on this point. When I said there was a reason for the quote marks around own, I meant I found them an appropriate indication of metaphor.--BirgitteSB 14:51, 16 April 2009 (UTC)[reply]
Then almost all of what I said above was a pointless tangent. :) Still, people may be interested in the history of plagiarism. I'd like to get my hands on the whole book at some point. --Moonriddengirl (talk) 15:02, 16 April 2009 (UTC)[reply]

On plagiarism of public domain sources

This is a good article, but I also felt like it did not give enough emphasis to how we permit appropriation of text from free and public domain sources like EB1911. Attribution is always important, but if the text of a source is incorporated into the article and adapted, quotation marks are not required. I write a blog where I release all posted content into the public domain, and to be honest I want Wikipedia to steal my text. It's also not clear to me how one can "misrepresent a source's ideas as one's own" if the article is not even supposed to have any ideas of its own (these would be original research). On the other hand it is important for us to increase awareness of close paraphrasing of non-free sources, which have been slipping under the radar and poisoning huge collections of articles. Dcoetzee 22:39, 14 April 2009 (UTC)[reply]

Well, we know that many contributors do, which is why we have templates like {{or}}. But if the "one's own" language is from the opening sentence, that might best be taken up, too, at Wikipedia talk:Plagiarism, since it's quoted from that guideline. Still, while definitions of plagiarism may be adopted in Wikipedia's guidelines according to its own needs (and the question of quotation marks is obviously a hotly debated one above :)), I think we do need to acknowledge general definitions as well. Even if Wikipedians would not presume that the ideas of a source are "one's own" (though our general readers might, lacking familiarity with WP:OR), they would not know whose they are, which is misrepresentation of the source.
Given that several contributors have disputed the need to use quotation marks, I have inserted a footnote noting the dispute and directing to the talk. I'm not sure the protocol of such a note, so I trust that it'll be reverted if it violates typical practice. --Moonriddengirl (talk) 23:29, 14 April 2009 (UTC)[reply]
To make an analogy, per WP:OWN individual editors are not credited in articles; yet each individual editor is an author who, as much as any previously published source, is entitled to attribution by content reusers. The reason we don't credit in articles is that to do so would be utterly impractical; there is no effective way to visualize this information while keeping articles readable. In light of this, I suggest the following approach:
  • Original, unedited dumps of content from the source should be preceded with a notice indicating the source (e.g. "The following material is taken from X.") This provides fair warning to the reader, who may otherwise be perplexed at the antiquated writing style or POV. All too often I've seen talk page comments directed at authors of the 1911 Encyclopedia Britannica.
  • After preliminary editing, the notice may be moved to the end of the section or the end of the article, where it will say something to the effect of "This article/section is based originally on content from X." Alternatively we can have a single notice at the end saying "The X, Y, and Z sections of this article are originally based on content from X."
It is unnecessary and undesirable to track at the level of individual words and phrases what content is borrowed, once we've decided to absorb it into the editing process; if this implies plagiarism, then it is a necessary evil in order to enable the wiki process to function and our articles to be readable.
Unlike some other editors, I support only consolidating, not removing these notices, as it's just as important to indefinitely cite our content sources as our sources for facts. Even if not a ghost of their verbiage remains, they still influenced the article construction. Dcoetzee 00:23, 15 April 2009 (UTC)[reply]
I agree absolutely with your last point. And it's not just attribution that requires tracking the fate of public domain material that has been dumped into the editing mix, but verifiability as well. Someone might question a particular sentence, and do you then dig back through the page history trying to find whether that sentence came from the PD text or not? If you have a link to the original PD text used, then comparing is fine (and incorrect changes made can be reverted). But that is not always the case. It only works if you have a link to the diff where the full text was added, or a link to a stable version of the PD text (e.g. on wikisource or elsewhere). In other words, you gradually move from a whole article referenced just to the PD-text, to a composite article with new references, to a completely new article with references both to new sources, and to the original PD-text. If this route is followed (and it is a valid point to make that some currently published PD texts will be authoritative for a long time to come and need little changing), then the editing process should (in theory) produce the same result as if you had sat down with the new and old sources and written the article from scratch. Personally, I think that the latter approach (using the new and old sources and writing the article from scratch) is better, but the alternative process can work as well. My fear is that some articles get "stuck" in that process, and spend much of their history as awkward, half-edited chunks of PD-text with little bits of new stuff incorporated. Carcharoth (talk) 00:50, 15 April 2009 (UTC)[reply]
I agree that it's a great idea to include a permanent link to the original dumped material on Wikisource in the attribution notice (it should generally be uploaded there in case it vanishes from its original site). The contention of those who favour content-borrowing is essentially that cleaning up content is easier than writing original prose, allowing us to produce more useful content with the same effort, even if it's not featured-article level content. I can attest this with e.g. my recent work on Wenceslas Hollar. Dcoetzee 00:59, 15 April 2009 (UTC)[reply]

"Even if not a ghost of their verbiage remains, they still influenced the article construction." As above, if this were taken to its conclusion, we should all be citing our kindergarten teachers as essential parts of the process that led to the creation of Wikipedia articles. Yet again, I shall say that the dispatch is a bad essay because it has a false axiom. Please improve wikipedia articles (instead of making pointless metadiscussion) and please cite your sources. There we go, simple. Physchim62 (talk) 02:28, 15 April 2009 (UTC)[reply]

I'm talking about manifest things like structure, layout, and choice of facts presented, not some kind of subtle philosophical influence. We don't remove users from the history whose contributions have been rewritten. Content sources should be listed. Besides, that isn't even the crux of my argument - I'm advocating here for the inclusion and merciless editing of free content as has been practiced in the past. Dcoetzee 05:14, 15 April 2009 (UTC)[reply]

Public domain text, attribution, and best practices

It is time for Wikipedia to deprecate the inclusion of public domain text, unless quoted and specifically attributed. Here are some concerns:

  • Out-of-copyright and other public domain text often have a style and point of view at odds with Wikipedia standards and expectations.
  • When public domain text is edited and altered prior to first publication on Wikipedia, attribution is confused, as it is impossible to tell which portions are from the editor and which are from the source text.
  • The public domain templates which appear in varying locations near the foot of the article are not prominent and are often overlooked, and in any event one cannot easily tell which parts are from what source, as by convention or practice specific attribution to sources is not often used for PD text.
  • Even if the entire text is specifically attributed to the original author (not editor), and the template is interpreted as such, later changes can have the effect of devaluing that author’s work. We have all seen articles deteriorate over time; given human nature and the dynamic nature of a wiki there is no practical way to assure deterioration will not occur. So even if we (quite properly) credit a named author with the original unaltered text, it may be assumed that later factual errors and points of view are also the work of that author. The way to limit such confusion is to enclose such text within quotes, cited to the author.
  • The use of text other than the work of the Wikipedia editor may impede later recognition and promotion of the article should the editors decide to submit it to such processes.

The most important reason however is simple standards. Wikipedia need not, and should not, confuse the issues of authorship and attribution by the wholesale copying of public domain text without quotes and specific attribution. That is fairer to the original authors and clearer to the readers. There are other places where out-of-copyright sources can be reproduced.

Undoubtedly, the use of public domain text on Wikipedia has long been encouraged. Nothing I write here should be interpreted as a criticism of those who have followed this common practice (much less criticism of anyone posting above). But it is time to move beyond that, and to make sure that our collaborative efforts are in fact our efforts, and where public domain text is used, it should be specifically marked and cited to its source. Kablammo (talk) 15:13, 16 April 2009 (UTC)[reply]

Erm, no! Your comment actually illustrates very well why this dispatch is noxious and pernicious for Wikipedia. Physchim62 (talk) 17:05, 16 April 2009 (UTC)[reply]
To take your bullet points one by one:
  • Text which does not comply with Wikipedia "standards and expectations" should be edited or removed, regardless of whether it is public domain or GFDL.
  • Wikipedia is a transmission of knowledge, not a transmission of attribution.
  • The history pages of articles are not widely viewed either: public domain sources already get greater recognition of their authorship than Wikipedia editors, for no apparent or logical reason.
  • As above: attribution is good, but transferring knowledge is better. None of the editors who support a plagiarism guideline have yet cited the people who taught them how to read and write, yet this is surely a fundamental part of their Wikipedia activity. They are plagiarizing their parents and teachers!
  • Wikipedia is not about writing featured articles. I cannot say that too strongly. People who think the opposite should leave the project, IMHO, because they are steadily destroying it. If you are neutral on the issue, you might like to read the Five Pillars of Wikipedia to convince yourself.
Physchim62 (talk) 17:26, 16 April 2009 (UTC)[reply]
Surely you haven't missed all of the above about common knowledge? And, if you have, you must have read that in the dispatch? In most relevant excerpt, "Generally, if information is mentioned in many sources, especially general reference sources, and easily found, it is considered common knowledge." Language transmission, though certainly a great service that parents provide their children, does not require attribution. There's plenty of resources on the subject: [7], [8] --Moonriddengirl (talk) 18:04, 16 April 2009 (UTC)[reply]
The trouble with "common knowledge" is that it's a bit like "common sense" – not all that common, in fact. The example given in the dispatch (the posthumous publication of the poetry of Emily Dickinson) is frankly laughable. It's also "common knowledge" that ethanol forms an azeotrope with water, but I bet that none of the authors of this dispatch would let me put that into a Wikipedia article without citing a reference! The very fact that the authors decided to include a "common knowledge exception" speaks volumes for their desire to burden Wikipedia with yet another centralized structure in which articles will be judged against an arbitrary dogma by people who don't know a single damn thing about the subject matter in front of them – a bit like WP:FAC in fact.
I'm glad that there are resources on the subject of plagiarism: perhaps the authors of this dispatch could use them to improve our encyclopedic article on the subject. Allow me to quote from the current version:

Plagiarism is presumably not an issue when organizations issue collective unsigned works since they do not assign credit for originality to particular people. For example, the American Historical Association's "Statement on Standards of Professional Conduct" (2005) regarding textbooks and reference books states that there is no question about taking credit for someone else's ideas. Since textbooks and encyclopedias are summaries of other scholars' work, they are not bound by the same exacting standards of attribution as original research.

Physchim62 (talk) 20:44, 16 April 2009 (UTC)[reply]
I am utterly perplexed by the way you keep speaking about this as though it were all brand new. The authors of this dispatch (of which I am one) included a "common knowledge" exception because a "common knowledge" exception is part of the social construct of "plagiarism." We didn't make it up. Hence, "What to cite: the "common knowledge" exception" is a subsection of "Understanding plagiarism." Failure to include a "common knowledge" exception, whatever burden that may or may not put on Wikipedia's editors who wish to be respected in the scholarly community, would not be complete coverage of the subject. But, by all means, let's improve our article on Plagiarism. Allow me to quote directly from the American Historical Association's "Statement on Standards of Professional Conduct" (2005), rather than our version of it:[9]

Plagiarism can also include the limited borrowing, without sufficient attribution, of another person's distinctive and significant research findings or interpretations. Of course, historical knowledge is cumulative, and thus in some contexts-such as textbooks, encyclopedia articles, broad syntheses, and certain forms of public presentation-the form of attribution, and the permissible extent of dependence on prior scholarship, citation, and other forms of attribution will differ from what is expected in more limited monographs. As knowledge is disseminated to a wide public, it loses some of its personal reference. What belongs to whom becomes less distinct. But even in textbooks a historian should acknowledge the sources of recent or distinctive findings and interpretations, those not yet a part of the common understanding of the profession.

Someone seems to have misinterpreted that in our article. I'm off to fix it. --Moonriddengirl (talk) 20:58, 16 April 2009 (UTC)[reply]
  • "But it is time to move beyond that, and to make sure that our collaborative efforts are in fact our efforts." This statement is really at odds with the free content movement aspect of the project. We need to continue to ensure that our collaborative efforts are both compatible with and inclusive of free content. My effort is not in fact merely mine. Our efforts are not in fact merely ours. My effort and our efforts belong to all who are willing to also make their efforts free. On this website, at this moment, these efforts are labeled "Wikipedia". But these efforts are not solely nor merely Wikipedia.--BirgitteSB 18:08, 16 April 2009 (UTC)[reply]
  • I cannot disagree more strongly with Kablammo's statements. How many times have you been to see a film that was credited as "based on the book X by author Y"? For example, Lord of the Rings? The film felt like it was reasonable to credit the author of the book for their influence on the work, but they did not feel like it was necessary to clearly separate out the part based on the book from the part that they made up. And the words "based on" convey this perfectly. And that's with a licensed work by a living author - the case for public domain works is all the clearer. You may say that this is a reference work, not a film, but we encounter precisely the same practical difficulties in clearly separating one content source from another - and there's no reason a previously published source deserves some special status as compared to contributions from individuals. Additionally, I feel like he is exaggerating and misrepresenting the message of this dispatch, which is not that copying from public domain sources is "not okay" but that content sources need to be attributed carefully and close paraphrasing of non-free works is a copyright problem, statements that I agree with. Dcoetzee 23:02, 16 April 2009 (UTC)[reply]

Perhaps an example will help?

Let me give an example of why copying and pasting without quotation marks can be a problem. Below is the text of the first few of the Ten Commandments from Exodus 20 (King James translation, a multi-authored, PD text):

Original text (see here, for example):

"I am the LORD thy God, which have brought thee out of the land of Egypt, out of the house of bondage. Thou shalt have no other gods before me. Thou shalt not make unto thee any graven image, or any likeness of any thing that is in heaven above, or that is in the earth beneath, or that is in the water under the earth: Thou shalt not bow down thyself to them, nor serve them: for I the LORD thy God am a jealous God, visiting the iniquity of the fathers upon the children unto the third and fourth generation of them that hate me; And showing mercy unto thousands of them that love me, and keep my commandments. Thou shalt not take the name of the LORD thy God in vain; for the LORD will not hold him guiltless that taketh his name in vain. Remember the sabbath day, to keep it holy."

Copy and paste which has been mercilessly edited:

I am the Lord thy God, which have brought thee out of the land of Egypt, out of the house of bondage. Thou shalt have no other gods before me. Thou shalt not make unto thee any statues of any thing that is in heaven above or hell below. Thou shalt not bow down thyself to them, nor serve them: for I the Lord they God am a jealous God, visiting the iniquity of the fathers and mothers upon the children unto the third and fourth generation (but no more) of them that hate me; And shewing favor unto thousands of them that love me, and keep my commandments and worship me. Thou shalt not take the name of the Lord thy God in vain; for the Lord will not hold him guiltless that taketh his name in vain. Ignore the sabbath. (Integrates PD text from the King James Version of the Bible)

We can no longer tell what is from the PD translation and what is not. That is the problem that we are wrestling with here. How can we easily communicate to readers what has been altered and what has not? It is not an irrelevant problem, as the ideas and language of the writers of these PD texts deserve to be treated with the same intellectual respect as those of writers whose works are still under copyright. Awadewit (talk) 19:49, 16 April 2009 (UTC)[reply]

I think your example is completely irrelevant due to a poor choice of material. In the first passage, how do you see that we are treating the authors on par with authors of copyrighted material? Who are the authors, you do not even attribute them? The second passage would never be acceptable material for article text if it were not a quotation of significance. Maybe you would do better to try and restate your point with a more useful example.--BirgitteSB 19:57, 16 April 2009 (UTC)[reply]
The point is to take an extreme example to demonstrate the issues at stake (and many multiauthored, old PD texts have no authors - that is one of the interesting issues). Awadewit (talk) 20:01, 16 April 2009 (UTC)[reply]
But it is too extreme because it is not suitable neutral encyclopedic material which could be imported as an article or a significant portion of an article. Absolutely no one is suggesting that this sort of material should be used outside of quotations.--BirgitteSB 20:08, 16 April 2009 (UTC)[reply]
But many articles use the Bible as a source, so it is actually not all that crazy. Moreover, if you are unwilling to "mercilessly edit" the Bible but you are willing to mercilessly edit other PD sources, then we have a problem. That means PD texts are being divided up along arbitrary lines into "ok to edit" and "not ok to edit" - who is making that decision? What are the criteria for that decision? We need to highlight these issues. This example does that, as well as highlighting all of the problems Carcharoth outlined above. Awadewit (talk) 20:13, 16 April 2009 (UTC)[reply]
I never said it wasn't "OK to edit", I said it wasn't suitable for Wikipedia. It is some POV dictate, not neutral encyclopedic information. Everyone agrees we should not be using that passage as the basis of a Wikipedia article. So it does nothing to clarify the point of disagreement.--BirgitteSB 20:34, 16 April 2009 (UTC)[reply]
I'm unclear why you think Biblical passages should be quoted but other PD material need not be. Do you think only NPOV sources that are in the PD should not be quoted? Awadewit (talk) 20:42, 16 April 2009 (UTC)[reply]
I never said Biblical passages must be quoted, I said such passages are unsuitable as articles. As the disagreement is over PD text as articles, using text which is unsuitable for an article is a pointless example. I have no problem making copies of Biblical passages outside of quotation. See Such copying is not plagiarism, but it has little relevance to what might be done on Wikipedia.--BirgitteSB 23:05, 16 April 2009 (UTC)[reply]

I don't have a specific paper in mind now, but suppose someone reproduces verbatim on Wikipedia a soil science paper from the USDA and uses the template to describe it as public domain. Thereafter it is edited, and now the USDA's work, and perhaps even the scientist's name, are associated with bowdlerized or degraded text. It would be little comfort to the authors to state that the original and correct version is buried in the edit history. Kablammo (talk) 20:15, 16 April 2009 (UTC)[reply]

How would the degradation or bowdlerization of such a text be in line with Wikipedia's standards? My experience tells me changes to such a text would be very minimal. But it is not much of an argument to say that hypothetical changes which are unacceptable by Wikipedia standards would be unacceptable to the authors.--BirgitteSB 20:34, 16 April 2009 (UTC)[reply]
You may be right, if articles are continually watchlisted by people who are familiar with the original text (and the actual authors likely are not Wikipedia editors) and protect the article. But we know that is not the case. People lose interest and retire.
But a more salient question may be this: What is the objection to using quotations for verbatim quotes? What is the objection to specific citation of each excerpt? Kablammo (talk) 20:38, 16 April 2009 (UTC)[reply]
(ec response to BirgitteSB) 1) You are acting as if we live in some sort of ideal world in which every user follows Wikipedia's standards. Such an assumption is unwarranted and basing policy on it will only lead to disaster. 2) Good content could be added, still distorting the original meaning of the text. Awadewit (talk) 20:40, 16 April 2009 (UTC)[reply]
The objection is Pignut Hickory. Please paraphrase or verbatim quote out everything from the Silvic's Manual and then figure out how to change the structure of the article enough to remove all traces of the original authorship and let me know how many hours it takes you. And then explain how that procedure is a net benefit, because it seems too high a cost for too little a benefit to me.--BirgitteSB 20:49, 16 April 2009 (UTC)[reply]
So, plagiarizing is easier? That is your argument at the end of the day? Of course it is, but it is also unethical. Awadewit (talk) 20:55, 16 April 2009 (UTC)[reply]
That is not plagiarizing. It is all attributed and I am not claiming credit of authorship. Do I need to go over this again?--BirgitteSB 20:59, 16 April 2009 (UTC)[reply]
Attribution alone does not take care of the issue. There is no discussion of plagiarism in a RS that I've seen that says "attribution is sufficient". One must also take care to quote borrowed language. Awadewit (talk) 21:02, 16 April 2009 (UTC)[reply]
Short version: That is no different than this. Neither is plagiarism. Plagiarism is only within the context of claims of authorship. All the sources on plagiarism discuss it within that context. I am not claiming credit for the work, either by submitting it as my work in an academic environment or by publishing it under my name. Please read this page above.--BirgitteSB 21:10, 16 April 2009 (UTC)[reply]
We're going in circles. It is silly to think that we don't have to have ethical standards outside academia or because we have changed the nature of authorship on Wikipedia. It is silly to think, for example, that we only need to quote PD works in a paper, but not on Wikipedia. Authors deserve respect no matter where they are being quoted. Expedience is not a reason to ignore ethical considerations. Awadewit (talk) 21:19, 16 April 2009 (UTC)[reply]
You must understand that I hold myself to a high ethical standard and that plagiarism is a serious, and in this case, false charge which I cannot allow to be left unchallenged. If you tire of going around in circles then drop the accusation. You are incorrect, and I have shown evidence of why the edit is not plagiarism which you ignore. Drop it or keep going in circles, I have no choice but to answer your accusation. To your most recent point: there was nothing to place a quote in, as I was not writing anything.--BirgitteSB 21:27, 16 April 2009 (UTC)[reply]
You are writing something - you are writing an encyclopedia article. Awadewit (talk) 21:48, 16 April 2009 (UTC)[reply]
No I was not. Not one bit of that article is my writing.--BirgitteSB 22:52, 16 April 2009 (UTC)[reply]
And if you must continue to respond on this point. Please do so on my talkpage and spare everyone else.--BirgitteSB 22:54, 16 April 2009 (UTC)[reply]

This whole debate is silly. Awadewit and the other authors of this dispatch have forgotten that Wikipedia is meant to be an encyclopedia and not a MMPORPG. They have yet to produce one single realistic example of why a new guideline is needed. Their entire argument is one of the more pernicious examples of instruction creep I've seen since – I don't know, maybe the last time I visited WT:MOS ;) It is pernicious because any time wasted on chasing up non-existent "plagiarism" is time that won't be spent improving our encyclopedic content. Physchim62 (talk) 21:29, 16 April 2009 (UTC)[reply]

What's a MMPORPG? Awadewit (talk) 21:47, 16 April 2009 (UTC)[reply]
Massively Multi-Player Online Role-Playing Game. Physchim62 (talk) 21:50, 16 April 2009 (UTC)[reply]
I'm not sure why anyone would consider Wikipedia that kind of thing. Perhaps you could explain. I came here to write articles about topics that were poorly covered on Wikipedia and which I had done research on for my dissertation. By the way, it is a mistake to compare the plagiarism guideline to the MOS. Plagiarism is an ethical issue - the majority of the MOS is not. Awadewit (talk) 21:55, 16 April 2009 (UTC)[reply]
Something else that's not new to this dispatch: a proposed guideline on plagiarism. Why, it's been here under proposal and ongoing development since June of 2008. We did mention that in the dispatch. --Moonriddengirl (talk) 21:34, 16 April 2009 (UTC)[reply]

Externally-produced Free Content text should continue to be used to help reach our goal

I agree with Physchim62 and BirgitteSB that this essay takes a rather extreme view on plagiarism in reference to the reuse of free content (which, of course, includes PD text). The proposed guideline at Wikipedia:Plagiarism appears to present a pretty good compromise that allows the use of free content so long as it is properly attributed using inline cites and/or attribution templates (which is a lot more attribution than we grant other Wikipedians). Further, Wikipedia is part of the larger Free Culture movement, which encourages reuse and (often) modification of free content.

Given that, I fail to see how reusing free content that explicitly gives permission to modify and redistribute at will is at all wrong (PD text simply revokes all such rights or those rights have expired). As Physchim62 stated, Wikipedia is not an academic game of one-upmanship on who can author the most or best content. It is a work of many different people who provide free content with the goal of presenting the sum of human knowledge in an encyclopedic format. I don't see why we should limit the use of good free content just because it happened to be authored somewhere else. That would seem to be a hindrance to our goal. --mav (talk) 00:24, 17 April 2009 (UTC)[reply]

Their arguments - unconvincing to me - are that it makes a reference work less useful if you can't successfully distinguish which parts come from which source. Wikipedia itself contradicts this concept. The key that must be emphasized here is that the borrowing of ideas and the borrowing of content are separable and must be considered independently, even if we borrow both of these from the same source. We must make it clear, using citations, which ideas come from which sources. But, provided this is carried out, the borrowing of content from a source is an entirely practical matter to facilitate expeditious article writing, and attribution is only important insofar as it concerns tracking the set of authors of the article. Plagiarism is the use of another's ideas without due credit, not merely appropriating their text.
In fact, I would take this concept farther - if a content source such as the 1911 Encyclopedia Britannica does not add any original ideas of its own, but only cites those of others, it should be listed as a content source or author, but not as a reference (instead its references should be cited as references). To cite such an article as a reference is every bit as absurd as citing an individual Wikipedia editor as a reference. Dcoetzee 01:53, 17 April 2009 (UTC)[reply]
"free content that explicitly gives permission" is not quite the same as "those rights have expired". Sure, the reason the rights are forced to expire is to ensure free circulation of knowledge, but the difference between giving permission at the time of publication, and something falling into the public domain by age is the difference between free content licenses and public domain works. They aren't the same. And for anyone wanting to examine how a PD-text on Wikipedia can change over time, look at some of the articles that started out as republications of 1911 Britannica articles, and have been steadily modified over time. Anadyr River was an example I used. I tried there, to make clear which bits of the article are from EB 1911, and which bits were added. That example is complicated, though, because what the EB 1911 article covered, we cover in two articles, the second of which is Gulf of Anadyr. This kind of splitting of a PD text over several articles is not that uncommon on Wikipedia. All-in-all, it is often better just to write the article from scratch, using both old and new sources, PD and copyright and copyleft sources, and in general you nearly always end up with a better article because you have considered the sources as a whole, rather than copy-pasting one source and modifying it as you go along (compare with the discouraged process of copy-pasting a copyright text and editing it to summarise it - better to read the source and then try and summarise it from notes). Carcharoth (talk) 01:58, 17 April 2009 (UTC)[reply]
Let me create a thought experiment here. Say I build a time machine, and go back to 1911, and find the guy who wrote the article on the Anadyr River. I bring him back with me to 2004, where he creates an article on the Anadyr River on Wikipedia and types his writing into it. The resulting article is exactly the same, and the author is exactly the same; on what basis is this treated differently? Dcoetzee 02:04, 17 April 2009 (UTC)[reply]
It is treated differently because in your example, the author brought back from 1911 promptly renews his copyright and sues Wikipedia. :-) No, seriously, you are assuming the 1911 author wants to create the Wikipedia article. That is a presumption we are making on behalf of authors of PD texts. Legally, we can do that (use their text as a seed for our articles). Morally, I'm not so sure. Carcharoth (talk) 02:26, 17 April 2009 (UTC)[reply]
Carcharoth, you are confusing plagiarism with copyright. You cannot "explicitly give permission" to allow plagiarism. Plagiarism is never allowed no matter what the author will permit, and protection against plagiarism does not expire. You are talking about copyright with what you quoted. And as far as copyright goes, there is no difference between uses which are explicitly permitted and uses where the ownership right has expired.--BirgitteSB 02:15, 17 April 2009 (UTC)[reply]
You are right, I've mixed things up here. What I think I was thinking of was comparing the way Wikipedia editors explicitly sign up to a free content license that allows modification and reuse of their writing, and how those long-dead authors whose works have entered the public domain had not signed up to such a free content license. Carcharoth (talk) 02:26, 17 April 2009 (UTC)[reply]
The moral issue: The moral issue with plagiarism is not using another's work in ways they do not anticipate or approve of, but rather presenting another's work as your own work. Plagiarism is entirely about ensuring your writing is truly your work in both concept and structure. So imagine that this Anadyr River 1911 article is inside a Wikipedia article entirely encased in quotation marks. The author still did not agree to be used as part of a Wikipedia article, but he is here and in no view is that a moral concern with plagiarism. This covers reuse but what about modification? Same thing but in the French article in translation. Now his work has been modified and reused without his agreement and under no definition is it immoral plagiarism. The moral concern that you describe above has nothing to do with plagiarism; I imagine it is probably due to a strong belief in individualism. But it is really not the issue we are discussing here. --BirgitteSB 02:46, 17 April 2009 (UTC)[reply]
(ec) I treat externally-produced free content text pasted into Wikipedia similar to how I treat text written by other Wikipedians ; I try to improve it. However, sometimes, I can't really improve upon the externally-produced free content text. Same story for free content produced on Wikipedia. The big difference is that the externally-produced free content text is often from a reliable source, so it can be cited to itself. I also give attribution using edit summaries, attribution templates and adding things like "(public domain text)" to the end of inline cites. --mav (talk) 02:19, 17 April 2009 (UTC)[reply]
But that's part of the difference, isn't it? Wikipedians aren't doing original research and producing a reliable source, whereas those people who wrote the 1911 EB really did do their best to produce an accurate encyclopedia written in an engaging style at the time. Are we going to treat the 2009 EB differently from the 1911 EB because of which year it was published? That seems a bit arbitrary to me. Awadewit (talk) 02:28, 17 April 2009 (UTC)[reply]
I personally don't care much for the use of copyright expired text b/c fixing style and accuracy issues often is more work than drafting original prose using more reliable and recent sources. I also concede the point that editing of copyright expired text which was never intended to be edited by anybody but the original author and his/her editor possibly has some moral rights implications that transcend copyright. What I'm strongly against, however, is the implication that the use and editing of free content that was drafted as free content and is properly attributed/cited in Wikipedia articles is at all wrong so long as the modification of that text is allowed. The former may have some moral rights issues but the latter certainly does not. I think we should separate those issues in our discussions. --mav (talk) 22:08, 18 April 2009 (UTC)[reply]
  • Moral rights under the Berne Convention expire with the copyright term as I recall. I'd agree that we can extend further consideration of those rights here, but not to the extent that it prevents us from incorporating and reworking the original author text. To me that requires extending a theory of mind to the original author, where we ex post facto decide that they considered their text inviolable. Given the present-day choice, who's to say that they wouldn't equally say "go for it"?
  • I disagree (probably strongly) that "citing" reusable free content is sufficient. It must be attributed, which means that the edit history must very very clearly indicate that I COPIED THIS TEXT FROM A FREE SOURCE (allcaps intentional). Sufficiently reworded and incorporated with other sources, yes, a citation is sufficient. That's the line I see - citing means you either paraphrase it or quote it and don't change it ever; properly attributing means you can change it just so long as the original anchor is there showing clearly just who the original author was and verifying the original free source.
  • And agree on the inadequacy of some very old writing. But we do have to consider all forms, especially considering the possible new compatibility between GFDL and CC-BY licensing.
  • Oh yes, for further consideration - original authors of early photographs by some definitions would also be having their moral rights discounted due to reworking of their efforts into presentation (and featured) quality versions. How is this different from text? Franamax (talk) 01:16, 19 April 2009 (UTC)[reply]
As I noted at the echoing empty wasteland that is WT:Plagiarism, if I place verbatim unquoted text and explicitly say "I copied this from a public domain source", it is demonstrably not plagiarism. It may be bad form (the opinion of several - Awadewit?) and it may not allow the proper tracking of evolution of the creative input (the opinion of others - Carcharoth?) - but who cares? As long as the PD inclusion is carefully noted and attributed, its progress in editing is exactly described via that handy history tab. The original author is thoroughly dead, I have some sympathy for Carch's viewpoint on that author not having consented to reuse, but not much more sympathy than I have for a mosquito, certainly not the reverence I have for a praying mantis. Some discussion on that topic can be found here. We're in a new world here, where "rip-and-reuse" is acceptable when it leads to new creative/informative heights. All we care about is carefully documenting where the original input came from. I haven't yet seen a source discussing whether or not text (or images) released to the public domain are compatible with GFDL or CC-BY licensing. Isn't that an important consideration as to moral rights?
Commenting on several KB above (to Kablammo): for "public domain text", substitute "WP editors unfamiliar with our standards" and tell me how it reads differently - should we deprecate them too? And regarding "recognition and promotion" of articles, that's a spurious argument. Recall that recognition and promotions are means to an end, not the ends themselves. We want quality articles, that is all that matters. If the R&P system conflicts with that, then fix the system - for instance, make clear to editors that articles with substantial portions of unquoted PD text will not be promoted. Maybe scrap the entire DYK process. But please don't get in the way of creation of new and valuable articles if they're based on high-quality writing. If I openly say I copied it from a PD source, by all means don't give me a star or bangle, but don't tell me I was wrong to do it. I find nothing in policy to support that position. Franamax (talk) 03:20, 17 April 2009 (UTC)[reply]
By all means, use PD content. Paraphrase it and cite it, or quote it verbatim and cite it. Do that out of respect for its original authors. Is that so hard? Kablammo (talk) 03:24, 17 April 2009 (UTC)[reply]
I agree that high quality articles are the end goal of Wikipedia. I consider part of a high quality article to be the careful quotation AND attribution of sources, however. I think it is important to document who is using what kinds of language as accurately as we can. Awadewit (talk) 04:09, 17 April 2009 (UTC)[reply]
(e/c) Yes, your second method is hard, because it prevents development of that original text into a working article! You presuppose that all wiki editors have a natural facility with language and are able to easily integrate and reword numerous sources, or that an editor can take a single source and successfully paraphrase it - else you confine them to the straitjacket of placing PD text into inviolable quotes, to be preserved foreverafter, complete with the "niggers" and "flat Earth"'s.
I'm talking here about new article creation - do you seriously think that if a text on, say, insect species becomes PD-available that we should not expand our article repertoire to reflect it? You indicate a binary response: either the adding editor must adequately paraphrase and cite each article to the source (and for insect articles, there's not too many ways to say it, so they will probably get accused of plagiarism anyway); or they must quote the exact text and laboriously indicate around those quotes how we have now realized that the original family placement was taxonomically incorrect until someone is brave enough to completely reword every single statement in that well-written and otherwise perfectly accurate quoted text? I suggest that you are placing an undue burden on our editors, who may not be as experienced at writing as yourself. I further advocate the third way - carefully document the origin of the PD work, incorporate it verbatim and begin the process of merciless editing. Proper attribution to those others who created the original work, to me shows all due respect for the original authors. I wish though to show even greater respect - I think your work valuable enough that I want it to be the solid base for an article at the ultimate collection of human knowledge, and I'll make sure you get credit for it. Where's the diss in that statement? Franamax (talk) 04:25, 17 April 2009 (UTC)[reply]
Yes, writing is hard, but, again, I fail to see why that means we should abandon our ethical duty to clearly attribute and precisely quote authors. Considering all of the work that you have just said goes into writing, I would think that you would agree we should show respect for authors' words and clearly indicate when we have copied them. Writing can be elegant and inspiring - and the authors of such writing should get credit for that language. Writing can also reveal the biases of a time or place - and that should be recorded as well. Awadewit (talk) 05:00, 17 April 2009 (UTC)[reply]

(undent) Did anyone bring any chips? I'm a bit hungry. Ling.Nut.Public (talk) 06:27, 17 April 2009 (UTC)[reply]

Well OK. Let me see. First off, hello to everyone. Second, I often sound gruff/dismissive/insulting etc. Apologies in advance. Now: I don't care what Wikipedia's mission is. You really shouldn't either. Higher issues are certainly at stake, and arguments based on any particular reading of Wikipedia's mission are both self-important (defining "self" as "Wikipedia, collectively"; not as the individual(s) making such statements) and wholly irrelevant. Second, I think we're all doing too much talking and not enough reading. I think we all need to take a rain check on this entire thread.. and go out and buy or borrow relevant books. Here's one that looks relevant: Wirtén, Eva Hemmungs. (2008). Terms of Use: Negotiating the Jungle of the Intellectual Commons. University of Toronto Press. ISBN 0802093787. I'm sure we can find others. Speaking very frankly, if we don't do some in-depth research on this question, then we are all just barking empty opinions at each other. In that case, this thread is little more than physical exercise— and not particularly therapeutic exercise, at that. There. Done. Chips? I like sour cream 'n onion dip. Ling.Nut.Public (talk) 07:29, 17 April 2009 (UTC)[reply]
Let me go further (see above for my basic view). I suggest that y'all start a collaborative reading circle. Start a subpage in someone's user space (preferably neutral territory). Pick a reading list. Invite folks from relevant forums. Everyone goes out and buys or borrows the books. Talk amongst yourselves.. but this time, do so in an informed manner.. ;-) Agree that by no means will you come to any conclusions; you're only discussing the books. Take a couple months. Then come back later and argue, preferably in a forum more public than this one. Later! Ling.Nut.Public (talk) 07:45, 17 April 2009 (UTC)[reply]
Ling.Nut, I can do a pretty fair turn at gruff/dismissive/insulting myself. :) Screw books. Screw reliable commentary on what we're doing here. We are doing it. US - we, me, you, everyone else participating - we are the people the books will be written about. And books are just sooo 20th-century. All 1300 of them carefully shelved around my house, read and re-read nigh to destruction. I love them all, but we've entered a different age.
We have begun an era of "rapid-reuse" and Wikipedia itself is the living proof. I'll continue my personal ethos of strict detail and close sourcing (which explains why I find it so difficult to change so much as a sentence) - but nothing about Wikipedia itself engenders this latter-day carelessness. 'Tis the times and the fashion. People just can't be bothered. I'll likely end up on your side of the widening pond asking for proper attribution and correct sourcing - but I won't necessarily be quoting a book that doesn't even play a tune when I open it ( :) :( ).
Awadewit, you pretty much nailed exactly my thoughts. But it seems that we disagree on putting in those quote marks. My thoughts on the best way to do it are most recently here. Does any of that fail "our ethical duty to clearly attribute and precisely quote authors"? Because that is exactly what I want to achieve. However, I do wish the opportunity to further modify that text to achieve our encyclopedic goals. With ref to my strict attribution criteria laid out at WT:P, I don't see the possibility of confusion between the efforts of PD authors and wiki-authors. We just need to be very clear as to who did what.
Oh yes, speaking of fora, I chipped in my $20 last December, I'm pretty sure Brion has now set up an entire server just for Wikipedia talk:Plagiarism. There's the second hint, should I try proxy-IP page blanking next? :) Franamax (talk) 10:07, 17 April 2009 (UTC)[reply]
There is an interesting discussion here on the importance of attribution for use of PD-US works, including these comments:

…there is a huge volume of uncited volcano info on Wikipedia that was actually taken from the GVP website either directly or indirectly…

The stuff on the GVP site is created by US govt employees and is in public domain, though many of the photos have individual copyrights…. We're an information service dedicated to the diffusion of knowledge, but getting proper credit is important so we can continue to justify our existence to maintain the only reliable database on volcanoes and eruptions.

It appears that most of the participants here agree that citation is important, but disagree on whether (a) verbatim text should be in quotes, and (b) whether, as a matter of best practices, a general citation template is an adequate substitute for specific citation. Is it fair to say those are the issues on PD text? Kablammo (talk) 13:02, 17 April 2009 (UTC)[reply]
I would phrase the issues slightly differently: A) whether verbatim free-content text used outside of original writing should be in quotes; B) how best to attribute the authorship of externally written text given the constraints of the current attribution system, which is sub-optimal for all authors. None of this is specific to PD text; it is just as true for GFDL text. And we may find CC-by-SA text imports more common than PD ones in the future.--BirgitteSB 13:23, 17 April 2009 (UTC)[reply]

(undent)...and I repeat myself: The answer exists outside of us, not within our WP:CONSENSUS. Somewhere there are resources that answer these questions explicitly. Find them. Read them. Follow them. Ling.Nut.Public (talk) 13:26, 17 April 2009 (UTC)[reply]

I agree completely, and whether something is or is not plagiarism needs to be decided by reliable sources, not Wikipedia definitions or our own reasoning. But the very term "plagiarism" is a freighted one, and if there is an actual solution which avoids hard feelings we should explore it. Kablammo (talk) 13:32, 17 April 2009 (UTC)[reply]
OK! Physchim62 (talk) 14:07, 17 April 2009 (UTC)[reply]
So find the rule and follow it. And whisper sweet words into the ears of those who don't. And give them a barnstar. Then delete their stuff.Ling.Nut.Public (talk) 13:43, 17 April 2009 (UTC)[reply]
Plagiarism is an inherently subjective moral issue that faces new challenges of definition in the specific context of Wikipedia. With all due respect, you ain't gonna find the answer in a book. Dcoetzee 04:33, 18 April 2009 (UTC)[reply]
Blah blah blah. Blah blah Wikipedia blah new blah wonderful blah changes all the rules blah blah we're special blah blah we're cutting edge blah.. If I had a dollar for every self-congratulating post on Wikipedia about Wikipedia, I could buy a nice house. Ling.Nut.Public (talk) 10:21, 18 April 2009 (UTC)[reply]

Plagiarism can be fought by other means, and does not need a "moral crusade"


Much of the debate has been about our "moral duty" to fight supposed plagiarism. Frankly, that gets my back up: I don't like being told how to be moral by other people, and I don't think I'm alone in that dislike! ;) In a Wikipedia context, I find it particularly disturbing that we should have a moral crusade against, say, plagiarism, and not one against the moral issues which I find more pressing, such as respect of privacy (especially for legal minors) or malicious edits from individuals or organizations with a clear conflict of interest. But still, I shouldn't fall into the trap of forcing my morals down other people's throats, and we are here to discuss a specific dispatch about plagiarism.

I am still less than convinced of the need to do anything at all over and above what we should be doing already, yet the urgency of such action seems to be the axis of the dispatch we are discussing here. We should cite our sources, probably even more so than we do when we write academic articles in our own disciplines (as many contributors to this discussion do professionally): nobody is disagreeing on that. But where Awadewit seems to see the heinous academic crime of plagiarism, I simply see an article which needs to be improved for our users. I agree with Birgitte that it seems to be hyperbole to even call it "plagiarism" at all, given the context of Wikipedia: they are simply articles which could be improved.

The problem is that article improvement is rarely a matter of removing a single type of fault, nor even of applying some centrally unified set of criteria. Just because article improvement is difficult doesn't make it any less important for the future of this encyclopedia. For me, "plagiarism" at Wikipedia is bad for two reasons:

  • the "plagiarised" text rarely fits with our other essential criteria, such as WP:NPOV and WP:V (the latter is a minor problem if the source text is readily available online, but not if it only exists in dead-tree form);
  • if Wikipedia is to be useful, and therefore justify the huge amount of time its volunteers spend on it, it should be offering something different from what is available elsewhere on the Web, not simply reproducing long verbatim texts. Wikisource exists for that, and its role is complementary to the other Wikimedia projects, but Wikisource also tries to "add value" to the texts that it hosts. Wikipedia should also be looking for ways to add value, not simply to reproduce the material which can already be found through Google.

I am interested to see if the authors of the dispatch can find some elements in there to agree with ;) Physchim62 (talk) 14:07, 17 April 2009 (UTC)[reply]

The essential question here is not whether Wikipedia should reproduce other public domain sources - this isn't very useful except as a means of centralization - but whether they should be used as seed material which is then edited into a consistent, interlinked, modernized article. This has happened with many EB1911 articles and I believe it's a valuable way of writing an article - others believe this practice is either immoral or simply a poor practice. There is clearly little consensus around this point (although the status quo is on my side) but there's clear consensus about many other points involving plagiarism (such as the need for attribution and not closely paraphrasing non-free sources). Dcoetzee 04:30, 18 April 2009 (UTC)[reply]

Some thoughts


1A) When a student takes a Wikipedia article and improves it only somewhat and turns it in as an assignment citing Wikipedia as a reference, this is plagiarism. 1B) When an editor takes a Wikipedia article and improves it only somewhat and saves those changes on Wikipedia, this is not plagiarism. 2A) When a student takes a CC-by-SA article from Citizendium and improves it only somewhat and turns it in as an assignment citing Citizendium as a reference, this is plagiarism. 2B) When an editor takes a Citizendium article, after Wikipedia becomes dual-licensed, and first dumps and in later edits merges it into a Wikipedia stub, which only contributes to the merged introduction, attributing the Citizendium author, this is not plagiarism. 3A) When a student takes a PD article from the internet and improves it only somewhat and turns it in as an assignment citing the website as a reference, this is plagiarism. 4B) When an editor takes a PD article and first dumps and in later edits merges it into a Wikipedia stub, which only contributes to the merged introduction, attributing the PD author, this is not plagiarism. What is the difference you all see between the B statements, such that some can be disagreed with and not others?--BirgitteSB 19:53, 18 April 2009 (UTC)[reply]

I think it's a philosophical issue for some editors, who feel that text that is the work of others should be either strictly quoted or fully paraphrased, one or the other with no grey areas. I have some sympathy with their desire that all work here should be original (not original research, but original writing) - but that's not how the project has worked in the past and many editors here now are not necessarily extremely good original writers but can still be very valuable contributors by providing PD "scaffolds" on which our better writers can eventually construct high-quality articles. I think it's important to note too that many people are interested in constructing all the myriad things other than written text that make the 'cyclo so valuable: wikilinks, categories, lists, navboxes, etc. None of that gnomery can be done if the article doesn't exist in the first place. Again, I think some editors are philosophically opposed to that too and focus on the "encyclopedia article" part of things. IMO, we need both approaches and PD-imports when carefully noted so the origin is clear complement both approaches. Franamax (talk) 21:41, 18 April 2009 (UTC)[reply]
Here we go again. I think this, you think that. We act as if our opinions are binding... which is an artefact of WP:CONSENSUS. Our opinions are not binding. Laws are binding. Generally accepted practices should be binding. We should be researching these. We are wasting everyone's time here, though it's admittedly quite entertaining. Ling.Nut.Public (talk) 01:31, 19 April 2009 (UTC)[reply]
Sorry Ling, but the only person's time you can choose to waste is your own. I by no means share the sentiment, I choose to use my time here, because I feel the topic is important and good-faith discussions may provide a solution. You may see it differently and wish to dismiss my efforts, that's OK too.
Laws are certainly binding, particularly those laws governing the operation of a private website, no argument there. GAPs, yeah, they should be binding within the context where they are practised. For instance, we don't apply shariah law here on this website, I'm not aware of any instances here where a woman's testimony is worth half that of a man. Beyond sarcasm, an appeal to authority to be found in some ideal book you envision, and a general wish to shut this whole page down - what's your point? Despite your pooh-poohing, Wikipedia is an unprecedented occurrence - or perhaps you have an historical analogy? No self-congratulation is involved, rather an honest attempt to grapple with new circumstances. Do you wish to participate in the solution? Franamax (talk) 02:24, 19 April 2009 (UTC)[reply]
Well no one has a problem not strictly quoting or fully paraphrasing when editing the work of others on Wikipedia, which is the example in 1B. So that is either not the philosophical issue involved or the concept of Wikipedia must be examined as plagiarism. I think the answer is the former.--BirgitteSB 03:57, 19 April 2009 (UTC)[reply]
It is a self-congratulating fallacy to insist that Wikipedia is something so new that it changes all the rules. Wikipedia is nothing new. At least, in the context of this discussion, it is nothing new. Wikipedia is a collection of pages. Each page has text on it. The text on the page may change a lot (sometimes in a short period of time), but each moment is a snapshot of a textual document. Within any given snapshot, Wikipedia either complies or does not comply with laws and/or generally accepted practice. As for the authors.. what is new about us? Why are we new in any way? We are not new in any truly meaningful way.. There have been anonymous authors since forever and ever amen. I'm sure many news articles without a byline are written by more than one author, the same as Wikipedia. I'm sure those documents undergo several revisions just as Wikipedia articles do. The only diff – and it is a tiny one – between Wikipedia and an anonymous article is that readers can see under the hood, and see who (anonymously, usually, at least in terms of real-life identities) added what at what time. People can see every incremental snapshot, to continue the analogy. Please, let's all stop flattering ourselves. Wikipedia is wonderful for two and only two reasons: it offers tons upon tons of (very often useful) info, and it does so totally free (well, at least, to anyone who has an Internet connection, so in that sense it isn't totally free, but nearly so). The world doesn't need to change all its rules because of us here at Wikipedia, though we would all very dearly love to believe it does. Ling.Nut.Public (talk) 06:01, 19 April 2009 (UTC)[reply]
Yes, by all means, let us follow the law in this regard and abide by the wishes of the owners/originators of the externally produced content. So when the owner/originator of externally produced text decides to give away their text as free content that allows modification, then we can use that as such pursuant to the terms they gave and in a way that does not present that content as originally authored by the uploader of the text here. IMO, attribution templates (perhaps with a diff to the original upload of unmodified text), edit summaries, notices on talk pages and notes at the end of inline cites are more than enough. The only gray area I see is for copyright expired content b/c we can't know the wishes of the previous owners of the text. --mav (talk) 15:49, 19 April 2009 (UTC)[reply]
No Birgitte, you have simply not specified the possible cases broadly enough to encompass everyone's objections yet, it is you who has failed. ;) I think the crux revolves around your 4B and even then it's not fully specified. Your 4B (and maybe you meant 3B, whatever) is about dumping PD-text for a portion of an article and then reworking it, if I understand correctly. I have three objections:
  • 4B. ...dumps...and merges...only...to the merged introduction - that is surely a special case or perhaps your own common experience, but definitely not what I've mostly seen or considered.
  • So one general case is where we dump free text to start an article or substantially expand an article. This is my canonical case, where the playing field is open, we can do the careful attribution, then proceed with merciless editing.
  • Another case is adding free text to an existing (and let's say "mature") article. This to me is much more problematic. If the article already has a leg to stand on, at that point I'd be much more likely to insist that you conduct proper authorship. If you're talking about changing a two-sentence viable stub into a four-sentence viable stub, at the expense of permanently maintaining a PD-attribution template, I'd say no! For some reason, I've always been thinking of substantial additions to new articles. Nice one BSB, you've just made it even more difficult. :) We very much do need to codify where PD-copies are acceptable, maybe you could chip in here.
Your 1B, editing within WP itself is easily covered by the strict terms of WP:GFDL - the Title and History sections are duly maintained (they're a little smeared up, I wrote a minor treatise on that once, but it's all there, our software does proper GFDL). GFDL exactly handles plagiarism - I am the Author of the Document under a new Title and I preserve the History section which includes the names of at least five of the previous major Authors. Something like that, I can look it up, but in a single MediaWiki article, it's not possible to plagiarize another named author, it's an inherent property of the implementation. BUT: But you missed a segment - unattributed copying between articles, unattributed interwiki copying, improper paste-merges... these are a huge problem and possibly will constructively invalidate the entirety of our GFDL license eventually. As far as plagiarism goes, silent internal copying is just as much a concern to me as silently copying Copernicus or Galileo to "improve" an article. We shouldn't do it. Franamax (talk) 07:21, 19 April 2009 (UTC)[reply]
Brief nod to Ling.Nut capering about manically. Yes, I'm watching, do your magic trick for Daddy. Franamax (talk) 07:21, 19 April 2009 (UTC)[reply]
I understand you are frustrated but personal attacks do not help. --mav (talk)

(undent) Wow. I thought I was rude. Franamax. I'm stunned. I'm speechless. No. Truly. I don't know how to reply, because.. because your remarks have so little connection with reality... no connection, really, none at all. No connection. None. Whatsoever. Moreover, if I gave a simple repetition of objective facts (see above) it would still seem like tit-for-tat. The real world exists outside Wikipedia, dude. Sincerely. Ling.Nut.Public (talk) 10:15, 19 April 2009 (UTC)[reply]

Yes, that was quite rude, wasn't it? I strike and apologize to you and any others offended. I am a little frustrated with your latter approach Ling, where you seem to be pouring scorn on efforts to develop a solution that works within Wikipedia. You've said that we just need to find some text out there in the "real world" that will answer the questions, and then you've started saying (and I quote) "blah blah blah". I don't find blah blah blah all that helpful - but I shouldn't go on to be even less helpful, for sure.
I disagree with you that there is a ready-made solution sitting out there in the real world. Massive-multi-author collaborations are a relatively new phenomenon, they don't have the benefit of 100 years of thought, and in the real world views on copy-and-extend are in flux right now.
And yes, the real world exists outside of Wikipedia - but that real world has noticed this site. The Economist writes articles about the site and points to it as the example for MMA collaboration. Only the GNU project is of similar scale and under GPL I believe they expect and encourage copy-and-extend. Franamax (talk) 16:39, 19 April 2009 (UTC)[reply]
You need outside information. You are not seeking outside information. Until you do, you're just talking through your hat. It's a bull session, and bull sessions are very very fun, but it has no meaningful result. I'm not saying anyone's positions are right or wrong; I'm saying all uninformed positions are irrelevant. Ling.Nut.Public (talk) 02:07, 20 April 2009 (UTC)[reply]
  • I was ill and wasn't able to follow up on this before it went off-track to where I do not have a good spot to reply. The terms of the GFDL, or any copyright situation for that matter, have nothing to do with plagiarism. In 1A even if the student received the explicit permission of the Wikipedia authors to plagiarize their work by contacting them on their talkpages and receiving a separate license of their copyrights for such a usage, it would still be plagiarism. Yet in 1B, no one is claiming the regular workings of Wikipedia are plagiarism. Also in 3A the terms of a release into the public domain allow such usage (not all PD is due to expired rights) just like the GFDL allows the activities in 1B, yet 3A is recognized as plagiarism and 1B is not. Copyright and licensing are irrelevant to plagiarism. So what is the objection that people see with 3B? I have outlined why it cannot be copyright, so what is it?--BirgitteSB 14:53, 20 April 2009 (UTC)[reply]

A very concrete example


The article Critical Analysis of Evolution at this stage of development was written by a single editor who releases their contributions into the public domain. If a student expanded on that text and turned it in for a class assignment without complete re-writing and quotation (which is allowed under a public domain release), it would be considered plagiarism. Yet this expansion on Wikipedia (also allowed under a public domain release) is not considered plagiarism. I have stated a solid rationale for why this can be accurate based on claims of authorship credit, but people dispute that rationale. How else can the one be described as plagiarism and not the other?--BirgitteSB 18:35, 20 April 2009 (UTC)[reply]

Simple. Wikipedia is not a class assignment and nobody here is advocating reuse of free content w/o noting that reuse. The student could note the reuse, but then he would likely get a failing grade b/c he didn't do the research and writing. That should be repeated; the intent of the assignment was to learn something by researching and writing. The drafted text is merely a means to judge the quality of that research and writing. Our reuse, however, does have a primary aim of ending up with a product in the form of an article. In our case, the research and writing is a means to an end but in the case of the student the research and writing is really the whole point. Product vs process. --mav (talk) 01:45, 21 April 2009 (UTC)[reply]
Yes, although I'm on your side in this debate (that we should be able to use PD text as the basis for an article), the justification isn't entirely about attribution - the point is that students write articles to learn, and we write articles to teach. A far better comparison is between a Wikipedia article and a published academic survey paper (i.e., a paper that reviews the existing literature, without making a contribution of its own). In such a paper, strict sourcing and indication of authorship is expected, but they can and often do lift entire sections wholesale from previous works if they have the rights to do so, as a means of expeditiously summarizing background or whatever. Dcoetzee 05:30, 21 April 2009 (UTC)[reply]
Per (mostly) mav, if the student put a note on their paper saying "this document incorporates work from the public domain" and indicated what the copied work was, it would imo certainly not be plagiarism. It would almost certainly garner a failing grade.
In the wiki case, the later editor is only claiming authorship credit under GFDL, which precludes any claims of plagiarism when republishing a previous GFDL work. 0. PREAMBLE says "...preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others." So long as you properly propagate the Secondary Sections (Title and History in this case, which is done by the software) - plagiarism doesn't enter into it. However, if that second editor had created an entirely different article by copying the first article and at the same time including their own modifications, without attribution to the seed text from the first article - yes, that's plagiarism. Within the normal wiki-editing process, authorship is only ever claimed for the Modified Version, which is explicitly referenced to the Document by means of the Title and History Sections. If you comply with GFDL, you can never plagiarize, ever. Franamax (talk) 08:13, 21 April 2009 (UTC)[reply]

Restated facts


"The sky is blue." How can one restate this statement without technically plagiarizing? Could it be that the English language itself is limited in expressing ideas? Someone brought up the point to me that a restatement of facts is not plagiarizing. Words are reused a billion times a day; even as I write this message, people have used every word I am typing in sentences. In my opinion, plagiarism is stealing a person's idea and taking credit for it, not restating the facts. Cmguy777 (talk) 21:22, 28 September 2010 (UTC)[reply]

  • "Blue is the sky."
  • "Blue sky."
  • "The stratosphere is at a 475nm wavelength."

[2] [3]

  1. ^ Fleming, James R. (2006). "Review of Superman on the Couch: What Superheroes Really Tell Us about Ourselves and Our Society. By Danny Fingeroth". ImageText. University of Florida. ISSN 1549-6732. Retrieved 2009-04-13.
  2. ^ "What Wavelength Goes With a Color?". Retrieved 2010-09-28.
  3. ^ "Meteorology Earth's Atmosphere". Retrieved 2010-09-28.
The WP standard is not as you present it, and our personal opinions do not matter. The first two statements are common knowledge to any WP-reading-aged, nominally intelligent individual. The third (scientifically nonsensical) statement, should it appear in WP, would not in any sense be common knowledge, and would require a source. There is no choice in such matters. The last is the result of a measurement, and is interpretable only in the context of physical theory. It is not common knowledge, per WP policy and guidelines. Le Prof Leprof 7272 (talk) 00:42, 18 October 2015 (UTC)[reply]

Please advise


There is no specific template message to tag for plagiarism? Does one need to start a separate Wikipedia plagiarism website to have these matters taken seriously? Please see the Wars of Cyrus the Great, where earlier in the day were found large blocks of text taken verbatim from an 1881 online text—situation at first resolved by converting two sections to long quotes, though this makes the section content based on historiography 130 years old—and on reviewing, I found a paragraph taken all but verbatim from a recent 2012 scholarly text. Given the remaining large blocks of text with few or no sources, it is likely that the rest of the article will be similarly unmasked. In short, the article should be pulled as a plagiarised piece. Copyvio tags were used to mark content, but this seems a misuse of these tags. How does one show that Wikipedia is taking this seriously? Le Prof Leprof 7272 (talk) 00:33, 18 October 2015 (UTC)[reply]

Looking beyond this one article, it looks as if all of the Cyrus military history material is similarly suspect, to one degree or another. Leprof 7272 (talk) 00:34, 18 October 2015 (UTC)[reply]
Postscript: the reason the citations look in good shape (this is deceptive; look in the edit history at the article's state before today) is that the existing citations, while being checked, were also completed (i.e., not left without date, author, publisher, title, URL, etc.). They appear "clean," not because the article is clean, but because they were made clean in determining the article was "dirty." Leprof 7272 (talk) 00:38, 18 October 2015 (UTC)[reply]