Sunday, 12 November 2017


Wikipedia talk:Plagiarism/Archive 7



Phrases 'may' require quotation marks

SV, you made this change which inserted "may" into "Note that even with inline attribution, distinctive words or phrases may require quotation marks." Have you seen what is written in the policy section of WP:NFC?

Articles and other Wikipedia pages may, in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author, and specifically indicated as direct quotations via quotation marks, <blockquote>, or a similar method.

Below the policy is a section called "Text" which seems to confirm the wording here ("Copyrighted text that is used verbatim must be attributed with quotation marks or other standard notation, such as block quotes") without the addition of the word "may". It seems to me that you need to address the issue in WP:NFC, and then modify this one once a change is made to that page. -- PBS (talk) 22:16, 29 October 2010 (UTC)

Regardless of what any page says, it's a matter of fact that writers do not need to use quotation marks to signal what someone said. Indirect speech is perfectly fine. There is no difference between "Philip said the shop had shut," and "Philip said 'The shop has shut.'"
One of the very frustrating things about WP is people trying to reinvent the wheel. We don't have to work everything out from first principles. :) SlimVirgin talk|contribs 22:22, 29 October 2010 (UTC)
There's no distinctive phrases in there, though. It was distinctive phrases we were discussing above and for which I supplied sources. (Plagiarism isn't a matter of fact, really. That cats are mammals is a "matter of fact," but plagiarism is not codified by law. It's highly subjective.) That said, I don't really object myself to the addition of the word "may" in that context. That section is about how to avoid any dispute of potential plagiarism, and it has the same effect whether the word "may" is included or not: if people don't mark distinctive phrases and encounter plagiarism disputes, they'll know they should have. --Moonriddengirl (talk) 22:36, 29 October 2010 (UTC)
SV, assuming the Philip is not me, how do I know if Philip said exactly that or if you are summarising? For example "Sir Robert Armstrong said he told the truth" is a summary of "Sir Robert Armstrong said he was 'economical with the truth'" but if I have no rule in place how do you know if the former is a summary or his words? In this example, I assume that in the sentence with the quoted words, "economical with the truth" is what he said and the rest is a summary of what he said. -- PBS (talk) 00:05, 30 October 2010 (UTC)
And if you want to add quotation marks to "economical with the truth," you're free to do that. This policy is about how we guard against plagiarism. The way to do that is to add in-text attribution when you're using another person's words, or close to them, and if you want to add quotation marks too when it's direct speech that's fine. SlimVirgin talk|contribs 01:27, 31 October 2010 (UTC)
We need a bright-line here. Quotes should be specifically denoted as such, without exception. Any other policy will lead to confusion and likely adulteration of the verbatim text by later edits. Neither are as likely to occur if quotes are encased in quote marks. Kablammo (talk) 19:56, 6 November 2010 (UTC)

Difference: Copyright and Plagiarism Issues

This guideline is constantly mixing up plagiarism and copyright issues which are totally different from one another. I am surprised I should be the only one noticing that (didn't read the archives, though). Detailed reasoning here. --Pgallert (talk) 10:57, 4 November 2010 (UTC)

I'm not that comfortable with the use of that term either and personally I consider "plagiarism" a problem only when it represents a copyvio of some sort or contains unsourced content as well. But for those 2 cases we have separate guidelines anyway. However, the guideline has been around for 2 years now, not so long that I would consider it a core RL, but I guess it has to be considered as somewhat established. --Kmhkmh (talk) 11:08, 4 November 2010 (UTC)
There are two distinct problems that the editors of Wikipedia face. The first is blatant copyright infringements, and minor changes to copied content used to disguise such infringements, which a malignant editor attempts to pass off as their own work. Often the issue is clear cut copyright infringement as well as plagiarism. This is fairly easy to detect and fix but extremely time consuming (and if you ever get involved in such cases with a serial offender I think you will find it rather unpleasant). So yes the two concepts are mixed up together.
My Sweet Lord! The second problem is more subtle and comes down to editorial judgement. If an editor is trying to add a paragraph (particularly if it is about a controversial subject) for which they have access to perhaps only one source, they have two pressures on them. The first is to summarise the source or sources accurately in such a way that it does not fall foul of OR, in particular syn. But pressing the other way is the need not to plagiarise the source(s). The line between the two can be a fine one and comes down to editorial judgement which may need input from several editors to reach a consensus over whether the wording in the article is plagiarism or not. But usually if the judgements are made in good faith then as this guideline makes clear they can be sorted out harmoniously ("Peccavi" (Charles James Napier)). A major indicator of the difference between a good faith mistake (which can be sorted out amicably on the article's talk page) and a malignant edit is when the person who has made what others consider to be plagiarism refuses to discuss the issue and suggest alternatives as compromises. Often such an initial negative interaction can be a red flag, that on investigation leads to other problems and often copyright issues -- that takes us back to my previous paragraph and delinquent editors if not outright malignancy.
This guideline also addresses the issue of whether it is ethical to copy copyright expired text into Wikipedia. If you look through the talk archives of this talk page, you will see that the consensus view (but not unanimous view) of editors is that it is OK to copy in to an article text from another source without quoting it, providing it is not a breach of copyright, meets the content policies and the source is properly attributed (as defined in this guideline). The archives will show that the consensus was that copying copyright expired text was not plagiarism providing the copy is adequately attributed, but to copy the text and fail to give adequate attribution is plagiarism. -- PBS (talk) 22:20, 4 November 2010 (UTC)
Kmhkmh there is a difference between adequate citation and adequate attribution as demonstrated by these two templates {{cite DNB}} and {{DNB}} (with the "inline=1" switch set):
  •  Dictionary of National Biography. London: Smith, Elder & Co. 1885-1900. 
  •  One or more of the preceding sentences incorporates text from a publication now in the public domain: Dictionary of National Biography. London: Smith, Elder & Co. 1885-1900. 
All DNB text is out of copyright so it is not covered by the WP:copyvio guideline, and as can be seen WP:CITE does not cover the issue of copying DNB text, only that DNB is being used as a source. Hence the need for this guideline. --PBS (talk) 22:32, 4 November 2010 (UTC)
I'm aware of that difference, however I'm saying it doesn't really matter much or rather personally I see both versions as somewhat "adequate" for WP's primary goal. From my point of view both versions are sufficient for WP's primary goal (providing free access to the world's knowledge in a legal and verifiable manner), putting a big emphasis on the "minor" difference between those is imho WP:CREEP.
I can understand why some people, in particular those involved with DYK, GA, FA reviews, feel a need for such a guideline and I understand that they do not want to reward "cheaters", who ripped off their work from somewhere else. I can understand that at this level the difference in the 2 attributions might matter and that the use of the "more adequate" one might be required.
However this is a general guideline and not just one for reviews or awards in particular. From the general perspective I only care that we provide free knowledge in a legal and verifiable manner. I care about an article's content itself being a correct and unbiased description of its topic, but I don't care that much about the details of the creation process and whether it is "original" work. Meaning whether somebody "cheats" by ripping off a part from the public domain or getting his or her big brother or sister to write it, whether it matches some academic honour code or satisfies some personal vanities isn't really of much concern for me nor imho should it be for WP on that level.
--Kmhkmh (talk) 00:42, 5 November 2010 (UTC)
For me the difference between the two is that one is a legal right, the other a moral right. They overlap in a plagiarised copyvio, but as PBS said, do not when the source is PD. I support distinct guidelines on the two subjects, since they are distinct subjects. Meanwhile it would be useful if Pgallert could point to places in this guideline which mix up the two concepts. --Tagishsimon (talk) 00:48, 5 November 2010 (UTC)
We all agree that they are distinct subjects with an overlap and we all agree that we need to deal with copyvio and proper sourcing for verifiability purposes. What we don't agree on is the exact nature of the "moral rights" concerned (and more precisely whose moral rights exactly?) in this context and whether WP should deal with that issue at all. Or to rephrase my comment above in a slightly different manner, WP's business is to provide free and correct knowledge/information; promoting/enforcing "moral rights" is not really part of its agenda. As far as the mixing is concerned, just check how many chapters of this guideline deal with various copyvio aspects.--Kmhkmh (talk) 01:48, 5 November 2010 (UTC)
I guess this is a spillover from Wikipedia:Administrators' noticeboard/Incidents/Plagiarism and copyright concerns on the main page and User talk:Jimbo Wales#Copyrights and plagiarism to name but two forums where there have been conversations over this in the last few days.[1] Kmhkmh, as you know (because you have been active commenting there), some people hold the views you hold and there are others who hold diametrically opposite views (see for example the comments by User:Fiona United). [2] As it takes little to no more effort for an editor to use one of the attribution templates than not (see the example above {{cite DNB}} and {{DNB}}), why not use them if that is a compromise that we all can live with? -- PBS (talk) 03:00, 5 November 2010 (UTC)
Don't get me wrong, I'm fine with recommending the "more adequate" template to authors or with editors switching to that version in an article. What I'm not fine with however is that this minor and from my perspective mostly technical difference becomes a big issue requiring its own guideline with a difficult name that has a lot of baggage and associations. Even more so if it is even considered a bannable offense and if there's a constant problem of mixing with other wider notions of plagiarism such as plagiarism of ideas, concepts, structures. The existence of such a plagiarism guideline also provides a constant incentive for people with strong views on the subject (for which however no clear community consensus exists) to incorporate them into the guideline. That's why I prefer to scrap this guideline altogether in favour of a "text theft" policy that Hans Adler has suggested.--Kmhkmh (talk) 03:34, 5 November 2010 (UTC)
I'm happy this suggestion comes up. We do not need any instruction on plagiarism; the three existing pieces, OR, SYNTH, and (a future, improved, and possibly promoted, version) of Wikipedia:Close paraphrasing (has already been substantially changed yesterday) cover all possible cases. Maybe a short notice can remain. It is also important to point out that what R. did (don't wanna beat dead horses but this needs to be pointed out) was vandalism, not plagiarism. Maybe a notice somewhere in WP:VANDAL should mention that. --Pgallert (talk) 07:04, 5 November 2010 (UTC)
@Philip, there is a huge difference between {{cite DNB}} and {{DNB}}: One template acknowledges that you got the idea from there. The other one mentions the possibility that you got the text from there (the idea+the wording). That's exactly the difference between plagiarising and not plagiarising. --Pgallert (talk) 07:04, 5 November 2010 (UTC)
I am not clear what you are saying. In my opinion if text is copied from DNB and there is not adequate attribution (such as given in the template {{DNB}}) then that is plagiarism, but if the text is copied and adequate attribution is given then that is not plagiarism. Is that your understanding? -- PBS (talk) 11:37, 5 November 2010 (UTC)
See examples below. {{cite DNB}} attributes the idea, the facts as presented. {{DNB}} additionally attributes the wording. If something is closely paraphrased, only {{DNB}} can be used. --Pgallert (talk) 12:15, 5 November 2010 (UTC)
  • {{cite DNB}}:  Dictionary of National Biography. London: Smith, Elder & Co. 1885-1900. 
  • {{DNB}}:  This article incorporates text from a publication now in the public domain: Dictionary of National Biography. London: Smith, Elder & Co. 1885-1900. 
@PBS, that second problem must be solved at WP:OR, not here. Original research is neither legally nor morally wrong, it is just not wanted by WP. Copyvio is the former, paraphrasing a free source is the latter. I feel it is the OR policy that needs to give if that problem cannot be resolved by a clear content guideline. --Pgallert (talk) 07:04, 5 November 2010 (UTC)
I follow what you have written up to "paraphrasing a free source is [morally wrong]". Do you mean plagiarising a free source is morally wrong? If the source is under the same or similar licence as Wikipedia, for example Citizendium, and adequate attribution is given, is copying text from Citizendium for further editing on Wikipedia plagiarism in your opinion? -- PBS (talk) 11:37, 5 November 2010 (UTC)
Sorry, I was too muddy there. "Closely paraphrasing a free source without explicitly acknowledging the close paraphrasing is wrong" -- that's what I wanted to say. --Pgallert (talk) 12:15, 5 November 2010 (UTC)
Why?--Kmhkmh (talk) 14:44, 5 November 2010 (UTC)
Because without it, one attributes only the idea, but not the language, which was also "borrowed", see the similar-named thread on Wikipedia talk:Close paraphrasing. --Pgallert (talk) 14:54, 5 November 2010 (UTC)
I suggest we leave morality to our respective churches. Wikipedia has enough on its hands as it is, without trying to make its editors better human beings. Physchim62 (talk) 15:24, 5 November 2010 (UTC)
Well I asked regarding the morality. First of all there's indeed Physchim62's point that we probably should not deal with morality concerns to begin with, but even if we do, I don't see a moral problem here. If we are talking about WP internal rewards (awards and competitions for them), I can see a moral issue with "cheaters". But why should this concern ordinary article work of people who are just compiling free information into WP? There the only concern for me is that the compiled content is legal, correct and verifiable and I see nothing wrong with "plundering" public domain content. Why should we care much whether an article is closely paraphrased or not as long as its content is legal, correct and verifiable?--Kmhkmh (talk) 16:46, 5 November 2010 (UTC)

Kmhkmh do you accept that other editors -- from the discussions over the last few days particularly some of those who come from a strong academic background -- do think that this is a moral issue? I think it is best reflected in the section in the guideline called "Why plagiarism is a problem", specifically the last point. So as I see it there are two reasons for endorsing this guideline. First it is a compromise that most editors can live with, and secondly for a relatively minor cost in effort by editors it enhances the reputation of Wikipedia with external parties.

There are many editors who do not see the need for in-line citations -- I am discussing this issue with such a person here and it is because (s)he comes from a tradition (amateur genealogy) where it is not demanded -- but since 2005/06 when newspaper articles were rife with the unreliability of Wikipedia most experienced Wikipedia editors accept that citing sources is a good idea, not just, or even mainly, for internal disputes, but also for the external credibility and approval of the project. In hindsight I think that this change in emphasis from quantity to quality can be dated to this edit by SlimVirgin in August 2005 when she added "The burden of evidence lies with the editor who has made the edit" to the verifiability policy. Legally we do not have to add citations; we enforce it partly because if the practice had not been introduced and been implemented project wide, the project would have failed (since the widespread adoption of in-line citations the number of press articles on how unreliable Wikipedia is has dropped substantially). I don't just see enforcing anti-plagiarism rules as a morality issue, it is also an issue, like citing sources, that enhances the external credibility of the project, particularly among academic circles and opinion formers in the mass media (who would like such a coat rack to whip up a storm of moral indignation (a raison d'être)). -- PBS (talk) 20:59, 5 November 2010 (UTC)

From my perspective there is a big difference between inline citations (or better proper sourcing) and plagiarism. Proper sourcing (which btw doesn't always have to come in the form of inline citations) is essential to WP's primary goals because it is needed for efficient verifiability and because we work with anonymous authors, whose reliability we cannot directly assess. However I don't see how that holds for plagiarism. If you see plagiarism not as a morality question, but just as another rule the community has temporarily adopted or is at least partly in favour of, then I say it is a rule not directly related to WP's primary goals and that in general we should avoid rules that are not really essential and where the consensus might be questionable (see WP:CREEP). Furthermore I agree with Physchim62 here, that you could argue that this rule does more harm than good.--Kmhkmh (talk) 21:27, 5 November 2010 (UTC)
The importance of moral comes directly from pillar 1.
  1. We're building an encyclopedia, and I would say we're on the right track.
  2. That puts us in a scientific context, no matter how many goblin or Lady Gaga articles are added.
  3. In a scientific context you're just not credible if you do not cite properly.
  4. Close paraphrasing needs to be marked as such, otherwise it is not cited properly, legal issues aside.
  5. We want to be credible, otherwise we're not a good encyclopedia.
I do agree that this guideline should go altogether, but this is only because its content is well covered in OR, SYNTH, and the essay Wikipedia:Close paraphrasing. Cheers, --Pgallert (talk) 11:23, 6 November 2010 (UTC)
I completely agree that WP as an encyclopedia needs to pursue a scientific outlook on the world and the topics it covers, but that has nothing to do with the plagiarism issues. What the scientific approach requires is verifiability and that requires citation and not necessarily attribution. In fact you can even argue that plagiarism is irrelevant for the science itself. Either content can be verified or not, either some experiment can be replicated or not, that's what the scientific approach requires. Plagiarism doesn't really enter the equation at that point. Plagiarism enters the equation when people claim recognition, awards or (intellectual) "property", so it is essentially a moral argument about (intellectual) ownership and recognition and has nothing to do with the science itself. Similarly our credibility as a reliable free source of knowledge primarily rests on verifiability (=citation) and not on attribution. The primary interest of our readers is to get free reliable information and when in doubt they care very little for details on the attribution or who might or might not deserve any kudos. Having said that, it is of course true that the attribution template is often better than just citation, simply because it describes the actual situation more accurately, i.e. instead of just saying the following reference verifies this piece of content of the WP article it also informs the reader that the reference's text has been copied as well. However the main and most important part of the information is the former, which can be dealt with by citation only. The latter is nice to have as well, hence we definitely should recommend it as well, but imho it is more of a marginal improvement, which does not need its own guideline, nor is it even worth all that fuss.--Kmhkmh (talk) 16:37, 6 November 2010 (UTC)
Essays have no force whatsoever, so I'm afraid that if we do away with this guideline, contributors are free to ignore the essay Wikipedia:Close paraphrasing to their heart's content. --Moonriddengirl (talk) 11:32, 6 November 2010 (UTC)
Unless we promote Wikipedia:Close paraphrasing, of course. That's what I had in mind. I am of course anything but optimistic about the feasibility of such drastic change. --Pgallert (talk) 14:14, 6 November 2010 (UTC)
Whatever is done, we, none of us is in a position ever to judge this in terms of a lack of morality. None of my academic degrees give me that knowledge or power.(olive (talk) 17:18, 6 November 2010 (UTC))

Altering the lead

Based on the above and on numerous discussions over Wikipedia for the last few days, it is obvious that there is still considerable confusion about the difference between copyright and plagiarism. I think we need to clarify this guideline to make the difference clear. I hope others will agree. I have modified the lead with a primary purpose of making that difference clear, but with a secondary goal of simplifying it. As this is not an article, we do not need to summarize the entire contents in the lead. As long as we make plain that answers are provided below, I think the lead does not need to restate the guidance the document provides but assert basically what it's doing and what can be expected from it.

I've also added a {{More}} tag to the Wikipedia:Plagiarism#Sources under copyright, while I'm at it. I think the guideline could use some more work throughout, but wanted to get feedback to what I've already done before proceeding. --Moonriddengirl (talk) 12:06, 5 November 2010 (UTC)

Well you know what I think should be done with this pernicious "guideline" ;) Physchim62 (talk) 15:25, 5 November 2010 (UTC)
Yes, I do, and you know I disagree. :) As I've said elsewhere, whatever Wikipedia's stance on plagiarism, we need it written down. There are so many different ways of looking at the issue, and the recent uproar demonstrates that people will use the term as they understand it. Being able to say "We mean it like this" is, in my opinion, a good thing. --Moonriddengirl (talk) 18:47, 5 November 2010 (UTC)

Plagiarism discussion from Rlevse TALK moved to WP:VP

Discussion

IP comment from mid-thread:

Here [5] is an article attempting to explain why in some nations, like South Korea and Japan there is much scientific fraud happening (anyone remember Hwang Woo-Suk?), whereas in places like Singapore and Taiwan it is much less pervasive. The authors point to an absence of frank discussion and the fact that the culture gives far more clout than is healthy to the successful. Make no mistake, this is a cultural problem. The Facebooking needs to stop, and there must be more discussion, no matter how uncomfortable it will be. 128.226.130.242 (talk) 22:36, 5 November 2010 (UTC) [6]

I am not sure how to translate that article's information to our situation, but I am sure we can learn something from it. Hans Adler 23:20, 5 November 2010 (UTC)

While thinking of sources, you may have noticed the links discussed here, the Office of Research Integrity seemed quite useful though not very detailed. . . dave souza, talk 23:52, 5 November 2010 (UTC)
which links to Guidelines for avoiding plagiarism, self-plagiarism, and questionable writing practices which contains a detailed explanation of why scientists dislike plagiarism and how they should avoid it. -- PBS (talk) 01:24, 6 November 2010 (UTC)

Well WP's primary mission is a free encyclopedia, that is free access to correct knowledge. Why do we need to bother with problems in academia? Those are primarily their problems, not ours. The fraud that WP needs to be concerned about is forged or (heavily) biased information and not copied information. As far as a cultural component of plagiarism is concerned, that definitely exists (some information on it can be found here: http://www.rogerclarke.com/SOS/Plag0602.html ).--Kmhkmh (talk) 10:17, 6 November 2010 (UTC)

This has been discussed in the past, both when this was promoted and when it was listed for deletion. Among the arguments advanced, the world at large does care about plagiarism (even if some nations like South Korea and Japan do not). We wind up with stinks like this one, which hit the mainstream press ([7], [8] and [9]) And note Jimbo's statement in this one: "in general we take a very strong anti-plagiarism stance") Our plagiarism stance is actually quite moderate; if it's public domain or compatibly licensed, you can have it, but you have to acknowledge that you copied it. Straightforward and simple. Do we really want to be in the position of having to explain to the media in such cases that we consider plagiarism to be somebody else's problem, not ours? I'm pretty sure that wouldn't do good things for our reputation. :/ --Moonriddengirl (talk) 11:07, 6 November 2010 (UTC)
Well apparently this is an issue coming up over and over again (which could be seen as a sign of how problematic this guideline is, but I guess you could also see it as a sign that it's needed). As far as our reputation is concerned, of course we should pay attention to it, but we are not politicians and we should not adopt measures that may hinder our primary goal. If our stance on plagiarism is basically that one line you write (which is fine with me at least as a recommendation), then I have to ask: why is it not prominently featured in the lead and why does it get blown up to such a big guideline? Or even why does this one-liner need its own guideline at all?--Kmhkmh (talk) 11:40, 6 November 2010 (UTC)
This guideline has only been a guideline since the middle of 2009, so it can at worst be faulted for issues since then. :) The guideline is huge because the issue is a hot-button one; people have since it was created tended to bloat it to cover the differences of definition. (I'm not pointing fingers here; I've bloated it plenty myself.) Unlike copyright issues, which are defined by law, plagiarism is more or less up to a community to decide, and the definition of plagiarism varies not only by culture, but even by discipline. We also need to make clear to people that there are cases of copying that are not plagiarism, because the phrases are common coin or what have you. A certain degree of bloat is necessary, since we need to explain how to acknowledge that you've copied it. I believe that we could stand to heavily revise the guideline and sort of launched that above with the lead, but I know that revisions to this guideline are controversial, so I want to go section by section, allowing time for discussion and stability between. --Moonriddengirl (talk) 11:49, 6 November 2010 (UTC)
Despite the title, "Plagiarism by Academics: More Complex Than It Seems", that paper's content has greater relevance to editors working on an encyclopædia than this page's external links. For those in a hurry, the emboldened section is pertinent to what we are doing here. I recognise their authority to provide a balanced overview, so I reckon we should add it.
Regarding ownership and ethics, we should not be arguing from a philosophical position (or acknowledge it for what it is) "Property is liberty", [yet], "Property is theft." Proudhon. What is Property? 1840. Anyone reading multiple reference works - text books, encyclopædias, dictionaries - would know that they are often nearly identical, unless they are biased, and full of what would be regarded as plagiarism for someone engaged in OR or gaining an academic qualification. When I find this awkward phrasing, expurgation when it became too tedious to rephrase, and the subsequent dilution of meaning in what Clarke names as 'defensive writing', I think that is more likely to be detrimental to wikipedia than the repetition of what is likely to be found in many other similar works. Considering the extent to which the popular media relies on en.wp, or plagiarises it, I'm not concerned at the stories they beat up, or indeed create. cygnis insignis 13:34, 6 November 2010 (UTC)
I agree, well said.--Kmhkmh (talk) 16:41, 6 November 2010 (UTC)
Moonriddengirl, Jimbo has also said that "Even in cases where the attribution is done poorly, as long as there is attribution, there is no plagiarism - just bad style or bad writing." We are setting a higher standard in this guideline, saying that close paraphrasing with attribution is plagiarism, and this standard may simply be unrealistic. If even an arbitrator, bureaucrat, admin, checkuser and long-term editor like Rlevse can fall foul of our plagiarism guideline (though not necessarily our copyright policy), what realistic chance do we have that occasional contributors, including children writing here about their favourite computer games, will comply with it? There is an argument to be made that we should concentrate on compliance with copyright law, and that we should accept that plagiarism is widespread in this project and will never be eradicated, given WP:OR policy. --JN466 15:00, 6 November 2010 (UTC)
Here is Rlevse's source. Here is Rlevse's edit. Now, was this a clear, deliberate and/or actionable copyright violation? If so, fair enough. But if it wasn't, is it worth the human cost to the project to insist on the enforcement of plagiarism standards that were designed for original research in the media and academia, i.e. a completely different context than the one we have here, where we are not expected to do original research? --JN466 15:25, 6 November 2010 (UTC)
I don't know how Jimbo meant it, but in my opinion a citation instead of a clear notice that text was copied is not "attribution [...] done poorly", it is no attribution at all. Jimbo's comment was a response to Wnt's comment about "wholesale plagiarism of the 1911 Encyclopedia Britannica, done without consent, with skimpy attribution, and only legal because the U.S. declines to enforce perpetual copyright". As far as I can tell that "skimpy attribution" probably referred to the boilerplate "Initial text from 1911 encyclopedia", which John Vandenberg later claimed is not enough because it doesn't mention the author.
Rlevse's edit is a clear copyvio. He gave the correct source, but he didn't attribute it, i.e. he didn't write something like "This text is derived from a USA Today article." If he had done that it would immediately have been spotted as a copyvio and would never have made it on the main page. Hans Adler 15:50, 6 November 2010 (UTC)
Well, he gave the wrong source in that edit, actually (carolshouse.com), in what I believe was an honest mistake (which he later fixed). Neither of us are copyright lawyers, and I would prefer the judgment of a professional. Even if the passages that Rlevse used unchanged do amount to a copyvio, we are presently classifying close paraphrasing with attribution as plagiarism as well, and I am not clear that close paraphrasing with attribution is a copyright violation. According to Jimbo, close paraphrasing with attribution is not plagiarism, as there is no intent to pass off material as one's own. If there is an inline citation, that tells the reader where you got the material from. --JN466 16:02, 6 November 2010 (UTC)
I agree--Kmhkmh (talk) 16:40, 6 November 2010 (UTC)
Also, as far as copyright violations are concerned, there is a tendency in Wikipedia to pretend that this is a very simple and straightforward matter: "He's copied a sentence verbatim, so therefore it's a copyright violation." It is time we lost this Mickey Mouse understanding of copyright, which seems to be partly motivated by the enjoyment of being able to wave a stick at another editor, because in the real world, this is actually a little more complicated. In determining whether or not a use of a copyrighted work is a copyright infringement or not, a number of factors come into play:
  1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
  2. the nature of the copyrighted work;
  3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
  4. the effect of the use upon the potential market for or value of the copyrighted work.
Looking at this specific case: 1. Wikipedia is a nonprofit educational work; 2. the copyrighted work was a newspaper article from several years ago, which is not for sale, but openly available online - a non-artistic work which recounted historical facts; 3. the amount and substantiality of the portions used verbatim was minor; and 4. the effect on the potential market or value of the copyrighted work was if anything positive, the citation directing Wikipedia readers to the article, increasing its page views and thus adding to the advertising revenue that USA Today will have received from the page. Hence I am by no means certain that this was a copyright violation in the eyes of the law of the real world, out there. --JN466 16:52, 6 November 2010 (UTC)
The question of what constitutes plagiarism is strictly up to the determination of the community; there is no law anywhere that identifies it. The Wikipedia community can determine to hold the strictest academic standards or none at all (though if we go with none at all, we will have to deal with the fall-out of that decision as we do with all). What matters most to me is that we are clear on our expectations, whatever they may be, so that we avoid confusion and uproar around issues that distract from the real purpose here. In terms of our treatment of copyright as a straightforward matter, we really have no choice. Copyright is very complicated in the real world, and determinations of when material violates exclusive rights are made by the courts. Lawyers will fight to the wire over their own interpretations.

In terms of the current case, I like User:Rlevse, and I'm very sorry that this whole incident has happened. I'm even more sorry that he chose to depart; he did a lot of good for the project. The edit in question here may or may not have constituted copyright violations; in the absence of a court determination, we'll never know. But it was, unfortunately, a clear violation of our copyright policies. We have certain structures in place when importing non-free content. Wikipedia's policies and guidelines are, according to Wikipedia:Policies and guidelines, "developed by the community to describe best practice, clarify principles, resolve conflicts, and otherwise further our goal of creating a free, reliable encyclopedia." Making sure that our content is well within fair use both for us and for the reusers who are part of our mission seems like a very good thing to me; by the time a copyright holder sends us a take-down, those who trust that our content is free may have already published it on their own websites or even in their print publications. (You know, of course, that Wikipedia's nonprofit status has no bearing; that's why we don't accept content that is licensed for non-commercial use only. I'll leave your presumptions about the pleasures of stick-waving alone.) --Moonriddengirl (talk) 17:26, 6 November 2010 (UTC)
If something is in copyright we can't copy it except to the extent explicitly permitted. As an international project that explicitly allows and encourages reuse of our text, we cannot rely on any individual country's exceptions. We can only rely in a general way on what the Berne Convention allows: wikisource:Convention for the Protection of Literary and Artistic Works/Articles 1 to 21#Article 10. Please note what is required there:
  • "[...] permissible to make quotations [...] provided that their making is compatible with fair practice, and their extent does not exceed that justified by the purpose, including quotations from newspaper articles and periodicals in the form of press summaries."
  • "Where use is made of works in accordance with the preceding paragraphs of this Article, mention shall be made of the source, and of the name of the author if it appears thereon."
This is essentially the fair use exception. Although it is not made explicit but only hinted at with "fair practice", it seems clear that it requires:
  • Mark the copied passage clearly so that everybody can see where it comes from.
  • Make it absolutely clear that the text is copied from the source, not just the ideas taken from there.
  • Do not change the text.
  • Do not slap a GFDL or CC-by-SA on the copied content to encourage others to copy it further.
Whether we are making money from it is only relevant to a fair use claim, but as you can see that simply doesn't apply. Items 6 to 8 here are also relevant. This gives a good overview of what you can legally do with someone else's text in the UK. By all criteria, Rlevse's edit was a clear copyvio as a derivative work that was not fair use. Hans Adler 17:32, 6 November 2010 (UTC)
Fair use is enshrined in the Berne Convention. Everything after "it seems clear" above appears to be your personal opinion, when in fact it is subject to legislation in the individual member states. CC-by-SA includes an explicit disclaimer saying that noninfringement is not guaranteed, and that the onus remains on the reuser. And frankly, my heart bleeds if VDM Publishing have to do a little bit of checking before they sell your and my work for $50 a piece. (They have close to 150,000 books on the market so far, all based on Wikipedia.) Different countries have different laws (obscenity in Saudi Arabia, anyone?); there is no intent to ensure that all our material is legal in every country of the world, and the onus is on reusers to check compliance with legal requirements in their country. Always has been. Wikipedia is governed by US law. --JN466 19:28, 6 November 2010 (UTC)
This is not the place to change copyright policy, but, for the record, while your heart may not bleed for our reusers, my heart is sometimes engaged by the copyright holders. Just last week, I deleted content from two articles that had been here for years after the copyright holders wrote us to complain. If you're bitter about VDM Publishing selling your work (you may not be; but you sound that way), I wonder how they feel about Wikipedia's contributors giving it to them. --Moonriddengirl (talk) 19:41, 6 November 2010 (UTC)
MRG, I am always with you when it comes to respecting our sources' copyright, and you're doing a wonderful job. But I also want us to be sure that we don't clobber Wikipedians for copyright violations that aren't copyright violations in the real world. --JN466 19:50, 6 November 2010 (UTC)
I know you're always good with copyvios. But I think we need to be careful; we need to make sure that we don't relax our copyright policies to the point that we start letting borderline content in. If we do, it's a simple matter for us to remove it on a take-down, but not a simple matter for reusers...who can do considerable inadvertent damage to copyright holders just by taking our invitation to use their content. I think what we need to work on here is simply making sure that we remember what WP:AGFC is all about: not everybody who violates our copyright policies (which is not automatically the same thing as violating copyright law) is intending to do any harm, and until we know otherwise, we should treat them with respect and civility. (If we know otherwise, we should still treat them with civility, though I reserve the right not to respect them. :)) --Moonriddengirl (talk) 19:56, 6 November 2010 (UTC)
(edit conflict) What I will say about VDM Publishing is that if protecting downstream users of our content boils down to enabling this business, then perhaps we should rethink our strategy. Wikipedia is freely available online around most of the world. Why do we need downstream commercial re-publishers of our content? What worthwhile downstream commercial reusers are there? --JN466 20:01, 6 November 2010 (UTC)

To be clear about my view: I don't think the edit under discussion here is a copyright violation, although I wouldn't want to argue that point very strongly, as I think reasonable people can differ on that. But I really don't think it is plagiarism, because it was attributed. Having said that, it was excessive paraphrasing and not attributed in a way that I think reflects best practice. So while I disagree with Hans Adler's view that this is plagiarism, I very very strongly agree with him that it is a bad edit and well beneath what we should shoot for. I'm just reluctant to overuse the word 'plagiarism' for every sort of inadequate writing involving poor attribution practices.--Jimbo Wales (talk) 16:59, 6 November 2010 (UTC)

I agree that the term plagiarism is a very loaded one. This guideline used to specifically address some of the diversity of definitions in this section, but even in its current formation the guideline says, "Given that attribution errors may be inadvertent, intentional plagiarism should not be presumed in the absence of strong evidence. Remember that contributors may not be familiar with the concept of plagiarism or that their definition may differ from that adopted by Wikipedia" (footnote omitted). --Moonriddengirl (talk) 17:26, 6 November 2010 (UTC)
For plagiarism it's not necessary, but for copyvios we should really have a policy that was vetted by an international copyright lawyer, which can then play a role similar to BLP. I just can't see how that edit can be seen as anything but a copyvio. Fair use only applies when you copy something literally, not for changing it and incorporating it into your own text. And the length of the passage makes it clear beyond any doubt that it was created as a derivative work. I can see how you may get away with it in court, but that's an entirely different matter which an individual editor can consider privately to base a single decision on, but which cannot inform our general practice as a community. Hans Adler 17:40, 6 November 2010 (UTC)
One problem we have is that so many of the judgement calls we have to make have never been tested before any court: we are literally making up much of our copyvio policy as we go along, because we have no other option if we're going to do what we want to do! To give a concrete example for the copyright wonks, is an SVG image protected as an artistic work or as a computer program? The distinction is very important in European copyright law (less so in the U.S.) I don't think we're ever going to be able to afford an international copyright lawyer to take on the responsibility for our entire copyright policy. What we can do is accept that we are a valid part of the interpretation of copyright law in general (as opposed to any specific case which might involve WP). Hans or MRG giving a presentation at a conference on copyright law? Why not? Physchim62 (talk) 18:00, 6 November 2010 (UTC)
I agree with Hans that we have had enough laypeople pronouncing on copyright. And if for the purposes of obscenity we are governed by US law, I don't see why copyright should be any different. --JN466 19:38, 6 November 2010 (UTC)
Well, "according to Jimbo Wales, the co-founder of Wikipedia, Wikipedia contributors should respect the copyright law of other nations, even if these do not have official copyright relations with the United States.[10]" (WP:C). That said, we are governed by US laws. --Moonriddengirl (talk) 19:45, 6 November 2010 (UTC)
Thanks for clarifying that. --JN466 20:01, 6 November 2010 (UTC)
The other point to remember is that each individual editor is constrained by his or her own local law. As a Spanish resident, I cannot do anything which is in contravention of Spanish law, even if it would be legal in the State of Florida. I've never found that to be a great practical problem, but it's something to bear in mind when we're discussing the "philosophy" behind WP policies. Physchim62 (talk) 20:16, 6 November 2010 (UTC)
I'd have to disagree with Jimbo on whether the edit we are discussing is plagiarism. When I contribute text to Wikipedia, I am releasing my contribution under the CC-BY-SA licence, as my work, either original or derived. Since much of what I contribute contains citations (for verification) - and yet these are my own unattributed contributions, released by me under CC-BY-SA - it is clear that citation does not equate to attribution. We have the technique of quotation for the purposes of attribution, and that should be used when any uncertainty exists. If I were to write an amount of text that is identical or substantially the same as a source, and did not make it clear that the text I contribute was essentially someone else's work, then I would be claiming it as my own by default - and that is what this community recognises as plagiarism. --RexxS (talk) 21:18, 6 November 2010 (UTC)
  • There is a useful article here by Irving Hexham that argues that plagiarism standards for textbooks and encyclopedia articles need to be quite different from those that apply in other academic contexts. It is not appropriate for us to work with definitions of plagiarism that were conceived for a different context than the one we are working in, i.e. for contexts demanding original writing vs. contexts like ours, where the task is summarising existing knowledge, and there is no pretense at presenting original research:
  • "5. Discussion and caution: In judging that an author plagiarizes great care must be taken to ensure that careless mistakes, printing errors, inexperience, and even editorial changes made by a press are not used as accusations against an innocent person. Further, it is necessary to recognize "common usage" and the nature of the writing itself. For example many basic textbooks contain passages that come very close to plagiarism. So too do dictionaries and encyclopedia articles. In most of these cases the charge of plagiarism would be unjust because there are a limited number of way in which basic information can be conveyed in introductory textbooks and very short articles that require the author to comment on well known issues and events like the outbreak of the French Revolution, or the conversion of St. Augustine, or the philosophical definition of justice. Further, in the case of some textbooks, dictionaries, newspaper articles and similar types of work both space and the demands of editors do not allow the full acknowledgment of sources or the use of academic style references. It should also be noted that many more popular short pieces, like oral lectures, are produced from old notes and memory. Professors often don't know from where they got a particular definition or description of a well-known figure or event. As long as such writing deals with things that are essentially public domain, even though at times specific wordings may be very similar indeed, this is not plagiarism because it does not involve deliberate fraud. For example, it is almost impossible to describe the origins of something like the Watergate Affair in 300 words without using almost identical words to anyone else that attempts to describe the same event. The intent of the writer should [be] the key issue in recognizing plagiarism. 
For example in the early years of this century the best-selling German author, Karl May (1842-1912) was accused of plagiarism because his adventure stories contained descriptions of landscapes and urban settings which were clearly culled from travel books. May did not deny this. He simply argued that to judge his works as plagiarized because he borrowed geographic descriptions in which to set his stories was to totally misunderstand the function of the storyteller. Someone spinning a yarn may borrow freely if they reuse the original material in such a way that the final product is not dependent on what has been borrowed to create the setting. It is therefore seems necessary to distinguish between academic and other types of writing and to ask what is the reader led to believe an author is doing. If a book or thesis contains academic footnotes, is written in an academic style, and is presented as a work of original scholarship, then it must be judged as such and measured against the accepted rules for citation found in sources such as The Chicago Manual of Style."
  • What we are doing is not "presented as a work of original scholarship"; quite the contrary. There may well be people who see things different from Hexham, but if we want to consider how plagiarism applies to us, we should look at sources that address our specific context. --JN466 04:18, 7 November 2010 (UTC)
Thank-you for finding that article. It contains several important points that should be made in the guideline if this has not already been done. To add my bit to the general discussion, as far as the section of this guideline dealing with allegations of plagiarism goes, I think the important points are that the person making any allegations needs to approach the editor directly (and name specific articles and specific problems, rather than making vague accusations) before raising the matter elsewhere (there is too much of a tendency for rumour and innuendo to circulate off-wiki), and that editors whom such allegations are made against should make efforts to deal with specific problems with named articles, rather than walking away from the problems raised. For full disclosure, I should say here that I was one of those who off-wiki allegations were made against earlier in the year, to which I responded indirectly with this statement (I added this diff to my user page and user talk page as well for several months, and only removed it recently). No-one responded to that, and I only ended up spot-checking my past contributions (rather than reviewing the entirety of my past contributions) but I am always available to discuss on-wiki any concerns raised with any of my editing. Just as some people pledge to not retire dramatically, I think people should pledge to deal with such matters when clearly pointed out, rather than walk away from the issue. Carcharoth (talk) 14:41, 7 November 2010 (UTC)
This view of plagiarism has been discussed in the past on Wikipedia; however, the handling of content in the guideline currently reflects community consensus on what our standards should be. Consensus would need to be gathered to alter the core approach to handling previously published content. --Moonriddengirl (talk) 14:58, 7 November 2010 (UTC)
Is that a response to Jayen466 or to me? For the record, I agree with the approach currently outlined here, but one aspect that is omitted there is what to do if off-wiki allegations are made (on internet fora and in chat channels). I suspect the best answer is to ignore such rumours and gossip unless approached directly, but some editors will (understandably) want to respond to such matters without being dragged into off-wiki arguments, and possibly some advice could be given to editors faced with this sort of situation? Carcharoth (talk) 16:21, 7 November 2010 (UTC)
Generally to the concept; not specifically to anybody. :) I threaded under you just to clarify where your comment ended and mine began. I didn't weigh in on the question of how to notify because I agree. I think it's pretty much already in line with Wikipedia:Plagiarism#Addressing the editor involved. It doesn't suggest specific examples, but I think it's suggested in "In addition to requesting repair of the first instance, you may wish to invite the editor to identify and repair any other instances of plagiarism they may have placed prior to becoming familiar with our guideline." --Moonriddengirl (talk) 17:12, 7 November 2010 (UTC)
What if the editor in question has a lot of edits? (This is a serious question, as asking a contributor with lots of edits to review all of them is something I would be hesitant to do, having tried and failed to do the same for myself). Is spot-checking acceptable along with the assumption that anything missed will be picked up in the course of normal editing of the articles to which contributions were made? Carcharoth (talk) 18:06, 7 November 2010 (UTC)
That's a problem, and I can't answer it. I've reviewed complete contributions for lots of people (though we're far behind :/), and I know how time consuming it is. I would not myself undertake a CCI-scale review for plagiarism issues alone, though I've put plenty of hours into that for copyright concerns. However, we do have the CCI program that we can use to list a contributor's edits if that proves helpful. I guess if there were huge issues, the contributor could do so him or herself as a gesture of good faith. --Moonriddengirl (talk) 18:44, 7 November 2010 (UTC)
  • Risker made an interesting post today at ANI: [11]. If what she says is true -- and I believe it is -- a large part of this community regularly breaks parts of this guideline related to close paraphrasing, which makes me question the notion that the guideline reflects community consensus. It may be that both the guideline and community behaviour need to change. As Moonriddengirl said, above, the prevention of plagiarism -- unlike copyright -- is not a legal requirement, it is a matter for community consensus. It is implausible that community consensus is to criminalise what Risker called "standard editorial practice" in this project. --JN466 03:57, 8 November 2010 (UTC)
  • SlimVirgin has added some useful material to the guideline text. [12] It's a step in the right direction. I've made some copyedits. --JN466 08:54, 8 November 2010 (UTC)
Thank you both. These edits are a huge step forward. Hans Adler 10:15, 8 November 2010 (UTC)
"It is implausible that community consensus is to criminalise what Risker called "standard editorial practice" in this project." If you look at this guideline as of 8 October (as stable version) and read the section How to respond to plagiarism. This guideline did not criminalise anyone. Also see my comments higher up this page on 4 November. -- PBS (talk) 10:47, 8 November 2010 (UTC)
Since copyright violation is a civil offence, not under criminal law, presumably the aim is to civilise standard editorial practice. The guideline looks good, we all benefit from review and careful consideration of how to avoid copyright violation, and how to produce the best articles while avoiding unacceptable plagiarism, even if it's legal and there are genuine differences of opinion about the extent to which using the same phrases is acceptable. . . dave souza, talk 12:36, 8 November 2010 (UTC)
But there is a big difference between a recommendation for best practices or good articles and a "mandatory" guideline for all articles or all writing in WP. For the latter we should not mandate things that do not have a community consensus.--Kmhkmh (talk) 01:37, 9 November 2010 (UTC)
  • A big "hear hear" to the new section. And, JN, I think this is an important reminder. Thank you. :) --Moonriddengirl (talk) 12:49, 8 November 2010 (UTC)
    • Pleasure. Believe me, I am never trying to cause you headaches on purpose. :/ --JN466 04:21, 9 November 2010 (UTC)

Midsection rearrangement

  • My changes: [13]

Much of what was under "Definition" was not definition. In an effort to achieve better clarity and to cut down redundancy, I have restructured this content. I moved one passage into "Why plagiarism is a problem", and I created a separate section entitled "How to avoid plagiarism disputes". I also restructured the section titled "What is not plagiarism". I moved some content to footnotes. (I'm of the opinion that footnote #4 is far too detailed and complex, but didn't want to complicate this by removing the content.) I removed some redundancy about WP:V and WP:NOR, but made plain in the lead text of that section that these recommendations deal with plagiarism only, and not with V or OR. --Moonriddengirl (talk) 12:07, 6 November 2010 (UTC)


Major revision: "Attributing text copied from other sources"

  • My changes: [14]

I have undertaken a major revision of this section, primarily to focus the guideline on the specific function of the guideline (plagiarism), but also to reduce redundancy. These are the changes I have made and why:

  • I have tried to clarify the lead section. I have both strengthened the caution there against copyright problems (in cases where contributors do not know whether content is free or not) by quoting from WP:C but also reduced the redundancy between the bullet formerly marked "Copyright restrictions." and the section "Sources under copyright".
  • I have moved content from "Sources under copyright" into a new section called "Close paraphrasing", because this does not apply only to copyrighted content. I also moved content from that section into a footnote, here. This also does not pertain only to sources under copyright.
  • I have simplified "Sources under copyleft" to focus more tightly on plagiarism. Some of the content was redundant to the lead, particularly the "when in doubt" advice.
  • I have merged the "Copyright expired" and "public domain sources" section, as content that is copyright expired is public domain.
  • I have removed the "Generating articles from other sources" section, as it does not pertain to plagiarism beyond what is already covered in the sections above.
  • I have restructured "Copying within Wikipedia" to focus on the plagiarism issue. What was there previously did not pertain to plagiarism at all, but rather to meeting licensing requirements.

I hope that others will agree that this is an improvement and that we can discuss details of further development. :) --Moonriddengirl (talk) 14:07, 7 November 2010 (UTC)


Close paraphrasing as copyvio

While this guideline deals sensibly with close paraphrasing, if it's too extensive it can become a copyright problem. This doesn't seem to be covered in the current version of Wikipedia:Copyright violations policy. I've proposed an amendment to that policy on its talk page, and would welcome review before editing the policy itself. . . dave souza, talk 17:15, 8 November 2010 (UTC)


In-text attribution without quotation marks is insufficient

The following is a quotation from a New York Times article: "most teenage girls would not be caught dead dancing with their dads."[15] Naively, this information might be represented in an article as "Young women generally regard the prospect of participation in father-daughter dances with disgust.[16]" It's apparent from the context that this claim is an opinion by the author of the article, not a statement of fact for which the New York Times is a reliable source. Thus, an unqualified representation would violate WP:NPOV by suggesting that the NYT article had actually said something like "90% of teenage girls surveyed in [name of study] would not be caught dead dancing with their dads." Neutrality requires in-text attribution of such an off-the-cuff opinion: "According to New York Times reporter Neela Banerjee, young women generally regard the prospect of participation in father-daughter dances with disgust.[17]" Note that in this case, such attribution does not, and should not be construed as, indicating the direct copying of text or close paraphrase. Many articles utilize this form of in-text attribution for the sole purpose of crediting and qualifying ideas and opinion, not indicating verbatim quotation, or paraphrase approaching it. Now, if in-text attribution without quotation marks were actually sufficient to denote the borrowing of text, we could write "According to New York Times reporter Neela Banerjee, most teenage girls would not be caught dead dancing with their dads.[18]" This is a deficient practice, because the attribution provided seems only to credit an opinion, and does not warn readers that the material which follows is directly copied non-free text, which may not be reused under the terms of the Creative Commons license, but only as permitted under fair use law. To prevent the misrepresentation of copyright status, I suggest the removal of the in-text attribution without quotation marks option. 
(While it's true that plagiarism and copyright violations are not identical concepts, plagiarism of a non-free text is usually considered a copyright violation per se -- how can we claim fair use of text if we're not even willing to provide proper credit to its authors -- and is misleading in any case, since readers are entitled to presume that material which is not enclosed in quotation marks or otherwise indicated as a direct quotation is free content.) Peter Karlsen (talk) 19:36, 8 November 2010 (UTC)

I've gone ahead and removed the offending material [19]. In a free content project such as Wikipedia, avoidance of plagiarism through verbatim text copying and close paraphrase serves primarily to avoid the commingling of free and non-free text, warn that non-free text is unavailable under CC-BY-SA 3.0, render fair use claims for non-free text defensible, and make copyright violations, such as grossly excessive quotations, blatantly obvious. An in-text attribution without quotation marks method that Wikipedia utilizes for the purpose of crediting ideas and opinion, where no copying of text or close paraphrase has occurred, isn't acceptable for our needs. Peter Karlsen (talk) 22:47, 8 November 2010 (UTC)

I don't know but after reading this comment, I have a feeling that things are getting a bit out of hand. We need to provide a framework regarding our licencing models and quotation and writing guidelines, that allows regular authors to write somewhat "normally" and we need (as Hans Adler suggested elsewhere) a copyright lawyer to help us make sure that our guideline is as free as possible but still reasonably safe from a legal perspective. In particular any attribution/paraphrasing/quotation style that is common practice and/or regarded as (legally) sufficient in journals/magazines/books/newspapers needs to be possible in WP as well. We should not (and imho cannot) have a rather WP-specific use of quotation marks, indirect speech or close paraphrasing; such an approach is totally impractical for authors, in particular for the rather global and heterogeneous set of authors that en.wp has.--Kmhkmh (talk) 23:38, 8 November 2010 (UTC)

My experience in academia is that in-text attribution without quotation marks isn't an acceptable method to indicate the direct copying of text. I gather that some other contributors believe that it is. But ultimately, the most important issue is the correct standard for Wikipedia. Almost all journals/magazines/books/newspapers are not free content projects. Similarly, most academic papers are written for the benefit of the students creating them, and will be read by few people other than their professors (doctoral dissertations being an obvious exception.) Wikipedia cannot avoid addressing its own unique considerations: WP:NPOV often requires in-text attribution to avoid stating opinions and disputed views as bare facts. It would be ill-advised to ambiguate this method of attribution by simultaneously using it to note the verbatim inclusion of non-free text. Issues of practicality in informing editors of our specific requirements can be addressed by appropriate administrative action: a link to this policy could be added to MediaWiki:Edittools. Peter Karlsen (talk) 00:15, 9 November 2010 (UTC)
Well, this is not academia but WP, and it is the first time that I've heard that indirect speech would not be possible. One of the unique requirements of WP is exactly to have as little formal overhead for authors as possible. We're not running a bureaucracy with specific rules to avoid any conceivable ambiguity in an article.--Kmhkmh (talk) 01:09, 9 November 2010 (UTC)
Indirect speech with some form of in-text attribution is fine, Peter (both on WP and in other publications), so long as it's not too extensive, and the page makes that clear. We can't encourage the quote farm mentality. SlimVirgin talk|contribs 00:29, 9 November 2010 (UTC)
I share your distaste for quote farms. In my view, the appropriate remedy for such problems is comprehensively rewriting source material in one's own words, while substantially retaining its meaning. Our understanding of WP:NOR must give sufficient breathing room for this practice, by not condemning all original forms of expression: only original ideas are forbidden. Removing quotation marks is a cosmetic solution to the more fundamental problem of excessive quotation. Used judiciously, quotation marks, <blockquote>, or similar methods of indicating the inclusion of verbatim text from a source do not cause unacceptable stylistic damage to articles. In a free content project, it is imperative that we minimize the use of non-free text, and properly warn readers where it is present. Because in-text attribution is used to correctly credit opinions for WP:NPOV reasons even when no text is copied or close paraphrasing occurs, this form of attribution is ambiguous, does not provide adequate warning that text is non-free, and encourages the silent formation of quote farms which include far more non-free content than we should have, or is acceptable under WP:NFCC. It is far easier to rid articles of quote farms when they appear as such. Peter Karlsen (talk) 00:46, 9 November 2010 (UTC)
Sometimes people might want to use the precise words without quotation marks, or something very close to them to convey the same thrust. That kind of thing boils down to editorial judgment (and I've never known it to be a problem in academia). The only thing that's required with quotations or very close paraphrasing is a signal in the text--not just in a footnote-- that these words are not your own. But you wouldn't even have to name the source necessarily. For example, "As one writer put it, Susan Boyle's meteoric rise was a triumph for women of a certain age over an otherwise dismissive youth culture." That leaves open whether I'm using that writer's precise words, or a very close paraphrase, and with a sentence like this it doesn't matter, because nothing hangs on it.
Whether to add the name of the writer in-text would depend on a variety of factors, such as whether the sentence is in the lead (in which case it might be best without a name), or whether the writer is well known (in which case it's best with a name, including in the lead), or how distinctive the phrase is (if very distinctive, then probably best with a name). The important thing is to signal in-text that some or all of the words are not your own, and to add the source to the next footnote. SlimVirgin talk|contribs 01:16, 9 November 2010 (UTC)
Phrases such as "as one writer put it" are fairly obvious indications that text has been copied verbatim or closely paraphrased. I'm sure that you understand how to use in-text attribution so that copying really is apparent. However, academically inexperienced contributors (for whom this guideline is intended to be most instructive) will use other forms of in-text attribution, such as "according to", or "x claims that", which creates ambiguity as to whether
  1. We are using the precise words of a source.
  2. We are closely paraphrasing.
  3. We are using information from a source, but our own original expression, and attributing an opinion to the person(s) holding it, as WP:NPOV requires.
If the guideline advises editors to utilize in-text attribution without quotation marks at all, then it is imperative to provide clear and concise guidance regarding which wordings do and do not unambiguously call attention to the presence of non-free text. Peter Karlsen (talk) 01:52, 9 November 2010 (UTC)
As long as we stay legal we don't have to minimize anything in particular, not if it turns into an obstacle for authors. I'm not sure how useful the "warning" of readers is anyhow; most people simply read our articles for the information and care very little for internal and somewhat technical differences in licensing. Moreover, I don't quite see the use of splitting the text into "fair use" and "CC" in practical terms anyhow; at least in practice you cannot remove the fair use part and just use the CC part. You might do that with pictures, but you cannot do that for texts within an article, as the resulting text will often be unreadable or impossible to understand.
Also I'd like to point out again that the emphasis should not be on various format and template options to get around, but that it should be as easy as possible for authors to contribute, requiring as little internal WP knowledge as possible. So the goal here is to find the maximal degree of freedom that still keeps us reasonably safe from legal challenges, and not a minimal degree to be on the super super safe side.--Kmhkmh (talk) 01:30, 9 November 2010 (UTC)
There are many practices that would not interfere with staying legal, but still do not comport with free content ideals. For instance, as Wikipedia is not a for-profit endeavor, images licensed only for non-commercial use could legally be included in articles without restriction. However, the image use policy expressly forbids such a practice, and treats non-commercial only images in the same manner as any other possibly fair use content: they are acceptable only as allowed in WP:NFCC. Similarly, our use of non-free text should not create headaches for downstream reusers. While the interests of Wikipedia scraper sites are of little importance, some people do redistribute our content in valuable ways: they might import it into other online reference projects, then modify it to suit their editorial goals, distribute copies of an article to a class (after correcting any errors), etc. Removal of non-free text, while producing a useful result, is always possible: it merely requires that the non-free material be rewritten in original language. Since some worthwhile content users will have legitimate reasons for effectuating such removals, it is quite important to clearly mark which content is free, and which isn't. Peter Karlsen (talk) 02:18, 9 November 2010 (UTC)
As I said above, the separation of fair use and CC texts is impractical and an unreasonable burden for authors contrary to the image case. The reuse of our material by others is not our primary concern, our primary concern is to provide a free encyclopedia and for that it is essential that authors can contribute easily.--Kmhkmh (talk) 02:26, 9 November 2010 (UTC)
(ec) I agree with Kmhkmh that it's important not to create an obstacle course here. And this page is about plagiarism, not non-free content or copyright, which are related but separable issues. WP is already an obstacle course for new editors (and not only new ones). The aim is to provide commonsense advice to keep editors on the side of the angels, but without making people so nervous that they turn their articles into a list of quotes. SlimVirgin talk|contribs 02:28, 9 November 2010 (UTC)
The most important factor in producing excessive quotation is a rigid interpretation of WP:NOR, which chokes off any original language. People who recognize that it's okay to rewrite material in their own words if the meaning of the reference(s) is retained are unlikely to resort to quote farming. Peter Karlsen (talk) 02:45, 9 November 2010 (UTC)
Peter, there has been a lot of discussion about this in various places over the last few days, and I worry that your edits are taking the guideline in the opposite direction from what seems to be the emerging consensus. The aim is to keep things clear and simple for people.
This, for example, is something I've not seen before, and I don't agree that it's correct: "Additionally, in-text attribution such as "according to", or "x claims that" is not sufficient to indicate the copying of text, since such phrases are frequently used for the sole purpose of crediting claims or opinions to those asserting them." We can't introduce idiosyncratic distinctions that may not be widely recognized or helpful. SlimVirgin talk|contribs 02:47, 9 November 2010 (UTC)
As the song says "Don't worry be happy": "and I worry that your edits are taking the guideline in the opposite direction from what seems to be the emerging consensus" That is really elegant rhetoric SV (I must remember it so I can close paraphrase it in future 8-) ), but I see the changes you have been making as moving away from the established consensus, not the other way around. I don't think that the issues raised in what is now the first section on this talk page "#Phrases 'may' require quotation marks" have been fully discussed. -- PBS (talk) 03:24, 9 November 2010 (UTC)
Correct in-text attribution on Wikipedia (if such a thing is possible) can be explained with neither clarity nor simplicity. Consequently, there was a consensus on this talk page and at WT:NFCC that we should avoid it altogether, and mandate quotation marks, <blockquote>, or a similar method, together with an appropriate reference citation. Nonetheless, if such attribution is allowed, then it needs to be done in a way which unambiguously delineates free and non-free content, a key concern in the discussion at WT:NFCC. Peter Karlsen (talk) 03:16, 9 November 2010 (UTC)
The new section I added recently gained consensus here and in a few discussions elsewhere, and I tried to write it in a measured and balanced way, so I'd really appreciate it if changes to it could be discussed here first. Adding more details, or more exceptions, or more rules is adding confusion to what I think is a clear and helpful piece of guidance.
Peter, this page is not about non-free content. It's about the separable issue of plagiarism. SlimVirgin talk|contribs 03:33, 9 November 2010 (UTC)
Plagiarism is associated with copyright violations and misuse of non-free content on Wikipedia, which is the primary reason we're so concerned with it. If excessive non-free material has been added to an article in a way that doesn't constitute plagiarism, the problem will be obvious, and easily remedied. This guideline should act as a barrier to the silent accumulation of copyright violations, possible only because the non-free content was not correctly marked. This is why editors at WT:NFCC so strongly opposed your changes to the policy. This guideline shouldn't mislead new contributors into citation practices that won't pass muster under WP:NFCC. It would be confusing indeed if the non-free content policy imposed separate, additional requirements concerning the manner in which the copying of text is indicated, a subject which is treated exhaustively here. Peter Karlsen (talk) 04:05, 9 November 2010 (UTC)
SV as you have in your own words added it recently [sic] and someone is editing it, I think it is premature to claim that the edit with no changes has consensus, particularly when another editor has been making changes to it which to date you are the only one to have reverted. I think the sentences "But be cautious when using it, because it can lead to other problems. For example, "According to Professor Susan Jones, human-caused increases in atmospheric carbon dioxide have led to global warming" would be a violation of NPOV, because this is the consensus of many scientists, not only a claim by Jones." If elucidation of this point is needed, I think it would be better made along the lines of the note already in the text in another section "Note: works copied into Wikipedia ..." -- PBS (talk) 04:08, 9 November 2010 (UTC)
The example you give would not be a violation of NPOV but rather citing or paraphrasing the source correctly. Since the source only is mentioning a scientific consensus, hence you cannot use it as citation for such claim. You might argue that your example creates a bias by concealing that there in fact exists something of a scientific consensus, but whether that's the case or not can only be judged by the context of the whole article and not just by this single line.--Kmhkmh (talk) 11:37, 9 November 2010 (UTC)
I think you have misunderstood. The sentences in italics are not my example; they are currently in the guideline and I think they should be removed and replaced. -- PBS (talk) 19:04, 9 November 2010 (UTC)

We are saying in Wikipedia:Plagiarism#How_to_avoid_inadvertent_plagiarism:

  • indirect speech--copying a source's words, or closely paraphrasing, without quotation marks; this also requires in-text attribution and an inline citation. For example:
  • John Smith wrote in The Times that Cottage Cheese for Beginners was the most boring book he had ever read.[4]
Note: even with in-text attribution, distinctive words or phrases may require quotation marks.

Does the last line of this, about "distinctive words or phrases", not address the disputed issue? I would tend to agree with SV that writing -

  • John Smith wrote in The Times that Cottage Cheese for Beginners was the "most boring book [he had] ever read"

is kind of naff, and I doubt that the reusability of Wikipedia would be compromised by the absence of quotation marks here.

When it comes to a phrase like "most teenage girls would not be caught dead dancing with their dads", that is perfectly idiomatic, but has a little more pizzazz, so I personally would be tempted to place it in quotation marks. --JN466 04:16, 9 November 2010 (UTC)

Agreed, and it has to be left to editorial judgment whether the phrase is distinctive enough. We can't legislate here for every eventuality. SlimVirgin talk|contribs 04:20, 9 November 2010 (UTC)

The problem is that "distinctive words or phrases may require quotation marks" will be construed in context with the rest of the guideline. We also say that the following types of material aren't subject to citation requirements for purposes of avoiding plagiarism:

  • Use of common expressions and idioms, including those that are common in various sub-cultures such as academic ones.
  • Phrases that are the simplest and most obvious way to present information. Sentences such as "John Smith was born on 2 February 1900" lack sufficient creativity to require attribution.
...

How is an editor to determine whether a statement, which embodies sufficient creative content that it is not exempted from the guideline under the preceding two rules, is nonetheless sufficiently devoid of creative expression to not warrant the quotation marks that may be required? Peter Karlsen (talk) 04:35, 9 November 2010 (UTC)

Editorial judgment--the same judgment we're using here to write the guideline. SlimVirgin talk|contribs 04:43, 9 November 2010 (UTC)
Perhaps, if it were possible to add an actual instance of such a statement, which does require a citation, but not quotation marks, if attributed in-text, to the guideline. The provided example, however, is a deficient one. Bear in mind that "is the most boring book I've ever read" is a common expression, with 42,300 Google hits, and that using this common expression preceded by the name of the book is "the simplest and most obvious way to present information." Why would you use a sentence which, for plagiarism-avoidance purposes at least, does not require a citation at all, as an example of correct in-text attribution without quotation marks? Peter Karlsen (talk) 04:55, 9 November 2010 (UTC)

Example

Peter, I'd argue that you caused a sourcing issue in Purity ball by introducing what's almost OR [20] in an effort to avoid using the source's words, even though there was in-text attribution.

  • Source (Betsy Hart): "Besides, I can't help but wonder if a single-minded focus on virginity is an ironic, and unintended way, of sexualizing youth in a different way."
  • Wikipedia: "Conservative journalist Betsy Hart, while supporting the idea of sexual abstinence prior to marriage, has expressed concerns that the strong focus of purity balls on the concept of virginity may actually sexualize youth, albeit in an unintended way."
  • Peter (with edit summary "rewrite material which treaded dangerously near close paraphrase of the source ..."): "Conservative journalist Betsy Hart, while supporting the idea of sexual abstinence prior to marriage, has expressed concerns that purity balls are pervaded by an obsessive preoccupation with physical chastity which may inadvertently instill erotic feelings in the girls attending them, to the detriment of the psychological celibacy which she believes is required by the tenets of the Christian faith."

I can't see where Hart here discusses obsessive preoccupation, the instilling of erotic feelings, or the psychological celibacy she believes is required. I think she might agree with you, but I also think you slightly misrepresented her point, and all in an effort to avoid words that were fine, because clearly attributed. SlimVirgin talk|contribs 04:59, 9 November 2010 (UTC)

Hart doesn't use "pervaded by an obsessive preoccupation" -- she uses "the strong focus". She doesn't discuss "instilling of erotic feelings", but does decry things that "sexualize". "The psychological celibacy which she believes is required by the tenets of the Christian faith" is not mentioned, but later in the source, in the context of discussing the Christian religion, this is: "But what if that same virginal girl has a heart full of bitterness, envy, lust, greed? Would her dad still be proud? Would she? Should they be?"[21] You've unwittingly provided a perfect example of how an excessively rigid interpretation of WP:NOR can squelch original expression, because there's no linguistic similarity between the reference and the text of the article, even though the ideas and information from the source have been represented with substantial accuracy. If we insist on perfect accuracy, then we cannot rewrite at all -- but it's obvious where the Gavin.collins' approach to WP:NOR would lead us. Peter Karlsen (talk) 05:14, 9 November 2010 (UTC)
But being pure of heart in the sense of not bitter and greedy is not the same thing as psychological celibacy. Erotic feelings are nowhere mentioned. And focus is not necessarily obsession. The point I'm making is that, in bending over backwards to avoid the source's own words, or even a close paraphrase, you introduced ideas the source had not mentioned. This is not a rigid application of NOR--you have actually changed what the source said.
The important point here is that it wasn't necessary. Her own words (including a close paraphrase) were fine because they were attributed to her in the text. That's allowed on WP, and it's standard practice outside WP too. SlimVirgin talk|contribs 05:24, 9 November 2010 (UTC)
Are you sure that "erotic feelings are nowhere mentioned"? I may be missing something terribly important, but is lust not an erotic feeling? We deal not with a focus, but a "single-minded focus". Now, one could argue that even that does not rise to the level of obsession, and "pervaded by a preoccupation" would be a better representation of the source. I've changed the article to that effect. If an editor undertakes to rewrite material in their own words, then sooner or later they will make a mistake (or something that someone might consider to be one.) If one cannot ever err, even slightly, in rewriting source material, then one cannot rephrase, period. While the article wouldn't become a copyvio just because Hart was directly quoted, an article composed entirely of quotations or close paraphrase is a copyright violation. Thus, an inability to rephrase in original language (because other editors will pick one's work apart, and say that it's original research because it wasn't done just right) prevents free content articles from being written. Peter Karlsen (talk) 05:41, 9 November 2010 (UTC)
It's true that my rephrasing doesn't cover every point that Hart makes -- nor can any editor's efforts to rewrite material ever represent each and every fact. What of bitterness, envy, and greed? It's not there in my rephrasing. Perhaps it should be added. What I would like to call attention to, however, is that someone with a "heart full of... lust" verifiably, and with no original research required, lacks "psychological celibacy" -- yet you assailed this creative and novel rephrasing because it lacks a superficial resemblance to the source text, despite dealing with exactly the same idea. That is precisely the interpretation of WP:NOR that has gotten us into this copyright mess. Peter Karlsen (talk) 06:09, 9 November 2010 (UTC)
No, what I'm arguing is much simpler, namely that you took a sentence that was attributed in-text and rewrote it needlessly. And by doing that you introduced slightly different ideas. But even ignoring that point, the change was still unnecessary, because the paraphrasing with its attribution was fine as it was.
The basic point I'm trying to make here is that we have to keep this guideline simple and clear, and not be encouraging editors to reinvent the wheel, or be terrified that using seven words out of a whole article, even with in-text attribution, is going to land them in trouble, because it won't. SlimVirgin talk|contribs 06:31, 9 November 2010 (UTC)
For any given piece of text, one could argue: rewriting in original language will introduce slightly different ideas, because only the exact words of the author perfectly represent their intent, so why not quote or closely paraphrase? We agree that some quotation is acceptable, despite a disagreement on exactly how it should be attributed. However, when this attitude is extended to an entire article, when we end up with nothing but close paraphrase, then a copyright violation results. I've identified many such problems in articles, such as this one. I believe that over half of purity ball is now text that I personally wrote. Had I been so terrified of WP:NOR that I closely paraphrased everything, it would also be in need of revision deletion, in-text attribution or not. You've expressed concerns over quote farms, and proposed in-text attribution without quotation marks as a solution. However, high quality, free content articles use quotations and close paraphrase sparingly to begin with, and are little in need of avoiding the stylistic interruption of quotation marks where non-free content is present. Peter Karlsen (talk) 07:50, 9 November 2010 (UTC)
You are arguing again under the (imho fictitious) assumption that we are under the obligation to provide a clear formal marker between "fair use" (quotes, properly attributed close paraphrasing) and "pure" CC text parts, and hence that we need to require our authors/contributors to apply such a marker. It is the first time that I'm reading such an argument at all. Could you please point out any guidelines or older discussion where that was covered and/or where people actually see a need for that? Otherwise you're simply introducing a new aspect that you personally care about, but where it is not at all clear why it should matter for WP. Also I'd like to see a concrete usage scenario that really legally requires the removal of "fair use" texts.--Kmhkmh (talk) 11:58, 9 November 2010 (UTC)
The need for "clear formal marker between "fair use" (quotes, properly attributed close paraphrasing) and "pure" CC text parts" was discussed at WP:NFCC, where there was an overwhelming consensus for my addition of a delineation requirement to the policy, and clear opposition to SlimVirgin's attempts to water it down using "with quotation marks if appropriate" language, just as she is doing here. Brevity in the use of non-free text is required by WP:NFCC, Wikipedia:Non-free content, and Wikimedia Foundation policy regarding non-free material. Peter Karlsen (talk) 12:49, 9 November 2010 (UTC)
I don't quite agree with your view on that discussion. It is rather recent, possibly not even closed, with a fairly limited number of contributors, and I don't really see the "overwhelming" agreement either. Moreover, we should not be mandating specific writing styles for which there is neither a legal requirement nor a direct necessity from our primary project goal (free encyclopedia) without getting some sanction/feedback from the community at large.--Kmhkmh (talk) 13:21, 9 November 2010 (UTC)
  • I agree with SlimVirgin that the rewrite was unnecessary and skewed what the source said. "Sexualising youth" is not the same as instilling "erotic feelings" in them, and "psychological celibacy" is not the same as not having "a heart full of bitterness, envy, lust, greed". This is an important conversation to have, for I don't think that the original wording of the article fell foul of, or should fall foul of, our policies and guidelines. Similarly, I am also concerned that this policy change may lead to article wordings being impugned whose "substantial similarity" to a cited source is due to the fact that they present some of the same (non-copyrightable) facts as the cited source, rather than copying creative expressions used to describe these facts. These matters are important, but we have to get them right, or else we will cause a lot of misery and disillusionment in the community. --JN466 18:00, 9 November 2010 (UTC)
  • That's not a change in policy; that's merely updating one policy to reflect what's both already in the main policy (WP:C) and in practice. As I noted to you at that conversation, WP:C says, "Note that copyright law governs the creative expression of ideas, not the ideas or information themselves. Therefore, it is legal to read an encyclopedia article or other work, reformulate the concepts in your own words, and submit it to Wikipedia, so long as you do not follow the source too closely. (See our Copyright FAQ for more on how much reformulation may be necessary as well as the distinction between summary and abridgment.) However, it would still be unethical (but not illegal) to do so without citing the original as a reference." It's a simple fact that close paraphrasing can violate copyright. --Moonriddengirl (talk) 20:56, 9 November 2010 (UTC)
Jayen, if you don't believe that sexualization involves the instillation of erotic feelings, then how would you explain the meaning of the term, in your own words, without recourse to a definition which begs the question, such as this? How would you express the opposite of having "a heart full of... lust"? This is an important issue, because writing ideas and information from a source in original language will always result in some shade of meaning being lost in the translation. If we insist that this is unacceptable, then writing free content articles, using reliable sources which are still under copyright, becomes impossible, due to the inability to ever use original language. WP:NOR is not intended as a means of obstructing Wikipedia's free content and encyclopaedic goals. Original research should not be understood as writing which does not represent the sources on which it is based with absolute perfection, but as material which cannot be supported by sources in any substantial way, at least not without the significant synthesis of ideas and information (not words and text.) Use of Wikipedia to promulgate crackpot physics theories, declarations that people are criminals based on an editor's belief that their (sourced) actions did not comport with the (sourced) requirements of the law, etc, are all original research. Explaining a phenomenon, which a source has described as sexualization, as instilling erotic feelings is not, since the direct, unextrapolated meaning of the reference has been substantially retained. Peter Karlsen (talk) 00:32, 10 November 2010 (UTC)
I genuinely understood "sexualizing" differently when I read her piece. I read it as implying that young girls would be seen - by others as much as by themselves - as beings whose sexuality was their main characteristic, overshadowing all other aspects of their personalities. I do not believe she meant that young girls would have erotic feelings. The other sentence, speaking of "a heart full of bitterness, envy, lust, greed", contrasted purely external "correct" behaviour with an inward spiritual state from which correct behaviour flows naturally, making the point that external conformism cannot be a substitute for having a pure heart. I see lust here as just one of various cardinal sins that may be in a person's heart, even though the person's outward behaviour may give the appearance of piety, and do not understand it as harking back to the earlier mention of "sexualising".
My concern is the same as yours, that no policy should obstruct Wikipedia's free content and encyclopaedic goals, including the goal to accurately describe the sum of human knowledge, but where you see threats from NOR policy, I see a threat from overly restrictive copyright and plagiarism policies. I genuinely do not believe that the original article wording, above, presented a valid copyright, plagiarism, or close paraphrasing problem. If editors are told that they must not recount facts they found in a source because they cannot do so without using some of the same words as the source, that interferes with Wikipedia's core mission. --JN466 02:33, 10 November 2010 (UTC)
Any source relating nuanced material and opinion, rather than bare facts, could probably be interpreted in several slightly different ways. However, rewriting reference material in original language requires that an editor select a definitive construction, since it is very difficult to describe ideas and information if one isn't sure of what they are (though doubtless many users have tried.) Once such a choice is made, it's open to critique by editors whose understanding of the source was a bit different. Direct quotation or close paraphrase might seem like an easy way to avoid such problems. For any single piece of information, it will. But when entire articles are written using this method, copyright violations result.
In a healthy editorial environment, any slightly biased interpretation of sources by an editor initially adding original language -- myself, in the case of purity ball -- is corrected by subsequent contributors, who collectively move the text in a neutral direction while maintaining its free content status. Genuinely libelous or other highly WP:BLP-violating material, such as "X was arrested and charged with public drunkenness", when no reliable sources reported the arrest or charges, would involve statements of plain facts, and will almost certainly never enter articles solely due to an editor's choice of legitimate interpretations of a reference text. A recent, disturbing trend, however, is to attempt to remedy any perceived bias in original language by adopting the exact words of the reference, or closely paraphrasing it. For a particularly problematic or highly disputed issue, this might be the best solution, provided that correct attribution is applied. But when used for the bulk of content, quotation and close paraphrase are unacceptably non-free, and may easily constitute copyright infringement.
This guideline, Wikipedia:Copyright violations, and WP:NFCC do not instruct contributors that they must not use any words from the source. It is only when the language copied embodies creative content that this even becomes an issue, and even then only when the copying is either excessive or incorrectly cited. However, articles composed entirely of copied non-creative expression would inevitably be of the most banal and inadequate variety. Could we write "A train crashed near Mumbai. Fifty people were injured." even if the words were lifted straight from different parts of the source, without the attribution of quotations, or any copyright concerns? Absolutely. But Wikipedia should not be reduced to this level of pabulum. Peter Karlsen (talk) 04:03, 10 November 2010 (UTC)
The article I had in mind here was 2007 Samjhauta Express bombings, a featured article (not written by me), which was discussed here on this talk page at the time this was adopted as a guideline. If I remember correctly, editors (me included) found that the article followed the source wording unnecessarily closely at times. When the bulk of an article's content copies the creative (as opposed to factual) language of its sources directly, or through very close paraphrase, and/or copies the creatively defined structure and presentation of the underlying source, we have a problem. On the other hand, if you look at the 2007 Samjhauta Express bombings FA and its sources (which might not be a bad idea, as there may still be some unnecessarily close paraphrases in there), you'll also see that the facts are of such a nature that it's often not possible to craft a description of the facts that completely avoids what another editor might consider close paraphrasing. Perhaps we should give some examples on a page somewhere to help editors identify the difference between factual similarity and similarity in creative expression, just like we give examples of WP:SYN. --JN466 12:14, 10 November 2010 (UTC)
Does this edit address your concerns regarding accurate representation of Hart's opinion of purity balls? Peter Karlsen (talk) 02:07, 11 November 2010 (UTC)
I wouldn't have written it in that style, but I think it more accurately describes what she meant, yes. The phrase "young women generally regard the prospect of participation in father-daughter dances with disgust" still worries me; it seems a long way from "most teenage girls would not be caught dead dancing with their dads". It is surely not so much a matter of disgust as it is a matter of being seen to be doing something profoundly uncool. Teenagers try to separate from their parents and establish their own independent identity; many teenagers simply don't like to be seen out and about with their parents, whether it is going shopping, going to a cinema or restaurant, or dancing. --JN466 01:48, 13 November 2010 (UTC)
We need to avoid original research through projecting our own understanding of typical teenage behaviour onto a source. The term "sexualize" is used both in the sense of altering internal psychology, and changing social views, so we might say that either interpretation is a valid restatement. But there's nothing at all in "most teenage girls would not be caught dead dancing with their dads" that actually describes the avoidance of strongly negative social perceptions. Invariably, "wouldn't be caught dead" is used in the sense of personal conduct and belief, regardless of the reasons. To draw an inference as to what those reasons are, based on information external to the reference, seems to veer into OR synthesis territory, unless the context in which the phrase is used permits only one reasonable interpretation. But in this case, the NYT writer might easily be referring to a distaste for participation in activities which are generally considered to be romantic, such as social dance, over and above any views of more innocuous father-daughter socialization. We simply don't know why the writer claimed that young women "wouldn't be caught dead" doing this - all we really have to work with is the meaning of the phrase itself. "Wouldn't be caught dead" denotes more than mere unwillingness, but also a sense of revulsion at the whole matter. Disgust is a fair representation of the attitude that the writer is describing young women as having towards father-daughter dances. Peter Karlsen (talk) 23:34, 13 November 2010 (UTC)
I am not sure I agree. "Wouldn't be caught dead" means one would not like being caught by others in the process of doing something; it is about fears that one will be seen by others in a certain way because of something one does. It is about embarrassment. If a girl says, "I wouldn't be caught dead wearing that dress", this does not imply that she considers wearing the dress a disgusting experience, it implies that she fears others will think her badly dressed. It's about how one is going to be perceived. This is an interesting discussion, because it exemplifies the different interpretations editors can come to, and reflect in their rephrasing. In this case, I would rather quote the original writer verbatim, especially since she is using a very idiomatic expression. --JN466 01:13, 14 November 2010 (UTC)
You're right, according to the definition you cite, but not all reliable sources agree [22]. I've revised the article based on the common factor from the two definitions [23]. This sort of issue, in which editors will interpret the same term differently because it is employed in multiple senses, and its usage is not exactly a model of clarity, is why it is advisable to have multiple users actively working on an article, not a justification for direct quotation of all statements of opinion (which would present a copyright problem on any article that primarily represents attributed opinion, rather than bare facts.) Peter Karlsen (talk) 02:04, 14 November 2010 (UTC)
By the way, I've qualified the edit to the copyright violation policy with which you were concerned [24] to avoid misinterpretations, such as your claim that the policy would forbid editors to "present some of the same (non-copyrightable) facts as the cited source, rather than copying creative expressions used to describe these facts." Peter Karlsen (talk) 00:48, 10 November 2010 (UTC)
Thanks, that is much appreciated, though I believe we still need to go further and mention the concept of "creative expression". Saying that a "train crashed near Mumbai" will have a linguistic similarity to a source saying that a "train crashed near Mumbai", but despite the linguistic similarity, there are no creative expressions involved. --JN466 02:23, 10 November 2010 (UTC)
That's been done now. [25] --JN466 12:28, 10 November 2010 (UTC)
As far as I understand, Nimmer says that paraphrasing can rise to the level of substantial similarity if the fundamental essence or structure of one work is duplicated in another. This is not what happens when a Wikipedian closely paraphrases a line, or even a paragraph, from a book: they are not duplicating the fundamental essence or structure of a work here. By not presenting such qualifications, but merely saying that "close paraphrasing may constitute copyright infringement", we risk a lot of Wikipedians being impugned unjustifiably, and we conflate the problem of real plagiarism and copyright infringement in Wikipedia with legitimate forms of editorial practice. This topic is high in the community's consciousness right now, and I can't imagine many things that would have a greater corrosive effect on this project than a spate of ill-founded accusations of plagiarism and copyright violation directed against editors. We will have people leaving in droves. --JN466 22:56, 9 November 2010 (UTC)
You are misunderstanding; I have replied to you in more depth at the policy page. --Moonriddengirl (talk) 23:43, 9 November 2010 (UTC)
Okay, let's carry on that conversation over there, so it remains in one place only. --JN466 02:19, 10 November 2010 (UTC)
  • This is a helpful addition. --JN466 15:41, 10 November 2010 (UTC)
    • I'll agree that edit is helpful in introducing a little more nuance to the topic. I'm a little confused as to the basis for this discussion between PK and SV. It seems to me that if you are using someone else's exact words, quote marks would just be a courtesy you normally extended, to indicate these are words you didn't think of yourself. I'm not clear on how it's aesthetically displeasing, though I do appreciate some of the argument re: naturally flowing text. But if the words are not your own, do they flow naturally anyway? If en:wiki ends up as a quote farm, then so be it. Paraphrasing is a different matter, acceptable IMO unless you make the editorial decision to preserve a distinctive phrasing, which to my mind you should more directly attribute, i.e. quotes. I do see the desire to write a readable article, but it's more complicated than "always use quotes" or "don't worry about it". Franamax (talk) 22:37, 11 November 2010 (UTC)

Source of the article : Wikipedia
