
Wikipedia talk:Wikipedia Signpost/Single/2012-04-23


Comments

The following is an automatically generated compilation of all talk pages for the Signpost issue dated 2012-04-23. For general Signpost discussion, see Wikipedia talk:Signpost.

  • I don't know much about the Rich Farmbrough case or its history, but when someone suggests retroactively changing the case, I get concerned. What possible reason could there be to suggest such a thing? If there's a serious problem, you {{cblank}} it. If there's only a minor problem, you ignore it. I'm just trying to make sense of this. Can anyone else enlighten me? --NYKevin @081, i.e. 00:56, 25 April 2012 (UTC)

Wikipedia talk:Wikipedia Signpost/2012-04-23/Featured content

Investigative report: Spin doctors spin Jimmy's "bright line"

Too defensive

I'm sorry, but this draft article itself looks like PR spin. It's also a very one-sided reflection of the discussions at CREWE ... and there is nothing about how best to address such errors as Wikipedia does contain, as we all know, whether it's in 60% of company articles or 40%. Overly defensive, and transparently so. --JN466 15:55, 23 April 2012 (UTC)

This passage, "excluding the don't knows, which boosts the factual error rate to 60% (column A in the table; 406/273) raises difficult issues. Including the don't knows would yield 41% (column B; 406/[310 + 273]). This problematic calculation was independently confirmed by Tilman Bayer (HaeB) of Wikimedia's communications team, who has postgraduate degrees in mathematics and co-edits the Wikimedia Research Newsletter." is risible. Are we trying to blind people with science? This is basic numeracy. What does "including the don't knows" mean? It means that we assume that every one of these don't-knows would find upon scrutiny of their article that there isn't a single error in it. Is that a reasonable assumption, or does it possibly look a bit self-serving? Is that why we are selling it with an appeal to authority?
The argument that the "real" error rate would have to be somewhere between 60% and 41% is likewise mathematically unsound. As we've seen, for it to be at 41% would require that every single don't know converts into a no-error-found response if they sit down to check the article in detail. For the upper bound you would have to consider the opposite extreme, i.e. that each of the don't-knows does find at least one error in the article, once they sit down with it and check it in detail. That would add the don't-know responses to the yes responses: 406+310/406+310+273, for an overall error rate of 72%. So the mathematically possible range is 41% to 72%, not 41% to 60%. And much the most likely scenario really is that the don't-knows would break down 60/40 as well, following roughly the same pattern as the others who were certain one way or the other. 60%, rather than 72% or 41%, is the best estimate for the number of articles PR professionals consider to contain an error. --JN466 16:28, 23 April 2012 (UTC)
I tend to agree that the tone should change -- even the first sentence, "misleadingly" should be changed to something like "alarming" or "sensational," rather than passing judgement right away. Also, it should be clearer that the actual numbers/stats are actually from DiStaso's report, either by italicizing or otherwise. It's too easy to miss the quotes. And actually, I would say the press coverage has been "modest" since the biggies (NY Times, WSJ, CNET, ZD Net, etc) have NOT taken the bait on this story. In fact, I was called by some reporters asking me if I'd seen it, and whether it smelled right. And to their credit, I told them their instinct was right -- the report was very flawed and the headline too sensational. Notice that most of the repeated reports are from a news wire story from India, and not the main ones (AP, Reuters, etc). -- Fuzheado | Talk 16:32, 23 April 2012 (UTC)
Marcia's study smells a lot better than this article. --JN466 16:57, 23 April 2012 (UTC)
Frankly, if she allows her PR agent to say "Survey finds majority of Wikipedia entries contain factual errors" then she should have better evidence of this than a survey of 989 PR professionals, 41% of whom say that they believe that they can find an error in an article they have an interest in, 28% saying that they believe that they can't find an error, and the rest saying that they don't know if they can find an error. Even the 41% is suspect - since they only believe that they can find an error, rather than actually finding one. But I've had my say in the article above, DiStaso has had her say in her article. It's time to let the readers decide. Smallbones (talk) 17:14, 23 April 2012 (UTC)
(e.c.) JN466, first, if the researchers had wanted to avoid creating a pool of 25% of respondents who said they didn't know whether there were factual inaccuracies in the relevant article, why give them a don't know option? It would be normal research practice to ask them to actually read the article and answer yes or no, and preferably to specify what was inaccurate. Because opting out was specifically allowed, there's a void in the data, and the black hole cannot be simply excluded from the sample, because then it's impossible to know the meaning of the result. Second, if the college's press release hadn't begun the fiction that 60% of WP articles have factual errors, journalists alone could be blamed for the hysterical interpretation; but this is not the case. Tony (talk) 17:20, 23 April 2012 (UTC)
It's not a fiction, it's what 60% of the people who had checked the article for their company or client said. You can't just come along and massage the figures by ascribing whatever result you like to the 25% who weren't prepared to comment. If that is not clear to you, I really cannot help you. Why enable a don't know option? Because respondents might not have time to check one way or the other. A don't know option is standard survey technique, and you are trying to make it into something nefarious. Beyond that, the study describes what sort of thing people found to be inaccurate (p. 9, top). JN466 19:01, 23 April 2012 (UTC)
You're right, the don't know option was reasonable, but this is a strawman argument - that option wasn't the problem here, rather, the problem was trying to draw too many conclusions from incomplete data. Regards, HaeB (talk) 01:26, 24 April 2012 (UTC)
I am quoting from the paper, and this article: "When asked if there are currently factual errors on their company or client’s Wikipedia articles, ..." They weren't asked if they "believe they can find one in an article they have an interest in". The wording you're using is neither in the original study, nor for that matter in this article. Are you now just making things up? Lastly, 60% of those familiar with their article said there were such errors. This is the result of the survey. It appears to wound you in some way to hear this, but that is what respondents said. I see nothing whatsoever wrong with saying "Survey finds majority of Wikipedia entries contain factual errors". It is one survey -- another survey using a different demographic might have a different result -- but this survey most assuredly had this result. Don't flatter yourself that there is any parity between this effort here and DiStaso's study. You wouldn't be able to get this piece published anywhere, other than by self-publishing it in Wikipedia. JN466 19:01, 23 April 2012 (UTC)
"The wording you're using is neither in the original study, nor for that matter in this article. Are you now just making things up?" - In his comment Smallbones clearly marked the quotations, and this passage was not among them, so the rather extreme insinuation that Smallbones was fabricating quotes is baseless. Apart from that, it is entirely reasonable to assume that if a respondent replies yes to the question if there are currently factual errors in a certain article, then this respondent believes they can find (and name) such an error, and vice versa. However, a respondent saying so is quite different from actual naming an alleged error, and that in turn is quite different from the researcher confirming the claimed error.
"I see nothing whatsoever wrong with saying "Survey finds majority of Wikipedia entries contain factual errors" - well, then it really seems you got carried away in your zeal regarding this topic. Not even DiStaso' paper itself makes this claim (for one, the survey only concerned article about companies, a tiny part of "Wikipedia entries" in general).
Regards, HaeB (talk) 01:26, 24 April 2012 (UTC)
  • JN466, I am not the author of this article, but since you are attacking a passage where I am mentioned by name as "risible", can I ask if you really have serious objections to my wording that it refers to ("Survey: 41% of PR professionals with Wikipedia article on their company/client say it contains errors")? If not, I would like to request you to be a bit more considerate and specific when wording your emotional outbursts. BTW, the appeal to authority was the author's idea alone, and actually I don't think it is needed here.
  • "What does 'including the don't knows' mean? It means that we assume that every one of these don't-knows would find upon scrutiny of their article that there isn't a single error in it. Is that a reasonable assumption, or does it possibly look a bit self-serving?" - This seems another strawman argument to me, as I don't read Tony's wording as making that assumption at all. I merely read it as demonstrating that the "60% of company articles contain errors" claim relies on an unfounded assumption, by showing that if one replacing it with another assumption (which would also be unfounded, as you are correct in pointing out), one arrives at a very different number. Tony clearly states that "The true percentage" cannot be derived from the the data, yet you are forcefully accusing him of making such a conclusion. While his conjecture that "no doubt [it] lies somewhere between these two values" is considerably weaker than the 60% claim in the press release bearing DiStaso's name as contact, it's still conceivable that the true value lies outside, say 20% or 80% (because of other unproven assumptions not addressed in that passage). And yes, if one actually proceeds on the unproven assumption that all the error claims and error-free claims can verified, disregards the sampling biases, etc., the upper bound would eventually be above 60%. Summarily, I think Tony should remove that sentence - even if if would be made more precise regarding the two limits, it would still buy too much into DiStaso's (press release) assumptions.
  • "And much the most likely scenario really is that the don't-knows would break down 60/40 as well" - such waffling, handwaving guesswork and glossing over unproven assumptions may be acceptable in informal conversations, but not in serious research and not when arriving at numbers for the use in headlines. By the way, it is much harder to arrive at certainty that an article is entirely error-free (if it is), than finding at least one error (if it does contain errors). It is entirely conceivable that the don't-knows are largely people who have read through the article and have found no errors, but didn't have the time to fact-check whether the Ecuadorian branch of their company opened in 1953 or 1954.
Regards, HaeB (talk) 01:26, 24 April 2012 (UTC)
Don't knows are defined as people who had an article, but did not know it well enough at the time to say. This is essentially the same as a population of 989 boxes containing white or black marbles, of which we have opened 679, and found black and white marbles at a proportion of 60 : 40. I am sure you know the relevant sampling distribution; the sampling result obtained from opening the first 679 boxes is an unbiased estimator of the proportion to be found in the population as a whole, including the remaining 310, as yet unopened boxes.
This is sampling basics: if you observe a male / female births proportion of 51 / 49 etc. while sampling 10% of all births in the population, you hypothesise that that male / female ratio is characteristic of births in general. What you do not do is crucify a researcher who reports this proportion by saying, Aaaaah, you only sampled 10%, so the real male birth rate is 5.1%, not 51%! The other 90% which you did not sample might all have been girls! It would be nonsense to say that the researcher "boosted" the proportion of male births to 51% by omitting the other births.
Yet this is exactly what the text I criticised above said. It spoke of a "factual error rate", not of a proportion of respondents. By the way, the statement you made on Twitter is equivalent to saying, "5.1% of children born were found to be male!" Yes, it is technically correct, because in 90% of births we did not determine the sex, and thus cannot say anything one way or another, but it is still profoundly misleading as an indicator of the relative frequency of male births.
And while we are arguing about the numbers – errors in 41% of articles? 60%? – there is no acknowledgement of the errors we do have. There is just an effort to make the problem go away, deny that it exists, and not to engage with it. Regards. --JN466 02:25, 24 April 2012 (UTC)
I am struggling to find a polite way of saying it, but these claims ignore some of the fundamentals of statistics. Basically, you seem to be denying that sampling bias could ever exist. The marble example you describe relies on the assumption that boxes are opened in a truly random fashion, or to put it differently, that the inclusion of a box in the sample is completely independent from its content. As discussed above, there is no basis for making the same kind of assumption here (and it is actually quite plausible that an error-free article might have a higher likelihood of generating a "don't know" response). And while the birds example is probably closer to the marbles example than to the present case, an ornithologist who publishes such a conclusion based on, say, a 10% sample drawn from observation only during a particular time of the year might very well become a victim of sampling bias, cf. [1] and Trivers–Willard hypothesis.
"there is no acknowledgement of the errors we do have. There is just an effort to make the problem go away, deny that it exists, and not to engage with it" - I haven't seen anyone claiming that the error rate is 0%. Of course you are right that such problems exist, but Wikipedians have been acknowledging and discussing them since the project's inception, and they can't be a reason to refrain from criticizing a flawed methodology.
Regards, HaeB (talk) 04:25, 24 April 2012 (UTC)
Nothing against ornithologists, but you do realise I was talking about births above, not birds? ;) Sampling bias is bound to have occurred, as the respondents were self-selected (like Wikipedia editors flocking to an article), and this should be said -- "People who have found errors in their Wikipedia article in the past may have been more likely to have had an interest in participating" – but that's about it. Note that some of these people would be among those who said there are not currently any errors in the article, because, unlike the don't knows unfamiliar with their articles, they had fixed them. We are in happy agreement that the sentence should go; let's leave it at that. Regards. --JN466 10:38, 24 April 2012 (UTC)
Right, birds was just the example I happened to be aware of for this phenomenon (of course I was aware you were talking about human boys and girls; not that it matters). I'm glad we now agree about the possibility of sampling bias, but as discussed above, there are several other sources besides that one. The sentence has been modified to one to which my original concerns no longer apply; if you can live with it too and your other objections have been addressed, that's great. And again, let me state that it is of course legitimate to point to errors in certain kinds of articles, or to advocate for a certain position regarding paid/COI editing. However, I feel it is also legitimate to point to methodological shortcomings in a study about Wikipedia without having to embark on discussions about those other topics. Regards, HaeB (talk) 11:18, 24 April 2012 (UTC)
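(To make the sampling-bias point above concrete, a toy simulation; every probability in it is invented for the example, and nothing is known about how the survey's don't-knows actually behaved. It shows that if error-free articles are more likely to draw a "don't know" response, the error rate among those who did answer yes or no overstates the true rate.)

```python
# Toy simulation only: none of these probabilities come from the survey; they are
# assumptions chosen purely to illustrate the mechanism of sampling bias.
import random

random.seed(0)

TRUE_ERROR_RATE = 0.45   # assumed true share of articles containing an error
P_DK_IF_ERROR = 0.15     # respondents with a flawed article rarely answer "don't know"
P_DK_IF_CLEAN = 0.45     # respondents with a clean article often answer "don't know"

yes = no = dont_know = 0
for _ in range(100_000):
    has_error = random.random() < TRUE_ERROR_RATE
    if random.random() < (P_DK_IF_ERROR if has_error else P_DK_IF_CLEAN):
        dont_know += 1
    elif has_error:
        yes += 1
    else:
        no += 1

print(f"assumed true error rate:       {TRUE_ERROR_RATE:.0%}")
print(f"error rate among yes/no only:  {yes / (yes + no):.0%}")  # comes out around 56%
```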
Interesting example, too. The sentence I do have a problem with is "Including the don't knows would yield 41% (406 / [406 + 310 + 273])." I see that as ethically far worse than what the article is accusing DiStaso of, because while the 60% might plausibly be an accurate value for the entire population included in the study, if the "don't knows" were found to break down in the same way as the rest of the population, the 41% most definitely is not. There is no plausible scenario whatsoever in which every single "don't know" would become a "no error found" if they sat down and checked their article. Regards. --JN466 11:50, 24 April 2012 (UTC)
  • My problem with DiStaso's paper is not the work itself, which we all know is a transparent attempt at leveraging Wikipedia for the advantage of her collective "clients", so I'm not surprised that the majors have taken a flier on this as a news story. There is some merit in what she says, because many of our articles are pretty atrocious and biased in favour of blowing up controversies, but that may be a bit of a red herring for the purposes of this discussion. DiStaso's study has been specifically constructed to further her organisation's objectives; it has been glossed and released to the media with the typical PR spin on statistics of the kind that appalled Benjamin Disraeli. Perhaps we ought not to feed the smouldering cinders with oxygen. --Ohconfucius ¡digame! 02:50, 24 April 2012 (UTC)
  • Agreed with Ohconfucius. The survey was self selected: the conversation can stop there (but obviously it didn't); the number is meaningless. PR people don't generally like, agree with or even understand our content policies. Most print newspapers didn't run the story; those that did would probably be put to better use lining kitty litter boxes. Many PR people have either seen Wikipedia take a buzzsaw to their carefully crafted promotions, or seen it happen to other PR people. WP:Assume good faith doesn't apply to people outside Wikipedia who see themselves as Wikipedia's opponents or competitors; the more rational expectation is opposition or competition. - Dank (push to talk) 12:41, 24 April 2012 (UTC)

Bright line

Re: "a reference to the boundary that people with a conflict of interest in a topic should not cross under site policy." Jimmy Wales' rule for 'no direct editing' is not policy. COI only prohibits edits that promote an editor's interests above Wikipedia's; otherwise, it permits at least uncontroversial changes such as to spelling, grammar, statistics, etc. Also, COI is not a policy, it's a guideline. Ocaasi t | c 16:05, 23 April 2012 (UTC)

  • Yup; I tweaked the wording. Thanks. Tony (talk) 16:22, 23 April 2012 (UTC)
It's worth pointing out that CREWE/PRSA/IPR/etc. are advocating against a rule that doesn't exist, asking an authority to change it where there is none. The stated objective of the report is literally un-achievable, because they are asking to "change a rule," but what that really implies is the eradication of a mere idea. I have to assume the bright line's status as "not policy" has been explained to CREWE plenty of times already, so why are they so aggressive about continuing a campaign based on knowingly false information? User:King4057 (COI Disclosure on User Page) 01:17, 25 April 2012 (UTC)

2010 study reference

You question Marcia DiStaso's credibility as a researcher by writing the following: "The Signpost notes that in a 2010 article in the same scholarly journal, the starting point contain hints that companies are doing it hard on the English Wikipedia: ..." I would like to point out that Marcia DiStaso's and my 2010 study (she did not write it by herself by the way) is wrongly used here to undermine her credibility. Please take a look at our conclusions in which we told PR pros to pretty much keep their hands off negative Wikipedia content and engage in conversations with Wikipedians, except for editing minor mistakes (which we thought was the policy when the paper was written). The main advice we gave back then was to keep an eye on the Wikipedia articles. That's standard PR procedure for other media formats. I'm not sure why that's controversial and would be used to undermine her credibility. Can you please clarify. Socialmediaprofessor (talk) 18:57, 23 April 2012 (UTC)

There's certainly no problem with making these conclusions in the 1st study. The main problem, as I see it, is repeating some of these conclusions while recruiting survey participants, which seriously biases the sample, and then making essentially the same conclusions based on this biased sample. It looks like a self-fulfilling prophecy to me. Smallbones (talk) 21:53, 23 April 2012 (UTC)
It is interesting to observe the selective quoting from that study. For example, of this paragraph,

The very nature of Wikipedia forces transparency on an organization. The simple concept that any fact about an organization can be placed on a corporate article and not be removed if it is factual and has a citation, puts companies in a challenging situation. Oftentimes, companies have information on their corporate articles that they would rather not highlight, such as corporate scandals or lawsuits. For the public, this is a goldmine of transparency providing easy access to all types of information about a company allowing individuals to make truly informed decisions about that company.

(my emphasis) the authors of this present Signpost article quote only this:

"oftentimes, companies have information on their corporate articles that they would rather not highlight, such as corporate scandals or lawsuits"

This type of lying by omission is quite commonly employed here for propaganda purposes. It is a typical failure mode of Wikipedia. --JN466 20:04, 23 April 2012 (UTC)
"Lying" and "propaganda" are totally out of line. Smallbones (talk) 21:53, 23 April 2012 (UTC)
I'm not a fan of that section and paragraph and think that it should be deleted altogether, simply because it's a non sequitur. It's not an accurate portrayal of the paper. But that clipping is not "lying" by any sensible definition. The most useful part of that 2010 paper is what User:Socialmediaprofessor said: "public relations practitioners have the same opportunity as anyone else to write and edit. There is nothing wrong with weighing in on the discussions that hundreds of Wikipedia editors have every day. It would only be wrong to do this anonymously or to hire someone else to do it “undercover.” This would not only constitute unethical behavior, but would most likely cause a much larger online crisis than the content on Wikipedia will ever do." -- Fuzheado | Talk 22:10, 23 April 2012 (UTC)
I'm a bit new at this, and Tony should ultimately decide (but he's probably asleep). I've removed the section subject to his approval. The article is too long in any case, and it seems that the section is either off-point or being misinterpreted. Smallbones (talk) 22:20, 23 April 2012 (UTC)

Quick analysis

Hi! I'm a bit concerned that some of the problems identified in the article aren't the best choices, and distract from what I think are the more significant problems with some of the findings. I have three main concerns with the paper.

First, my concern with the 60% figure isn't the same as that which others have expressed in the article. Fundamentally, my concern is that the figure could not be derived from the collected data as presented. According to the paper, the question asked of respondents was:

"When asked if there are currently factual errors on their company or client’s Wikipedia articles ..."

The problem is the use of a plural for "their company or client's WP articles". That means that the figure that the question collects is the percentage of people who were able to identify errors in one or more of the articles they looked at, not the percentage of articles that contained errors. To use an alternative example, let's say that I identified 10 articles, only two of which contained errors. Then I asked 100 people to look at those articles and tell me if they found any errors, with 90% of respondents reporting that they found at least one error in the articles. I can reasonably state "90% of respondents found at least one error", but I cannot say "90% of articles contained an error". To do that I would have to ask a) how many WP articles did each respondent look at (presumably 10), and b) how many of those contained errors. But the paper doesn't state that the survey tool asked those additional questions. So either the question as expressed in the paper is incorrect, there was other data collected that wasn't stated in the paper which identified particular articles, or the "60% of wikipedia pages about companies contain errors" could not have been derived from the data.
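(A toy calculation of the effect described above, with numbers invented purely for illustration and unrelated to the survey's data: when each respondent answers for several articles, the share of respondents who can point to at least one error can be far higher than the share of articles that contain errors.)

```python
# Invented numbers, for illustration only: 20% of articles have an error, and
# each respondent answers for a portfolio of 5 articles.
import random

random.seed(1)

ARTICLES = 1_000
ARTICLE_ERROR_RATE = 0.20
has_error = [random.random() < ARTICLE_ERROR_RATE for _ in range(ARTICLES)]

RESPONDENTS = 500
PORTFOLIO_SIZE = 5
yes_respondents = 0
for _ in range(RESPONDENTS):
    portfolio = random.sample(range(ARTICLES), PORTFOLIO_SIZE)
    if any(has_error[a] for a in portfolio):
        yes_respondents += 1

print(f"articles with errors:            {sum(has_error) / ARTICLES:.0%}")
print(f"respondents reporting an error:  {yes_respondents / RESPONDENTS:.0%}")
# Expected: roughly 1 - 0.8**5 ≈ 67% of respondents say "yes",
# even though only ~20% of articles are flawed.
```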

My other concerns are easier to express. The second is sample bias. This is always a problem with online surveys, but the problem here was the use of CREWE. CREWE have a clear bias, yet they can be expected to have answered the questions and were involved in finding respondents. In the acknowledgements it states:

I would like to thank the members of Corporate Representatives for Ethical Wikipedia Engagement (CREWE) for fine tuning the questionnaire, spreading the word to encourage participation and working to improve the problems with public relations/communications Wikipedia editing.

Given the sample size (1284) and the number of people who are part of CREWE (294, not all of whom are PR professionals), this may constitute significant bias that isn't addressed in the paper.

Finally, and most fundamentally, the paper starts with the claim "This study found that the “bright line” rule as co-founder Jimmy Wales has called it, is not working." The problem is that there is no bright line rule. Jimmy has advocated for one, but it doesn't exist. So to write a paper exploring the effectiveness of a non-existent rule based on the assumption that the rule exists is a significant error.

I think there is real value in the paper's findings, but the problem is that the errors - in particular, the 60% claim and the assumption that the Bright Line rule is currently a policy or guideline on WP - tend to hide the valuable figures. - Bilby (talk) 00:31, 24 April 2012 (UTC)

I agree with all you say - but there is a whole menu of problems, which to pick to discuss in a limited space might just be a matter of taste. Sample bias - again I agree - but which examples to choose? Smallbones (talk) 01:04, 24 April 2012 (UTC)
Rather than shooting from the hip, I'd suggest you discuss these points with Marcia first. For example, the plural form "articles" in this sentence:
When asked if there are currently factual errors on their company or client’s Wikipedia articles, 32% said that there were ...
occurs in a narrative summary of survey results. It is not marked as, and is not likely to have been, a verbatim quote of the actual survey question. JN466 01:43, 24 April 2012 (UTC)
I'm aware of that, which is why I think there are three possibilities - she asked a different question than what she described; she asked questions not included in the paper that allowed the information to be derived; or she asked the question as described, but the results she claimed could not be derived. The problem from the point of view of analysis is that I can only go on what was published, and what was published could not be derived from the information presented in the article. I wish the survey tool was available. :) - Bilby (talk) 01:52, 24 April 2012 (UTC)
You can simply ask her, over on the Facebook group. http://www.facebook.com/groups/crewe.group/ Or Socialmediaprofessor above can ask her for you. --JN466 02:35, 24 April 2012 (UTC)
I'm happy to do so. But, generally, we need to analyse research findings as presented - if we need to rely on further clarification to establish core claims then things get a bit tricky. - Bilby (talk) 02:50, 24 April 2012 (UTC)

NPOV

Could you at least make an attempt to reflect discussion at CREWE neutrally? The opinions you have stuck into your call-out boxes are all from one side of the debate that was had at CREWE. --JN466 02:35, 24 April 2012 (UTC)

60%

In my consideration (and that of several other editors who advised on the preparation of the article), the 60% claim is way out of line. All that Dr DiStaso was entitled to say was that 406 of the 989 respondents whose client or company had an article felt that there was at least one factual error. This is 41%. She might have pointed out that a proportion of the don't knows might also have perceived an error, had they been required to actually read the article they claimed to be associated with. A note that 24 respondents didn't tick any box for this question might also have been included.

Research is necessarily conservative, cautious. You try to prove the null hypothesis, not the hypothesis, and if you fail, you've succeeded in an inverted way. Researchers are likely to be reviewed negatively where they give the green light to a press release that makes scientifically false claims, based on a methodology that needed questioning before the trials took place. It's most unfortunate that respondents were not asked (in confidence) to specify the article, so that Dr DiStaso's research assistants might have verified and expressed in precise qualitative terms what the "factual errors" were in each case: how many, of what type, and how serious. The generalised multiple-choice question on type of error was never going to provide convincing support for the findings that were trumpeted in headlines, since there was no specific connection between the data for each.

I can only agree with Bilby's points about sample bias in encouraging members of CREWE to participate, in an information environment that was highly likely to contaminate responses. ITN has already pointed out that selection bias was likely to be significant just in the "call to action" environments in which invitations to participate were made. The very existence of CREWE, indeed, meant that some effort needed to be put into countering selection (and self-selection) bias.

This is not a credible study, although it does provide some interesting and possibly useful things for the movement to look at, with caution. Tony (talk) 03:29, 24 April 2012 (UTC)

Of course it's a credible study. You just don't like the way this one data point (one out of many, all the rest of which are not being discussed at all!) was summarised, because you are identified with Wikipedia. There is not just one "right" way of summarising data. She said, "60% of Wikipedia articles for companies and clients of respondents who were familiar with them had factual errors". That's true. If you summarised it as above, focusing on 41%, someone would tell you that you artificially lowered the percentage to make Wikipedia look better, by including people who hadn't even checked what was in their article. Glasses are half full or half empty, depending on your point of view. Accusing someone of the opposite point of view of "not being credible" seems to me a rather intolerant attitude. JN466 11:12, 24 April 2012 (UTC)
Just to be clear, based on what was provided in the article, no, it is not true that 60% of Wikipedia articles on companies had errors. It is true that 60% of respondents who were aware of the content of their or their client's articles identified errors in at least one article. I understand that it may be the case that the information presented in the article did not reflect the actual survey tool, but as it stands we have to assume that the author is mistaken in making the "60% of articles" claim. - Bilby (talk) 11:27, 24 April 2012 (UTC)
Did you ask the author about this? From reading the study, my impression was that respondents were asked to comment on one article, that for their own company, if working in-house, or that for their current client, if working for an agency. They were not instructed to respond "yes" if they were aware of a single company article in Wikipedia that contained an error. JN466 11:44, 24 April 2012 (UTC)
Not yet - today's my heavy lecturing day, so I've been caught up at uni for most of the day. However, I'm not sure how you would get a different impression from the article as it stands. The relevant part states:
"When asked if there are currently factual errors on their company or client's Wikipedia articles, 32% said that there were ..."
If that is an accurate reflection of what was asked, then it is clearly plural. In which case I think you would answer yes if only the article about you had an error or if there was an error on only one of your client's articles, even if the other relevant articles were ok. Which is why that question, as written, doesn't equate to 60% of articles. Of course, if that isn't an accurate representation of what was asked, then it might be different. - Bilby (talk) 12:20, 24 April 2012 (UTC)
The 60% claim is utter nonsense, and no amount of extrapolating, speculating or hypothesising will change it. Mars Inc. ran into the same problem with what was deemed by the advertising watchdog to be a misleading claim concerning Whiskas. --Ohconfucius ¡digame! 01:27, 25 April 2012 (UTC)
English grammar, my friend. "The dogs wagged their tails", even though each dog only wagged one. There are also singular occurrences of "article" in the study, e.g.: "Seventy nine percent of respondents had a Wikipedia article for their company or recent client (n=989)", "In other words, 60% of the Wikipedia articles for respondents who were familiar with their company or recent client’s article contained factual errors.", "errors currently on their company or client’s Wikipedia article", etc. Don't go off half-cocked: check with the author how the question was phrased, and then say what you want to say. You'll either be proven wrong, or you will have a much stronger argument. --JN466 13:19, 25 April 2012 (UTC)

NPOV, please

This looks like a POV essay. Try reporting the STORY instead of writing an opinion essay. There is no Wikipedia policy or guideline called "bright line" with respect to COI editing, it is a creation from the mind of Jimmy Wales, without a corresponding connection to WP's current doctrine. We've all got opinions, that's his. I think he's wrong. That's not the story — the story is the publication of the survey itself and its conclusions — and there may well be room for critique of the methodology. But do bear in mind that with 1200 respondents, even if this is an unscientific accumulation of anecdotal evidence and personal opinions of PR pros, it has value even as that.

The main criticism is with the misinterpretation and erroneous headlines used by a few of the bimbos in the commercial press, not with the study itself. The only criticism I have with the study itself is the fact that its author placed far, far too much credence in Mr. Wales' opinions, elevating these to some sort of actually applicable guideline at WP in the last few pages of commentary. That's wrong and publishing that was disinformative. Carrite (talk) 04:08, 24 April 2012 (UTC)

Bear in mind that I presented a strong proposal at the paid-editing debate that paid editing be allowed, completely, and that only bias in the text should count. It's a view I still hold, and it is directly at odds with Jimmy's view. Tony (talk) 04:16, 24 April 2012 (UTC)
  • This Signpost story, as written, is pretty sad. It's extremely defensive and seems to be making original research conclusions, along with including POV wording. Maybe you should note that there were a number of PR CREWE members who were upset with the study as well, rather than making it seem like CREWE is in full support of the study (and thus lying by omission). SilverserenC 04:17, 24 April 2012 (UTC)
    • Lying is a pretty serious accusation. I reported at length the comments of one prominent CREWE member, Andrew Lih. The text is already rather long. What is the piece "defensive" of (if not proper research standards)? And again, I must remind you that I am explicitly in favour of allowing paid editing on WP. Tony (talk) 10:33, 24 April 2012 (UTC)
      • Yes, but Andrew, being the author of a book about Wikipedia, is seen much more as a Wikipedian and not a member of CREWE. It would be better if the article pointed out that a number of CREWE members that are specifically PR people were also upset with the study. SilverserenC 16:15, 24 April 2012 (UTC)
Not commenting on the other points, but there seems to be a misconception that Signpost articles are encyclopedia articles - it's a news publication after all. Original reporting has always been encouraged (cf. Wikipedia:Wikipedia Signpost/About), so it doesn't make sense to invoke WP:OR. Regards, HaeB (talk) 11:18, 24 April 2012 (UTC)
Reporting news is one thing, but giving a POV piece is the same as an editorial or opinion piece and not an actual piece of news reporting. If it's meant to be an opinion piece, then fine, but it certainly doesn't fall under being a serious, neutral piece of news. SilverserenC 16:15, 24 April 2012 (UTC)

FOR THE RECORD: There is no "membership" in CREWE per se. Anyone can join that Facebook group which is (nominally) a dialogue between Wikipedians and PR professionals. So to clearly state my reason for being there: I'm a Wikipedian in good standing since 2003, sysop, academic and author who is HIGHLY CRITICAL of PR professionals editing Wikipedia in any direct way. David Gerard (another longtime Wikipedian) and I are both prominent active voices of the "Wikipedia side" in that CREWE group. -- Fuzheado | Talk 18:00, 24 April 2012 (UTC)

  • I hate seeing things like "NPOV" showing up here. You guys do realize that this is not an article, don't you? Wikipedia's policies are for articles (aside from the ones that are specifically about users or non-article space, obviously). I see the Wikipedia acronym speak being thrown around in inappropriate places all the time (user talk pages and the VP being prime examples) and it bothers me every time. Pet peeve of mine, I guess.
    — V = IR (Talk • Contribs) 22:32, 25 April 2012 (UTC)

Only 60%

Only 60% of our articles contain factual errors? This is excellent news and definitely something to celebrate. Much better than the 100% of PR press releases that contain factual errors. An excellent publication re enforcing our policy against COI editing. Now if only the community would get behind enforcing it. --Doc James (talk · contribs · email) 12:01, 24 April 2012 (UTC)

I concur with this interpretation. It would be just as truthful for anyone to say this study proves that 40% of our articles are 100% accurate. -- llywrch (talk) 16:04, 24 April 2012 (UTC)
I think it's more a case that only 60% of a group of our critics consider there is an error in one or more business related articles where they have a conflict of interest. That tells us nothing about the error rate in Wikipedia generally. In the absence of any examples of actual errors it just leaves those of us who don't trust the PR industry with a bit more ammunition. There's also the issue of verifiability not truth: if a company PR person knows that their client has successfully covered something up then they could legitimately describe our article as inaccurate, even if it follows the equally inaccurate reliable sources. ϢereSpielChequers 16:50, 24 April 2012 (UTC)
I think you're missing my sarcasm there, WSC. -- llywrch (talk) 16:57, 26 April 2012 (UTC)
I agree with Doc James. On reading about the article, my first thought was that this sounds like a calculated smear job on the part of the corporate PR industry to incite panic at Wikipedia and shift us into "omg! a crisis!" mode and potentially shift consensus away from the idea that COI is frequently a problem. On reading the ensuing discussion, my first thought was that it's working. Let's get real, people. Whatever the merits of the study (which imo sounds fatally flawed) or the Signpost coverage (which I thought did a pretty sound job of highlighting some of the flaws), we need to keep in mind that the overriding agenda within the corporate world is maximizing short-term profit, and an "any means necessary" approach is frequently used to achieve that agenda. That routinely involves whitewashing a company's record. We should be highly skeptical of this study, and saying so to the world isn't overdefensiveness; it's common sense. Rivertorch (talk) 05:25, 27 April 2012 (UTC)
"Ya Got Trouble. Right here in River City! That starts with 'T', which rhymes with 'P', and that stands for PAY.Buster Seven Talk 12:39, 28 April 2012 (UTC)

How to lie with statistics

The part that is incredibly missing from this story, is that when you ask a flack who has seventeen clients, whether there are errors in any of their clients' articles, and they say "yes", they may only mean that one of the seventeen articles has errors. But this is reported as a yes, instead of as one-seventeenth of a yes; and the press picks up on this falsity and runs with it. It is hard to escape the suspicion that this bias was built into the study; I would love to be proven wrong. --Orange Mike | Talk 13:32, 24 April 2012 (UTC)

No you wouldn't :) But I concur. - Dank (push to talk) 16:56, 24 April 2012 (UTC)
"If you can't prove what you want to prove, demonstrate something else and pretend they are the same thing." (Darrell Huff, How to Lie with Statistics, Chapter 7: The Semiattached Figure) "60% of PR professionals expressing an opinion report errors" does not equate to "60% of articles contain errors". The sample is PR professionals, not articles. ~ Ningauble (talk) 20:47, 24 April 2012 (UTC)
To amplify on the bogosity:  Orange Mike is correct about the problem of one respondent reviewing (or choosing among) multiple articles. There is also the problem of multiple respondents reviewing the same article, which is to be expected in an internet survey where colleagues recruit each other to participate. The data simply does not support any statement about proportions of articles, only about proportions of respondents. ~ Ningauble (talk) 17:42, 25 April 2012 (UTC)

No real data

The real trouble is, there's no real data here—we know that a certain percentage of paid shills said they found an inaccuracy. Firstly, do they even know the information is inaccurate? Well, no one knows. What we'd need, if we wanted to study this, is something that asks not only "Do you think there's an inaccuracy?" but rather "What do you assert is inaccurate and why?". That way, we could do the following:

  • Determine if what's claimed to be inaccurate really is. If it's accurate but embarrassing, or the use of American rather than British spelling, it wouldn't be an error at all.
  • Determine if there is a reliable source cited for the claim. If so, any claim of "error" would be appropriately taken up with that source and requesting a correction, not with us.
  • Determine if an unreliable source was used as a basis for the information.
  • Determine if the "error" is actually something that can be reliably sourced but that the company disputes.
  • If there truly is an inaccuracy, from there, we'd want to know at least:
    • How serious is the error? Is it a false claim that the company caused deaths/injuries (a very serious error), a missing zero in sales figures (a moderate error), or saying the company was founded in 1975 rather than 1976 (a trivial, harmless error)?
    • How long has the error been present?
    • Has anyone ever disputed or attempted to correct the error? If so, what was the outcome?

That would be real data. These are just meaningless numbers in a self-selected survey from a non-neutral source. Seraphimblade Talk to me 15:27, 24 April 2012 (UTC)

Again, another important point. A number of respondents believe that their Wikipedia articles have errors: after some research of this story, I could find no explanation of what those claimed errors might be. I disagree with Seraphimblade, though, over what defines a "serious error"; those S. lists are, IMHO, serious ones. Non-serious ones would include things such as outdated information or omitted details. -- llywrch (talk) 16:14, 24 April 2012 (UTC)
Well put, Seraphim. - Dank (push to talk) 16:54, 24 April 2012 (UTC)
The problem here is that it is Wikipedians who want verifiable "truth" in company articles. PR guys only need information useful to the current marketing campaign to appear. These are not the same aims and we shouldn't expect them to be overly helpful to our goals. Rmhermen (talk) 17:34, 27 April 2012 (UTC)

You should not try to fight POV-pushing by POV-pushing

I tend to enjoy the high standards of objectivity and neutrality most Signpost articles live up to, and this one falls disappointingly short. It feels like a PR-spin itself. Even if the article is a blatant attempt at influencing policy discussions (I don't have an opinion on that as the link to it does not work, but the infographic certainly feels that way), that is no justification for giving up the norms of journalism and encyclopedic neutrality, and responding in kind. A few examples of what is wrong with the piece:

  • As already noted by others, it is a completely unrealistic assumption that for every "don't know" answer, the corresponding article is error free. Anyone who suggests the 41% as a better estimation to 60% is either uneducated in statistics (and thus probably not the right person to write an "investigative journalism" piece about a study which relies heavily on statistics) or dishonest. (And I won't even go into the ridiculous claims about a "statistical artefact".) DiStaso's assumption is that the ratio of errors in the articles the respondents have looked at is a good estimator for the ratio of errors in the articles the respondents have not looked at; this is a completely natural assumption. It can be questioned (as any assumption), in which case you should say that the number of incorrect articles is between 41% and 72% (JN466 already did the calculations above).
    One can of course always pull hypotheses about possible biases out of one's nose; just to prove the point, here are two which suggest the real number is above 60%: 1) larger companies have more PR staff so there is a larger chance they have one who does have time to read their Wikipedia entry; smaller, less notable companies are probably overrepresented in the don't knows, and it is well known that Wikipedia articles about less notable subjects tend to have more errors than ones about more notable subjects; 2) those PR professionals who found factual errors in their articles probably warned the editors, and (unless one has very unfavorable views of Wikipedia editors) it is reasonable to assume that most of those were corrected - so the ratio of errors should be much lower for those articles which have been examined by the respondents than the articles belonging to the don't knows.
    The point is, you can make any number of claims about why some subgroup of the respondents (or even the whole group) are biased this way or that, and thus how the 60% figure should be adjusted up or down, but those adjustments will be actually less natural than not assuming biases (and arriving at 60%), not more. One could make a compelling argument that treating the survey results as unbiased is not justified, and the results should be treated as unreliable (amongst other things, the sample was self-selected, and most participants probably have an interest a result which is unfavorable for Wikipedia); instead of that, the Signpost article tries to counter with assumptions which are even more unjustified, while at the same time crying foul play.
  • More importantly, the article is grasping at straws, pretending that the difference between 60% and 40% is important for interpreting the results. Of course it is not; whether every two out of five or every three out of five articles are wrong is not a huge difference. There is nothing that makes being above or below 50% magically different, even if it might give an opportunity to nitpick on expressions such as "majority of articles". If you think that 60% of the articles containing errors is disconcerting, you should be worried about 40% too. So the whole article is basically a huge red herring, talking about anything but the significance of the results.
  • Which is especially disappointing because the significance of the results is not at all clear. Either the report about the research is missing all the interesting parts, or the research itself has big omissions (in which case there would be a good opportunity to do some actual "investigative journalism") - how many of the (assumed) errors were big ones, potentially influencing the public opinion of the company, and how many insignificant? How many of the respondents reported the errors they found? How many managed to get some or all of the assumed errors corrected? Such details would have been much more informative than the huge fuss about percentages and the barrage of quotes (most of which are opinions, strongly worded, but mostly contentless).

--Tgr (talk) 17:04, 24 April 2012 (UTC)

I shan't comment on the specifics of your post, which I'm sure are interesting and pertinent, but I would merely take issue with your title: the Signpost isn't trying to do anything. It's decentralised. Authors operate with a great deal of autonomy: the Editor-in-Chief (who comes closest to the embodiment of the Signpost) simply publishes any interesting articles those authors may care to write. - Jarry1250 [Deliberation needed] 17:14, 24 April 2012 (UTC)
Well that's how newspapers work. Readers don't really care about the internal processes; the result is attributed to the media outlet and not the authors (it is different with opinion pieces, but this article was titled a report). Just read the first paragraph of the article for examples: "picked up by ABC news", "Business2Community went so far as to announce" etc. No one cares who exactly at Business2Community did the announcing. It is nice that authors are autonomous, but some sort of collaborative quality assurance does go on behind that, I assume?
Anyway, I changed the title in case it was insulting for other Signpost authors. --Tgr (talk) 17:36, 24 April 2012 (UTC)
Thank you for changing your title. Unfortunately, much as the Signpost would like to be up there with ABC news etc., as volunteers we do not have the resources to invest in collective responsibility, insofar as that is not unavoidable. You are more than welcome, of course, to make a criticism alleging poor editorial selection, if you wish -- but of course that is not the same thing as suggesting that the newspaper's view was wrong. Regards, - Jarry1250 [Deliberation needed] 17:42, 24 April 2012 (UTC)

RE: the comment "Anyone who suggests the 41% as a better estimation to 60% is either uneducated in statistics ... or dishonest." I am neither uneducated in statistics, nor dishonest. User:HaeB, who was quoted in the article, is certainly educated in statistics and I must assume that he is honest (since I've never run into him before). This is not about credentials however. Anybody who has had more than one university-level stats course (or equivalent) should recognize that for many reasons the 60% number is simply meaningless. I believe that nobody with a knowledge of stats will take the results of this study seriously. Smallbones (talk) 18:04, 24 April 2012 (UTC)

The 60% number is problematic, as you say, but the 41% is even more problematic. The article would be better served by pointing out all the problems inherent in the survey instead of taking its results and trying to wrangle out (in a very questionable way) a different percentage. --Tgr (talk) 18:33, 24 April 2012 (UTC)
Tgr, there is no one "who suggests the 41% as a better estimation to 60%" for the ratio of company articles which contain errors - at least I didn't, please read the article and the linked tweet more carefully. As discussed at length above, the 41% number was brought up (in the form of the statement that in the survey "41% of PR professionals with Wikipedia article on their company/client say it contains errors", whose factual accuracy hasn't been contested) to show the flaws in DiStaso's approach (inasmuch as it is meant to determine that ratio), by indicating that replacing DiStaso's cavalier assumption with another similarly cavalier assumption would yield a significantly different number. In other words, it was used to do exactly what you suggest, namely to point out "problems inherent in the survey". Regards, HaeB (talk) 00:29, 25 April 2012 (UTC)

But although it's valid to exclude respondents whose companies have no article (the last bullet), excluding the don't knows, which boosts the factual error rate to 60% (406 / [406 + 273]) raises difficult issues. Including the don't knows would yield 41% (406 / [406 + 310 + 273]). This problematic calculation was independently pointed out by Tilman Bayer (HaeB) of Wikimedia's communications team, who has postgraduate degrees in mathematics and co-edits the Wikimedia Research Newsletter. The true percentage is almost certainly not 60.

I agree with Tgr. The way that argument is presented in the article, with the implication that 41% would be a more accurate value, is not mathematically sound. JN466 13:23, 25 April 2012 (UTC)
HaeB, the article suggests that 60% is somehow artificial and even manipulative ("This cleverly allows DiStaso to exclude...", "boosts the factual error rate"), while 41% is presented without any such concerns. This is simply untrue: ignoring the "don't know" answers is a completely natural assumption, which is regularly used in statistics (leaving out uncertain voters in election opinion polls, for example). Assuming all "don't know" answers are in reality "no error" is, on the other hand, completely artificial. Even if the article is merely equating the two practices, as you suggest, that's still severely misleading. DiStaso was justified to use that number, there is nothing unethical or unprofessional in it, and there is no good reason to suspect the real number would be lower. (What was not at all justified is pretending that number is the percentage of Wikipedia articles, when it is in reality the percentage of PR people, and given that they were notified of the survey through channels which strongly favored those who had conflicts with Wikipedia, it is very likely that problematic articles are strongly overrepresented. The whole "don't knows" issue is just a red herring, as I said.) --Tgr (talk) 07:44, 26 April 2012 (UTC)
"Including the don't knows would yield 41% (406 / [406 + 310 + 273])."—that's all it says; would. I don't know where JN466 gets the idea that the story, as published, privileges any number. The point made is that the blanket assertion of 60% by DiStaso is not defensible. Tony (talk) 09:46, 26 April 2012 (UTC)
It privileges one end of a distribution. Including the don't knows does not yield 41%; it yields the 41%-72% range. You get 41% by including the don't knows and assuming they all have error-free articles; the article conveniently does not mention that (rather unlikely) assumption. And, as I tried to argue in some length above, "the blanket assertion of 60%" is completely defensible (well, probably not when it is asserted to be the ratio of all Wikipedia company articles which have errors, but that is for different reasons). Here is a random study about Wikipedia, for example, which does the same; the press release says 45 percent of toxicologists find Wikipedia accurate; if you read the article, you will find that the actual number is 23%, with 54% having no opinion. Reliability of Wikipedia reports the 45% number; so did Signpost. Do you think this was "not defensible", too?
I'm fairly sure if you checked more Wikipedia research, you would see the same treatment of "don't know"/"not sure" answers; it is the common thing to do. I'm completely sure you will not find a single study where "don't know" answers are treated the same way as "accurate" answers. Sorry, but the Signpost article is way more dishonest with the numbers than the study it is trying to criticize. --Tgr (talk) 12:08, 26 April 2012 (UTC)
I really don't like being called "dishonest" even if only indirectly. As far as the other study cited, it's a completely different methodology that compares different sources' general accuracy, without attempting to count errors or "inaccurate articles." It has two sides that are being considered "overstated" and "understated" as well as "accurate". And Wikipedia is considered to be one of the most accurate sources, no matter which measure you use. DiStaso could learn a lot by studying this methodology. It's very important in this discussion to remember that the type of error, including errors of omission, and bias are extremely important. Frankly, I have to say that 100% of Wikipedia articles have some type of error, but I'd say the same about Encyclopedia Britannica, and even the most highly respected journal articles (and I hope nobody misquotes me on this!) Until we have a direct pipeline to the "truth," any study of accuracy has to be a study of comparative or relative accuracy. Ultimately there is no verifiable measure of "absolute accuracy" any more than of "absolute truth." Smallbones (talk) 16:22, 26 April 2012 (UTC)
I am missing the Facebook Like button here. :) Tgr made a typo (the actual figure that underlay the 45% in the Signpost report was 21%, not 23% – see last page of the original document), but apart from that hit the nail squarely on the head, and gave a beautiful example. The toxicology study is the exact same thing. The Signpost was happy to have don't knows discounted in the toxicology study when the resulting figure was in Wikipedia's favour, accepting the study authors' summary that 45% of toxicologists considered Wikipedia accurate, when in fact it was only 21% of respondents, due to the large number of don't knows. But the same standard procedure in DiStaso's study is pilloried as dishonest. That's just not good behaviour, and misleads readers. Never let fear cloud your view of the data. --JN466 16:17, 27 April 2012 (UTC)
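For readers following the arithmetic in this exchange, here is a minimal sketch (in Python) of the three figures being debated, using only the response counts already quoted in the thread (406 respondents reporting at least one error, 273 reporting none, and 310 answering "don't know"). It simply reproduces the calculations described above; the counts come from the discussion itself and are not independently verified against the study.

```python
# Response counts quoted in the thread (DiStaso survey, as cited above).
yes_error = 406   # respondents reporting at least one factual error
no_error = 273    # respondents reporting no errors
dont_know = 310   # respondents answering "don't know"

total = yes_error + no_error + dont_know          # 989

# 60%: don't-knows excluded from the denominator.
excl_dont_know = yes_error / (yes_error + no_error)      # ~0.598

# 41%: don't-knows counted as if none of their articles had errors.
dont_know_as_no = yes_error / total                      # ~0.411

# 72%: don't-knows counted as if all of their articles had errors.
dont_know_as_yes = (yes_error + dont_know) / total       # ~0.724

print(f"excluding don't knows: {excl_dont_know:.1%}")
print(f"don't knows treated as 'no error': {dont_know_as_no:.1%}")
print(f"don't knows treated as 'error found': {dont_know_as_yes:.1%}")
```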

Independent research needed

The frustrating thing is that we don't seem to have a recent reputable study into the accuracy of the pedia. Without that we are vulnerable to this sort of exercise. A more open response to this would be for the WMF to commission a trustworthy third party to quality check a random set of facts and articles and produce a report on it. If this was done as an annual or even biannual exercise then the press would have something to check against, we would have an interesting benchmark, and if and when "studies" like this emerged the press could ask the researchers who did the benchmark study to comment on the competing study. ϢereSpielChequers 17:16, 24 April 2012 (UTC)

I'm completely agnostic about whether this would be a good use of WMF funds or not, but perhaps, for reasons of perceived bias, it would be better for independent researchers to do it.
  • If it were to be done, following the suggestions in the section above "No real data given" would be a good start. Graduate students would likely do the actual error evaluation, and they could not be expected to research each error on the spot. Rather, they would have to be given a detailed instruction sheet on what constitutes a major error, a minor error, etc. Given all the fuss about "errors of omission" above, there should also be questions regarding omitted material and neutrality of presentation. While those aren't black and white calls, mechanical instructions could at least lead to fairly consistent results. The evaluators and the instructions would first have to be tested to see whether they produce internally consistent results.
  • This type of test wouldn't be very meaningful for the full set of Wikipedia articles. Rather subsets should be separately examined, e.g. articles on Fortune 500 businesses, social science, mathematics, or even popular culture.
  • These results by themselves wouldn't be very meaningful - perhaps we could use them to compare two time periods to see if accuracy has changed over time. But there will be something that can be perceived as an error in any material - the real question is how the accuracy compares to something else, e.g. to Encyclopedia Britannica, or maybe corporate websites or annual reports. The comparison would have to be something that has similar goals to an encyclopedia. Comparing a Wikipedia article to a sales brochure would be a meaningless exercise.
So this would be a challenging and expensive task. Not one to be taken lightly. Smallbones (talk) 19:02, 24 April 2012 (UTC)
I agree that it isn't to be taken lightly; this would be a big investment, but I think a worthwhile one - even if the results aren't as positive as some of us expect. However I'm not convinced that we currently have a competitor worth benchmarking against. If we used the Britannica I fear that we'd get a false confidence, as the articles that have analogues on the Britannica will be skewed to our higher quality content. The Britannica is much smaller than we are and perforce their notability criteria are more stringent. Since we rely on crowd sourcing, those articles which are less notable will tend to have fewer readers and generally fewer editors. Better in my view to have a rigorous process based on random sampling - even though each year's sample will of necessity be different, as I can't imagine we could identify a bunch of errors without fixing them. ϢereSpielChequers 17:09, 25 April 2012 (UTC)
The Foundation has initiated exactly that, see [2]. I have been following this kind of research for several years now, and while there are actually quite a few systematic studies examining Wikipedia's accuracy by now (see e.g. those listed under reliability of Wikipedia#Comparative studies, or in these slides of a talk I gave in January), it is obvious that it is hard to do such studies on a big scale, if one aims at maintaining good research standards - a solid evaluation of factual accuracy is a lot of work (to quote from a review I wrote half a year ago: "[The paper] first gives an overview of existing literature about the information quality of Wikipedia, and of encyclopedias in general, identifying four main criteria that several pre-2002 works about the quality of reference works agreed on. Interestingly, 'accuracy' was not among them, an omission explained by the authors by the difficulty of fact-checking an entire encyclopedia.") This is for me the most infuriating aspect of DiStaso's press release claims - pretending to have done a solid evaluation of the factual accuracy of many hundreds of Wikipedia articles, where more conscientious researchers have had to content themselves with much smaller numbers and less far-reaching claims.
By the way, DiStaso's article says (p.2) that a study "to analyze the accuracy of corporate articles ... is underway as a CREWE initiative".
Regards, HaeB (talk) 00:53, 25 April 2012 (UTC)
The Oxford/Epic study design looks awesome. Are they on schedule? There shouldn't be any perceived bias issues there. The reliability of Wikipedia#Comparative studies, and these slides of a talk YOU gave in January are also quite informative. What strikes me first is the small sample sizes - which result from the detailed work that needs to be done. But the "specific fact - large N" approach (did Wikipedia get all 280 US gubernatorial candidates in a given period?) looks promising. All the best. Smallbones (talk) 03:30, 25 April 2012 (UTC)

PRSA has changed/updated their headline

Arthur Yann of PRSA said in the CREWE Facebook group:

“PRSA acknowledges that the headline of its news release announcing the publication of Marcia DiStaso’s research study in PR Journal oversimplified the study’s results. To help prevent any further misinterpretation of the findings that our release may have caused, we have updated its headline, subhead and lead. In doing so, we hope the focus of this discussion can remain on the gap that exists between public relations professionals and Wikipedia concerning the proper protocol for editing entries, and the profession’s desire for clear, consistent rules that will ease the learning process for public relations professionals of how factual corrections can properly be made.”

WAS:

April 17, 2012
Survey Finds Majority of Wikipedia Entries Contain Factual Errors
Public relations professionals cite issues with Wikipedia’s accuracy and editing process
http://media.prsa.org/article_display.cfm?article_id=2575

NOW:

April 24, 2012
Wikipedia & Public Relations: Survey Shows Gaps in Accuracy, Understanding (UPDATED*)
60% of respondents say articles on their companies, clients contained factual errors; 25% say they are unfamiliar with such articles
http://media.prsa.org/article_display.cfm?article_id=2582

-- Fuzheado | Talk 19:15, 24 April 2012 (UTC)

This has gone beyond the point where quibbling over mere words makes sense, but the numbers still don't add up. I could live with "41% of respondents say articles on their companies, clients contain factual errors, 31% say they are unfamiliar with such articles, and 28% can't find any errors" These numbers come from the same question and are calculated as 406/989, 310/989, and 273/989. The headline's numbers appear to be calculated as 406/679 and 310/1260. Switching the denominators like this just doesn't make any sense. For example it could be inferred from the headline's numbers that the % who can't find errors is 100% - 60% - 25% = 15%. It just ain't right. Smallbones (talk) 20:24, 24 April 2012 (UTC)
You have a point. --JN466 13:29, 25 April 2012 (UTC)
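To make the denominator point above concrete, here is a small Python sketch using the figures quoted in Smallbones's comment; the 679 and 1260 denominators are the ones he infers for the headline, not numbers taken directly from the study.

```python
# Counts from the survey question discussed above:
# 406 "errors found", 310 "unfamiliar/don't know", 273 "no errors found".
yes_error, dont_know, no_error = 406, 310, 273
total = yes_error + dont_know + no_error         # 989

# Consistent treatment: one denominator for all three shares.
print(f"errors {yes_error/total:.0%}, unfamiliar {dont_know/total:.0%}, "
      f"no errors {no_error/total:.0%}")         # 41%, 31%, 28%

# The headline's apparent calculation mixes two denominators
# (679 and 1260 are the denominators inferred in the comment above).
print(f"headline 'errors': {406/679:.0%}")       # ~60%
print(f"headline 'unfamiliar': {310/1260:.0%}")  # ~25%

# Reading 60% and 25% against a single implicit denominator would leave
# only 15% for "no errors found", which contradicts the 28% above.
```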

Thank you!

There are lots of people in the corporate and PR community, lobbyists and others who would like to use Wikipedia for promotion. I cannot imagine a stronger COI than a person who is paid to make their client look good on the internet (or a person or company editing an article about him, her or itself). For an academic to put together a biased opinion poll of these people with strong COIs, and then to publish their poll answers as if what they said is somehow objective is astonishing. Thanks, Signpost, for alerting us to this travesty. The proof of the pudding is in the eating: What this "study" led to is asinine headlines like this: NYDailyNews: "Wikipedia entries full of factual errors". No, what the "study" found is that 60% of paid shills who were asked ambiguous questions, if they had any opinion at all, felt that one or more articles on their clients had an error, including, possibly, a spelling error. -- Ssilvers (talk) 01:23, 25 April 2012 (UTC)

According to the report 82 "errors" were spelling, though without knowing the alleged errors we don't know whether these are genuine typos or simply examples of us using a different variety of English. A further 152 were "leadership or board information" but this wasn't subdivided into examples where we are incomplete, out of date, or have information which was never correct. ϢereSpielChequers 23:14, 25 April 2012 (UTC)

What the report tells me

The report is intended to convince Wikipedians to openly allow PR people, but the report itself seems to demonstrate a laundry list of reasons NOT to. It's not in encyclopedic tone, doesn't represent all majority and minority viewpoints, uses misinformation to support an agenda and so on. It even demonstrates an ability to corrupt the balance of trusted sources from the real-world equivalent of the Talk page and create one-sided stories in independent sources through the availability of resources.

In other words, editors like Smallbones weren't given a voice in these media articles, because they don't have PR people pitching them to the media. Data to support their POV was presented, but what about data like this[3] showing the edit histories associated with the top ten PR agencies by revenue? If the same behavior and dynamics we see with the report were brought to Wikipedia, it would certainly be a bad thing for the pedia, more so than factual errors.

While I don't believe this to actually be the case, the report seems to communicate to me a need to outright ban PR people. Additionally, I find it difficult for anyone who cares about Wikipedia to consider an open collaboration with a group that publicly assaults the website's credibility in such a manner. All I can do is invite PRSA/IPR/etc. to humble themselves and commit to learn how to meet Wikipedia's content needs and collaboration style, but I don't expect such an invitation to be met. User:King4057 (COI Disclosure on User Page) 01:51, 25 April 2012 (UTC)

I first took "The report" to mean this "Investigative report". Were you referring to the DiStaso journal article? Tony (talk) 02:02, 25 April 2012 (UTC)

A practical response to PR complaints

Let's move on from the hyped and flakey 60% claim. Whether the current low-tolerance policy remains—which looks likely for the time being—or whether it's loosened, it's hard to ignore the perceptions among PR and communications professionals of long waits or no response at all to open requests for changes to articles on companies. These are the good guys, the ones who do the right thing by asking for editorial mediation; yet the message is that they're routinely discouraged. Perhaps this is a collision between the volunteer culture on the foundation's sites ("there's no deadline") and the rigours of turbo-charged capitalism, where I tend to agree with DiStaso's point that five days is a long time for professionals and their clients to sit in silence ("is anyone at home?"). Yet volunteers appear to have done reasonably well in managing serious and complex issues such as quick action on copyright and BLP issues: we've shown that dynamic management is possible, and isn't it part of the cost of doing business on a big, powerful wiki?

Personally, I've found it difficult and time-consuming to navigate through the maze of CoI-related pages on the English WP. Some are tired, moribund, or confused, bloat abounds, and there seems to be no centre of gravity. No wonder a lot of PR professionals and company reps throw up their hands and edit under the radar, when the radar resembles a low-wattage flickering street light in bad weather.

The head of communications, Jay Walsh, sees the response problem and has acknowledged it publicly, as reported at the end of the story. So leadership is in order from the foundation—the cross-wiki implications alone suggest that it's a matter in which the foundation should take a more active, practical role: god knows what tangled webs or straight-out neglect are the norm on the other 280 WPs (including the smaller, outlying language WPs, largely impenetrable to the movement).

If it's good enough for the foundation to create a summer fellowship to revamp our help pages (see the Signpost report this week), it's good enough to consider employing a fellow to work with the community to revamp the speed and efficiency with which we respond to PR requests and queries—to see things from the perspective of incoming PR professionals and to create an easy system to tempt them away from subterfuge. Good openers would be creating a template for posting on company-article talk pages with a link to a convenient, one-stop noticeboard, and working out how to attract volunteers into a response team that involves personal stimulation and social reward. And there's the possibility of sending pro-active messages out to the PR/communications/corporate community about working with them to ensure balance and neutrality; that would be good for the movement's image, wouldn't it? Tony (talk) 04:14, 25 April 2012 (UTC)

I've made similar proposals on the CREWE Facebook page, but using a community-based rather than Foundation-led process:
  1. collaborative creation of a guideline or policy for company articles that defines
    1. what sort of information self-disclosed PR professionals are not just welcome, but requested to add and keep up to date, based on company sources – things like the name of the current CEO, location of the company headquarters, officially reported financial figures etc.
    2. what sort of information generally may be added based on primary sources, and what requires secondary sources (court cases for example should require secondary sources, as in BLPs)
    3. general content expectations, i.e. what any article on a company should contain
    4. guidelines on neutrality, balance, coatracks, attack pages
  2. institution of a noticeboard where PR professionals can flag articles that have gone wrong, and help to work out fixes, to then be implemented by another Wikipedian
Having said that, Foundation support would of course be welcome. --JN466 12:55, 25 April 2012 (UTC)
Whether the current policy "remains or is loosened" leaves out a third option. If this is how the PR industry behaves then perhaps we should tighten and more strongly enforce policies on COI? ϢereSpielChequers 23:18, 25 April 2012 (UTC)
The comments apply even if the guidelines are strengthened. Tony (talk) 01:18, 26 April 2012 (UTC)

Lose the drama, read the study

The DiStaso paper is publicly available: http://www.prsa.org/intelligence/prjournal

It is not hard reading.

There are certainly a couple of structural problems with the survey, nicely pointed out above: (1) There is no quantification of the magnitude of error, minor errors and major catastrophes are both considered the same; (2) Respondents were not asked to answer about a single client, so some may be venting about one client and being counted for it, but having no problems with other pages and not having those "good" pages tallied; (3) The paper pretends there is something called a "bright line" policy about paid COI editing and spends a lot of time studying respondent understanding of this incorrect interpretation of actual WP policy.

There were also tactical errors: (1) It was a mistake to try to come up with a sensationally high error number and to make that the hook of the piece. The takeaway should be "Most PR people who deal with clients that have Wikipedia pages feel that there are significant errors on those pages, and they are confused about Wikipedia's practices for getting those corrected." Instead we've got a bunch of people yelling about whether 41% or 60% is the more accurate quantification of the problem; (2) It was a very big mistake taking the results to the press and trying to make a news story out of it, rather than quietly bringing the findings to WP directly. Bad blood resulted.

We've just had an RfC on COI editing, now running out of gas. As one might have predicted, opinions vary widely and there is no consensus for any approach to clarification of the matter. What's pretty clear is that as long as there are pages about large corporations on Wikipedia, there will be paid PR people with a professional interest in making sure that those Wikipedia pages are fair, neutral, and error free. That does not describe the current state of many of these pages, I think we all can agree — whether 41% are screwed up, or 21%, or 60%, or some other figure, is absolutely irrelevant. The fact is that there is a problem of some magnitude. How this is resolved is ultimately up to us as a community.

I am very disappointed in this piece; my comments above were written about a late draft, which changed little. It is not journalism; it is an opinion piece disguised as journalism, and a very one-sided and shrill piece of work. Done's done. The issue isn't going to go away. I just urge people to actually read the report and to see what it says and what it does not say directly before they fly off the handle, being all too sure about how to resolve a complex problem. Carrite (talk) 04:14, 25 April 2012 (UTC)

Wait, we are meant to think that that headline was an, ah, "tactical error"? Seriously? No, it was moderately competent PR and blatant statistical abuse, to the extent that whoever did it lacks intellectual integrity to the point where there is little reason to further consider any of their claims.©Geni 09:49, 25 April 2012 (UTC)
Have you read the study from beginning to end, or just this hatchet job? JN466 12:46, 25 April 2012 (UTC)
I, for one, have read it from beginning to end; and I do understand what things like chi square mean. I still found the big fat lie, "In other words, 60% of the Wikipedia articles for respondents who were familiar with their company or recent client’s article contained factual errors" right there in the text. Who passed this person's Ph.D. thesis at the U. of Miami? Was the statistics work that shoddy in said thesis? Are sentences like, "Also, by disallowing public relations/communications professionals to make edits while allowing competitors, activists and anyone else who wants to chime in, is simply asking of misinformation." [sic] considered acceptable English by Public Relations Journal or the College of Communications at Pennsylvania State University???? --Orange Mike | Talk 14:04, 25 April 2012 (UTC)
Did you take similar issue when the Signpost reported that 45% of toxicologists found Wikipedia accurate? Would you insist that that figure should be revised down to 21%? For further background, see Tgr's post above, 12:08, 26 April 2012 (UTC). Same maths. Goose, gander. Will you be going to Reliability_of_wikipedia#Expert_opinion and rewrite the Science and medicine section to revise the percentage of toxicologists down from 45% to 21%? Because you cannot with intellectual honesty maintain the complaint you are making here while letting those 45% stand. JN466 21:54, 27 April 2012 (UTC)
I've also read the report, and even looked up the study she cites for her figure that only 23% of Wikipedians have college degrees. Checking Glott, Schmidt & Ghosh (2010), they actually said that in their sample 49% of Wikipedia contributors are graduates, of whom 23% have a masters or a PhD. Now perhaps College degree is an American English term for Masters and PhDs that I wasn't aware of, but unless that's the case this seems like an error that makes Wikipedians in general look rather less educated than the study indicated. I haven't checked everything in her report that triggered my bullshit detector; the bit about us having "more than 82 hundred contributors" is of course technically correct, as it would be to say more than 82,000 or indeed more than 82 contributors. But it is misleading: http://stats.wikimedia.org/EN/TablesWikipediaZZ.htm shows that Wikipedia has had nearly 1.5 million contributors who've made over ten edits each, and the figure for the English language wikipedia alone is over three quarters of a million, with over thirty thousand making more than 5 edits a month. Combined with the discredited 60% stat I think I might be detecting a pattern here. ϢereSpielChequers 22:52, 25 April 2012 (UTC)
And, just to prove how easy it is to argue and fight about numbers rather than actually discuss the underlying problems, I'll note that you aren't talking about 1.5M contributors at all, you're talking about 1.5M usernames, which includes multiple names for single individuals and quite possibly — for all I or you know — every single IP address for dynamic IP users. Bottom line: we can carp about numbers all day. That's not the issue here, the question is whether a deeply divided WP community can get together well enough to come up with a set of mutually satisfactory "best practices" for COI editors that addresses the needs of both The Project and the PR pros seeking to make sure that WP delivers on its claim of truthful NPOV coverage of their clients. I'm starting to think that we'll chase our tails for five years on this without any progress forward... Carrite (talk) 06:16, 26 April 2012 (UTC) Last edit: Carrite (talk) 06:18, 26 April 2012 (UTC)
The difficulty is that I think everyone acknowledges that WP has problems with errors in company articles. However, part of that process is to understand the nature and scope of the problem - if research being presented is flawed, we can't use that to understand the issues. Errors in a peer-reviewed paper raise flags which risk hiding the genuinely valuable points in the paper, which is what I feel happened here. That doesn't mean we shouldn't be addressing the problem, but that this paper probably isn't the best means of raising it. I gather that there is some good research underway looking at the Fortune 500 companies, and that will be interesting to see. - Bilby (talk) 07:36, 26 April 2012 (UTC)
@Carrite, Well I know it doesn't include IPs because I chose a stat that excluded IP contributors altogether - as well as any account with fewer than ten edits. Yes there will be humans with multiple Socks in that list, but there are plenty of contributors who are IP editors or have fewer than ten edits, the key point is that "82 hundred contributors" is wildly out.
@Bilby. In the absence of any examples of these alleged errors I for one am not yet convinced that the error rate in company articles is any different to the general Wikipedia error rate. My suspicion is that our concept of an error is so different to that of PR writers that the error rate is not something we will be able to agree on; it must be very difficult for an article on a company to be simultaneously accurate both by the standards of Wikipedia and in the minds of those paid to spin for that company. ϢereSpielChequers 09:14, 26 April 2012 (UTC)
Sorry, my assumption without any other data was that the error rate would be the same as on other pages. Hence there would be an error rate - that was what I was assuming was a generally acknowledged issue, not that company articles were especially problematic. :) My concern is that we need solid research looking at this particular domain, and that the red flags with this paper lead me to want to look elsewhere for that data. And I agree with the problem of identifying those errors through PR people who represent the companies - that's one of the red flags. - Bilby (talk) 09:22, 26 April 2012 (UTC)
  • RE: "I for one am not yet convinced that the error rate in company articles is any different to the general Wikipedia error rate." NOW we get to the point of the entire exercise — what the PR people are (clumsily) attempting to express is that in their view there is a significant (perhaps massive) problem with the content of business pages as compared with "average content" on WP. (I think they are right.) There is also a study going on about how highly WP page results figure on a Google search — showing up on the front results page, even high on the front results page, for very major corporate entities. In short, there is a sense among the "designated representatives" of those entities that there is a fairly huge problem here.
Maybe 6 months ago I was of the belief that we should be hunting down and wiping out all paid editors/PR people editing at WP and backtracking to neutralize or eliminate their edits. I was as hardline as anybody on this. I've come to a new understanding though, one that I think most hardline opponents eventually will come to once they really start to ponder the reality of the situation. There is ALWAYS going to be pressure over content as long as (a) Wikipedia remains important; (b) big corporations remain big. That is to say: there is ALWAYS going to be such pressure... We all know what bad, horrible spammy pages look like. Somehow in our minds' eye we think that these are the product of PR flacks, doing their dirty work. And some of them are. More, however, fly under the radar — because they are TRYING to stay within our rules and to produce neutral and encyclopedic content.
There needs to be a formal set of "best practices" for these Under The Radar Because They Are Doing It Right editors — in the hopes that the Doing It Wrong PR types will join them in doing it right. Just arguing as the Co-Founder does — "don't do it because don't do it" — is not a tenable situation because there are no extant reliable mechanisms for necessary changes to be rendered. We work on the principle of BE BOLD at WP and that will always be the best way for changes to be rendered. Failing that, there needs to be something else that actually works. Suggesting changes on talk pages is like writing a note, stuffing it in a bottle, and throwing it in the ocean. We can't even keep up with complaints filed on BLP pages... The solution realistically needs to involve direct editing by interested parties. But under what parameters? That's the question.
I think the hardliners seeking to ban off paid editors are in a substantial minority, although recent comments by Jimmy Wales indicate he thinks otherwise. That needs to be the first fundamental decision made: are we going to attempt editor-based or editing-based guidelines to COI editing? If the former, where are those lines to be drawn — what precisely constitutes bannable COI and how are we to make this determination in an encyclopedia in which no formal registration and signing in is required to edit? If the latter, what exactly do we expect paid editors to do to meet our highest expectations?
Everything flows from that fundamental question. The recent RFC on conflict of interest editing was a mess. ArbCom needs to start another, proper RFC consisting of one question: "Is paid COI editing to be bannable on a per se basis?" If yes, that line needs to be carefully drawn — who exactly is a "paid employee" with a COI to be banned if caught? (A teacher writing about their school? A factory worker about the corporation for which they work? A grocery store owner about a product which they sell? A minister writing about the history of his church?)
If we decide to focus upon the edits and not the editor — I believe this to be the majority view, judging by opinions expressed in the failed RFC — then what EXACTLY do we expect paid COI editors to do or not do?
Anyone who thinks this is a simple issue is wrong. Sorry for using so much space. Carrite (talk) 20:36, 26 April 2012 (UTC) Last edit: Carrite (talk) 20:59, 26 April 2012 (UTC)

Utter nonsense

It is impossible to give a sensible critique of, or response to, something that is utter nonsense to begin with.

  • We start with a survey—that is, an opinion poll—about whether Wikipedia's articles about companies each contain at least one factual error. There is no necessary correlation between opinion and fact. To put it differently, an individual's opinion about a fact may say something about the individual, but it says nothing about the fact. Suppose one were to conduct an opinion poll of a random sample of American adults on this question: Is there a factual error in Wikipedia's List of Presidents of the United States? The results, whatever they might be, would say nothing about the list's accuracy. Likewise, the results of this survey say nothing about the factual accuracy of Wikipedia's articles about companies. Indeed, if 100% of the survey respondents had said that there was at least one factual error in the article about the respondent's company or PR client, that would not prove any error in any of the articles. Criticizing the statistical methodology or conclusions is beside the point, since the individual survey responses are worthless to begin with.
  • This otherwise meaningless survey's respondents all have conflicts of interest, which makes their responses—that is, their professed opinions—even less meaningful than would be the responses of unbiased respondents. It was in their respective interests for the survey to "prove" that Wikipedia's articles about companies were inaccurate, so these self-interested respondents would be allowed a freer hand in determining the content of articles about which they have a self-interested point of view. Indeed, the respondents were chosen because of their conflict of interest. Normal survey design neutralizes bias; this survey guarantees that 100% of the respondents are biased.
  • At the other end, suppose the statistically flawed conclusion drawn from the meaningless, biased data were true. Suppose 60% of Wikipedia's articles about companies contain at least one factual error. What would that prove about the articles' factual accuracy? Almost nothing.
    • It says nothing about the ratio of inaccurate to accurate facts. Surely, the error rate does not approach 60%, as the "headline" might imply to the average (careless, innumerate) reader. I would be concerned if the error rate approached 5%, or even 3%. However, given the number of facts in typical articles about companies, the survey's conclusion (even if taken seriously) does not imply an error rate that approaches ½% (by my utterly baseless guesstimate).
    • It says nothing about the importance of any errors. At one extreme, an article might place a company in the wrong industry or misstate its annual revenue by a factor of 100; I doubt that Wikipedia has any errors of that magnitude, but almost anything is possible. At the other extreme, an article might have the wrong middle initial for some individual.
    • It says nothing about the source or nature of any inaccuracy. If an incorrect fact is based on a reliable source, and no reliable source has the correct fact, there is little that Wikipedia can, or should, do.

While this piece is a noble effort to put the survey in some perspective, its flaw is treating the survey as though it means anything in the first place.—Finell 18:35, 28 April 2012 (UTC)

BizProd

This article has prompted me to make a proposal at Wikipedia:Village_pump_(policy)#BizProd. ϢereSpielChequers 16:31, 29 April 2012 (UTC)

  • Just a minor thing. The We Can Edit poster wasn't specifically created for the WikiWomen's History month. I hastily 'shopped it together as a general propaganda poster for any and all getting-more-women-to-edit-Wikipedia activity and other countering systemic bias type stuff. —Tom Morris (talk) 13:14, 24 April 2012 (UTC)

Just to say that because of publication deadlines, fallout from the 1.20wmf1 deployment to en.wp will be covered in next week's issue. Thanks! - Jarry1250 [Deliberation needed] 00:08, 24 April 2012 (UTC)

  • And just because it never gets mentioned: you have been able to try out MathJax here for more than two years now, using the mathJax user script. Nageh (talk) 10:28, 24 April 2012 (UTC)
    • I did actually intend to give you a little nod for that in the piece; now clarified accordingly. Regards, - Jarry1250 [Deliberation needed] 10:32, 24 April 2012 (UTC)
      • Thanks, very much appreciated! Nageh (talk) 10:37, 24 April 2012 (UTC)

Diff colors

Can we please have brighter diff colors? I can barely see these. For a small change like punctuation, it is very difficult to find the change. I know that most Wikipedians have young eyes, but have some compassion for us old folks! -- Ssilvers (talk) 01:55, 25 April 2012 (UTC)

Actually, the new version can be easier to make out - see this diff and compare it replicated on a wiki which is currently MW 1.17, here. See the first paragraph of the diff on each - the new one highlights the differences rather than using red font, which can be easier to spot when the differences are in a single character, or punctuation (especially when a whitespace character is added or removed).
However, I'd also like a stronger color of highlighting. I usually have no trouble reading (I'm not quite an old fella yet) but brighter colors would make the diffs easier to make out.
Other than that the new diffs look good. Unfortunately they still sometimes struggle to compare equivalent paragraphs, when new paragraphs are added. That looks like a tricky problem - not sure what it would take to fix that. --Chriswaterguy talk 07:30, 25 April 2012 (UTC)
A new diff engine. They are available, but I think we use the one we do for efficiency purposes.
Incidentally, I'm sure the coloured border used to be wider - it must have been thinned for other reasons - which made the requisite paragraphs stand out more. Presumably the problem with increasing the vividness of the colours is the contrast with the black, though I can't say for sure. - Jarry1250 [Deliberation needed] 09:32, 25 April 2012 (UTC)

WikiProject report: Skeptics and Believers: WikiProject The X-Files (744 bytes · 💬)

Image

A couple corrections. The image labelled "Gillian Armstrong plays the character Dana Sculley in The X-Files" is not correct. First, the actress' name is Gillian Anderson. Second, the image isn't of her, but of someone cosplaying Scully. — The Hand That Feeds You:Bite 12:02, 24 April 2012 (UTC)

I have boldly corrected the caption.--ukexpat (talk) 13:13, 24 April 2012 (UTC)