User:Jorge Stolfi/DoW/Intro

From Wikipedia, the free encyclopedia

Wikipedia is dying: Why, and What can be done about it

Wikipedia is dying

Figure (two panels: linear scale and log scale): Wikipedia's growth rate N'(t) (new articles in each 4-week period). The dots are measured data, obtained by interpolating, resampling and differentiating the official article count. The solid line is a two-phase exponential model with a 1-year arbitrarily smoothed transition.

Many statistics, such as the monthly new-article rate (plotted above), indicate that the activity of Wikipedia editors underwent a radical change around 2006: from steady exponential growth (with most activity indicators generally doubling every year) to steady exponential decay (halving about every 4-5 years).
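
The two phases described above can be written as a rough formula. This is only a sketch: the symbols t_0 (the transition date, around 2006), A (the new-article rate at the transition), T_g (the doubling time, about 1 year) and T_d (the halving time, about 4-5 years) are my own labels for illustration, and the smoothed 1-year transition used in the plotted model is ignored.

```latex
N'(t) \approx
\begin{cases}
  A \cdot 2^{(t - t_0)/T_g},  & t < t_0 \quad (T_g \approx 1\ \text{year}) \\[4pt]
  A \cdot 2^{-(t - t_0)/T_d}, & t \ge t_0 \quad (T_d \approx 4\text{--}5\ \text{years})
\end{cases}
```

On a log scale, each branch is a straight line: rising with slope 1/T_g before t_0 and falling with slope 1/T_d after it.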

From all those plots, I draw the following conclusions:

  • The quantity N'(t) is proportional to the number E*(t) of active Wikipedia editors at time t.
  • Most new articles are due to "regular" editors (say, those who make dozens of edits per month, over many months).
  • Until 2005, the number of new editors joining the pool of "regular editors" doubled every year. After 2006, practically no new editors joined the pool.
  • Practically all regular editors who had joined before 2005 survived 2006, with no significant decrease in activity.
  • Regular editors generally drop out (or slow down their activity) with a half-life of 4-5 years.

Why?

Not just bad media

One explanation that has been offered for this abrupt transition is the changed public image of Wikipedia in the aftermath of the Seigenthaler biography incident, which exploded in December 2005. However, that explanation does not seem to account for the effects we see in the plots. A "media disaster", if it has any effect at all, should have caused a sudden drop in the plots, and then a gradual recovery (total or partial) towards "business as usual", as people forget the incident or regain a more balanced view of its subject. This expectation seems to be borne out by other past incidents, with a more tangible basis and much wider audiences, such as Intel's stock prices after the Pentium FDIV bug, New Orleans's tourism after the Katrina disaster, and investor confidence after the financial crisis of 2007-2009.

In the Wikipedia activity plots, we see no hint of a sudden drop at the time of the Seigenthaler incident (some indicators actually show a slight increase at that time), nor of a rebound in the ensuing months. The new article rate, in particular, showed no hint of the incident. That is all the more surprising if one considers that the bad publicity must have hurt much more deeply the regular editors and other Wikipedia fans than the general public, for whom Wikipedia is just one site among thousands.

It would seem that an episode of bad media exposure will have a permanent effect only when the subject cannot survive the initial transient, or when the media incident exposes something terribly and fundamentally wrong that was previously ignored by the public. Neither of these attributes applies to Wikipedia.

The permanent change in the activity plots can only be explained by a permanent change in Wikipedia itself; and, moreover, it must be a change that affected the rate at which new editors join the pool, but not the work of established regular editors.

The ban on anonymous new articles

Currently, the explanation that I find most likely, and reasonably in accord with the data, is the policy of restricting the creation of new articles to registered editors, thus excluding anonymous ("IP") users. The intention of this rule was to reduce the creation of "trash" articles, such as articles about their editors, or pranks like Seigenthaler's biography. This rule went into effect in December 2005, so its timing fits the plots quite nicely. It did not affect regular editors who were already established, but certainly put an additional hurdle on the path of would-be editors.

Indeed, I believe that the new rule had the (presumably unintended) effect of stopping the recruitment of new editors. Any reader can still edit an article without registering, but Wikipedia was not built by casual IP editors; it is largely the work of regular editors, who have learned the difficult art of writing passable articles. Such editors are not born "fully grown" into Wikipedia: they generally start as completely inexperienced editors, and take a month or two to learn the basics.

It seems reasonable to assume that a regular editor typically begins his Wikipedia "career" by doing some edits as an IP user. My conjecture is that he will only feel the need to register after he has created one or more articles, and found the experience gratifying. The motivations, I think, are partly vanity (getting credit for his edits, even if under cover of a pseudonym), and partly the "parental love" he feels towards the articles he created, which can only be satisfied by becoming a registered user, since only registered users can set up a watchlist. Thus, by requiring registration before article creation, Wikipedia has effectively (if paradoxically) removed the main reason that IP editors could have for registering. And registration, in turn, is effectively a prerequisite for becoming a regular editor.

According to this conjecture, the regular editor typically creates one or two articles as an IP user, followed by hundreds of articles, over the next 4-5 years, as a registered editor. This model predicts that perhaps 90% or more of new articles created each month are due to regular editors who registered in previous months. According to this model, therefore, the immediate effect of the rule against IP-creation on the new-article rate N'(t) would have been negligible. On the other hand, the number of new users joining the regular editor pool each month (E*+(t)) would immediately drop to zero, and from then on the size E*(t) of that pool would decay at its natural "death" rate.
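
The conjecture above can be turned into a toy simulation. This is a minimal sketch under assumptions of my own, not a fit to the measured data: the function name `simulate_pool`, the 60 months of growth, the 54-month half-life (the midpoint of the 4-5 year range), the starting value of 10 recruits per month and the constant `articles_per_editor` are all hypothetical choices made for illustration.

```python
def simulate_pool(months_before=60, months_after=120,
                  doubling_months=12.0, half_life_months=54.0,
                  initial_recruits=10.0, articles_per_editor=1.0):
    """Toy model of the regular-editor pool E*(t), stepped month by month.

    Before the cutoff, monthly recruitment E*+(t) doubles every
    `doubling_months`. From the cutoff on, recruitment is zero.
    Throughout, editors drop out with the given half-life.
    All default values are illustrative assumptions.
    """
    survive = 0.5 ** (1.0 / half_life_months)   # fraction of the pool kept each month
    growth = 2.0 ** (1.0 / doubling_months)     # monthly growth factor of recruitment
    pool = 0.0
    recruits = initial_recruits
    history = []
    for month in range(months_before + months_after):
        if month < months_before:
            joined = recruits
            recruits *= growth
        else:
            joined = 0.0                        # the rule stops all recruitment
        pool = pool * survive + joined
        # N'(t) is taken as proportional to E*(t)
        history.append((month, pool, articles_per_editor * pool))
    return history


if __name__ == "__main__":
    for month, pool, new_articles in simulate_pool()[::12]:
        print(f"month {month:3d}  E* = {pool:10.1f}  N' = {new_articles:10.1f}")
```

In this sketch the new-article rate shows no step at the cutoff month: it simply stops rising and starts to decay at the pool's natural drop-out rate, which is the qualitative behaviour the conjecture predicts.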

Hostility towards newbies

Another permanent change that may have been contributing to the dearth of new editors is the general hostility against typical novice editors that came to predominate in Wikipedia policy, as defined (implicitly or explicitly, consciously or unconsciously) by administrators and senior regular editors. This hostility is manifested in several ways, including an aggressive article deletion policy and the widespread insertion of disparaging editorial tags in newbie articles.

Complexity and bureaucratism

Another possible cause, which was noted by the Wikipedia Usability Initiative team, is the forbidding complexity of article wikisources. However, the growth in wikisource complexity does not appear to be related to the biography incident (except that the absence of new editors since 2006 has removed one factor that might have slowed that growth). Therefore, the damage that it is doing to Wikipedia has been increased gradually over the past 8 years, and does not explain the sudden change in the plot.

Nevertheless, the complexity and bureaucracy of Wikipedia as a whole have been growing out of control, and (speaking solely of my own experience) may be the major cause of burn-out among veteran editors. At the root of this disease are the human drive for power, an inherently flawed "consensus" procedure, and the lack of an explicit policy against bureaucratization.

Medicine too strong

The increasing hostility towards newbies seems to have been in large part a reaction to the Seigenthaler biography incident, like the ban on new articles by IP editors. The medicine may have stopped the disease, but seems to have been too strong: I firmly believe that it is now killing the patient.

I would compare those measures to the clearing out of all the undergrowth in a forest: the immediate effect on the trees would be nil or even positive, but from that point on the forest would steadily shrink, as trees that naturally died out would no longer be replaced. Indeed, from that moment on the forest, while seemingly alive and thriving, would be effectively dead.

What can we do about it?

I believe that the present downward trends can still be reversed — but only if Wikipedia can muster the courage to reverse some of the bad decisions and habits that became entrenched over the past four years. This revolution should include a return to the original emphasis on content over form, and to the bold original dream of Wikipedia becoming the repository for "the sum of all human knowledge". It requires changing the official view of what "quality" means for Wikipedia, what is the scope of its coverage, and the official attitude towards inexpert editors and their contributions. As part of that, we must drastically reduce its complexity: by radically pruning the policies and rules, and by making the mechanisms less visible and their human interface more self-evident.

More specifically, I would propose the following measures:

Change the philosophy

  • Change the definition of Wikipedia, from "Wikipedia is an encyclopedia" to
    Wikipedia is not an encyclopedia. It is not, and does not aim to be, similar to, better, or worse than classical encyclopedias, whether printed or digital. Wikipedia is a wikipedia, and only aims to be the best wikipedia that humanity can build.
  • Revert or relax the ban on article creation by anonymous users, so that they can create at least a couple of articles before registering.
  • Scrap the "notability" principle and replace it by the vague principle that "the level of detail in an article should be appropriate", leaving it to editors to figure out what that means, case by case, according to their own feelings. Thus, for example, short articles about small businesses or liberal professionals are acceptable as long as they abide by all other rules, including "what wikipedia is not".
  • Abandon the blanket requirement for explicit references in every statement of every article; keep it only as a desirable ideal that is unlikely to be reached. Retain it, of course, for individual statements that have been the target of bona-fide disputes, or which might have serious consequences if incorrect (such as statements about living persons). Accept "passed under many eyes" as a sufficient guarantee of accuracy in non-sensitive topics. Let editors decide whether references are needed or sufficient, on a case-by-case basis.

Change its public image

  • Admit to its users that, as a source of information, Wikipedia strives to be as useful and accurate as it can; but it is not reliable, and cannot ever be. Specifically, a message should be displayed at the top of every page, and to every reader who is not logged in, saying something like
    All information on this site was provided by unsupervised volunteers and should not be trusted without independent verification. Please help Wikipedia by improving its articles if you can.

Reduce complexity

  • Delete the entire contents of the "Wikipedia:" and "Wikipedia talk:" namespaces. In particular,
    • Delete all pages that describe Wikipedia policies, guidelines, standards, recommendations, etc.
    • Terminate the "Wikipedia 1.0" project.
    • Scrap the entire article assessment infrastructure and delete the quality seals from articles.
    • Terminate all WikiProjects and delete all their project-specific guidelines, standards, to-do lists, etc.
    • Delete all WikiProject banner templates from the article talk pages.
    • Scrap the "Articles for Deletion" mechanism.
    • Scrap the "Village pump" and other blogs, replace them by a microblog-like mechanism.
    • Scrap anything else that may exist in the "Wikipedia:" namespace but of whose very existence I am still unaware after four years of WP editing.
  • Create a single "help" page containing everything about Wikipedia policy and procedures that a user will need to read for his entire life as a Wikipedia editor, namely:
    • The pillars of wikipedia.
    • A cheat sheet for the markup syntax.
    • A brief primer of etiquette.
    • A link to the donations page (ok, I can live with that 8-)
    • A link to the microblog.
  • The help page should have a strict size limit, so that adding a new rule will require removing some other rule.
  • At the bottom of the edit window, replace "Content that violates any copyrights will be deleted. Encyclopedic content must be verifiable." by something like
    Wikipedia does not welcome malicious, frivolous or careless edits, and does not accept content that is copyrighted material or original unpublished research, or that has purely commercial, promotional, personal, or entertainment purposes.
  • Delete all editorial template tags from all articles, replacing each tag by a plain text line in the talk page.
  • Enhance the search engine to accept queries about article size, number of edits, number of distinct editors, date of first/last edit, etc. Then:
  • Delete all stub tags from all articles.
  • Turn each navbox template into an ordinary "List"-like article, with its entries reformatted as wikilinks. Then:
  • Delete all navboxes from all articles, replacing them by "See also" links to the navbox-derived list article.
  • Move every infobox from every article to DBpedia, leaving behind only its main images (if any) and an interlink to the DBpedia entry.
  • Replace any template tags that link to Wiktionary, Wikisource, Wikimedia, etc. by plain interlinks in the "See also" section.
  • Replace the automatic category pages by ordinary "List"-like articles with the same contents. Then:
  • Delete all category tags from articles and scrap the category machinery.

Change some procedures

Add the following basic rules of behavior for editors:

  • Uniformity is not a goal: Editors are encouraged to create content and improve the style and organization of individual articles, by abiding by the basic generic article style rules, using their best judgement, and arguing with other editors, all on an article-by-article basis. Editors are discouraged from trying to achieve uniformity of coverage, style or format across a group of articles, even if those articles could be considered part of a series, or deal with very similar topics.
  • One edit, one article: Mass editing (where a single user edits the visible contents of a large number of articles with relatively little effort, e.g. with the help of scripts) may be performed only with the express and specific authorization of the Wikimedia Foundation, and only when strictly necessary for technical or legal reasons, or when absolutely uncontroversial (such as fixing indisputable errors of spelling or markup). It is not allowed, in particular, to perform mass edits in order to implement a change that is a matter of editor choice, no matter how consensual.
  • No canned or replicated content: The expansion of a template should not generate any visible words, word-size icons or images that are not part of the template call itself, including the template's name, keywords, and arguments; it should not generate more than one visible copy of each of those items; and it should not occupy significantly more screen space than the source of the template call and its argument images would. Thus templates can be used only to apply markup or special fonts, to create wikilinks and other non-visible effects, or to generate numbers (such as reference superscripts), isolated letters, and simple symbols (including symbol-sized images).
  • Article grouping should be external to the articles: Any organization of articles into groups or threads should be done through the creation of overview or index articles (such as "List of" articles), not by inserting thread links or any special kind of group navigation devices in the articles themselves. Any links from the member articles to the indices or master articles should be implemented as plain wikilinks, either embedded in the text or in the "See also" section.
  • Lead by example, not by rule: Rather than trying to impose your editorial tastes on other editors, persuade them by following or creating good examples.