User:Tothwolf/rescued essays/AfD: formula for conflict

This essay is an exploration of the matter of article deletion and certain related problems. I'm placing it in my personal user space for review; I invite any editor who becomes aware of it to edit it and improve it, particularly since my experience with Wikipedia is limited and I may have aspects of current process wrong, but also because I believe that many minds, under the right conditions, are inherently smarter and wiser than one. At this point, I assert the right to accept or revert edits without limit, according to my own opinion, and hence I remain responsible as the author (though acknowledgement will be made for any retained contributions); however, at some point this document, or some descendant of it, may be presented for wider community consideration, and at that point my personal responsibility will cease. If any of this conflicts with the general license required for all Wikipedia contributions, the latter applies.

I've become aware that quite a bit of editor time is being spent in debate over deletion of articles, and it's not uncommon for the debate to become acrimonious. From my examination of these debates, common themes emerge for me.

It's apparent that there is no true community consensus on what articles are appropriate for Wikipedia and what articles are not. Two factions are apparent: deletionists and inclusionists. There is, of course, a large grey area in between, and no implication is made here that these are cabals, though there is some level of cooperation between those who could be considered part of one faction or the other, particularly in certain areas of interest. For example, there are editors who have taken on the task of removing "quackery" from the encyclopedia, and who commonly give "quackery" as a reason for deletion.

Yet "quackery," per se, is not among the accepted legitimate reasons for deletion, as quackery (even if "proven") can be notable. I've seen "proven quackery" given as an argument for deletion, yet, if proof exists in any way that the community can rely upon it, this, in fact, establishes verifiable source regarding the topic of the article. It's a delete argument that, if true, probably establishes that the article is on a notable subject, or reliable source would not exist as proof of quackery. Just show the proof!

If "quackery" is relevant to deletion, then there is a serious systemic bias introduced: an area of knowledge (the knowledge of quackery, fraud, deceit, lies, myth, etc.) which is excluded from the encyclopedia. If quackery is notable, this means it is reasonably possible that readers will come across it and will need information about it. Yet they will find none in this encyclopedia. Some necessary knowledge is missing from the compendium of necessary knowledge (one of the older definitions of encyclopedia).

The other serious problem with deletion due to "quackery" is that sound scientific analysis and advice has sometimes (not necessarily often) come to be identified, by recognized experts, as quackery. For example, the work of Dr. Robert C. Atkins, in recommending very low-carbohydrate, high-fat diets, was commonly called "quack medicine," even though it was actually based on what had once been common knowledge in the field, confirmed by substantial research. A public policy error was made, roughly thirty years ago, based on an expectation that certain recommendations would improve public health, even though that had never been proven. And then, since this was "important" -- allegedly, public health depended on it -- contrary research became difficult to publish, it being considered "irresponsible to publish this," and difficult to fund, since, allegedly, there was a consensus, so why waste money investigating what is already known? All this has been rather thoroughly documented in the book Good Calories, Bad Calories, by Gary Taubes, and the necessary revisions of the "common wisdom" are gradually penetrating the field as new research comes to light and the old research is rediscovered and re-examined.

In my opinion, such systemic bias, where an entire field is taken over by a particular point of view that is not actually proven, isn't common, though I'm sure there are those who would disagree with me. But my objection here is not to noting in the encyclopedia that experts in some field consider some theory or treatment "quackery" -- for that, if true, is verifiable knowledge: they really do consider it so -- but to *deleting* notable information based, in any way, on an argument that it is quackery. Tag it, don't bag it. None of this is an argument that minority positions, particularly the positions of very small minorities, should be given equal space with majority views.

Now, the criteria for deletion according to policy:

There are criteria for speedy deletion, but articles where there is reasonable controversy over deletion should not, in theory, cause conflict here, because the existence of reasonable controversy should prevent speedy deletion unless the reason for it is clear. Obvious exceptions would be an article that is inherently defamatory, a copyright violation, or another clear policy violation (as judged by any administrator).

PROPOSED DELETION (WP:PROD)

It's not necessary here to review the criteria; the process for ordinary editors is clear and, in my opinion, fully adequate at this time. Any user may place a PROD template in any article. This is a notice that the article is proposed for deletion. Any editor other than the article creator may remove this tag, thus averting this form of deletion. (The creator of the article may request retention with a HangOn tag, which is essentially a request that someone else consider retaining the article, or at least require further discussion before deletion.) If no objection to the deletion appears within five days, the article may be deleted by any administrator.
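
To make the sequence just described concrete, here is a minimal sketch, in Python, of the PROD lifecycle as this essay describes it. Every name in it (ProdState, PROD_WAIT_DAYS, may_delete, and so on) is my own hypothetical illustration, not MediaWiki code or any actual Wikipedia tool.

```python
# Illustrative model of the WP:PROD lifecycle as described above.
# All names here (ProdState, PROD_WAIT_DAYS, may_delete, ...) are
# hypothetical; this is not MediaWiki code or any actual Wikipedia tool.

from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

PROD_WAIT_DAYS = 5  # the waiting period described in this essay


@dataclass
class ProdState:
    creator: str
    prod_placed_on: Optional[date] = None  # when the PROD template was added
    tag_removed_by: Optional[str] = None   # objection by an editor other than the creator


def place_prod(state: ProdState, today: date) -> None:
    """Any editor may propose the article for deletion by placing the tag."""
    state.prod_placed_on = today


def remove_prod(state: ProdState, editor: str) -> bool:
    """Any editor other than the creator may remove the tag, averting this deletion."""
    if editor == state.creator:
        return False  # the creator can only request further discussion (HangOn)
    state.tag_removed_by = editor
    return True


def may_delete(state: ProdState, today: date) -> bool:
    """An administrator may delete only if no objection appeared within the window."""
    if state.prod_placed_on is None or state.tag_removed_by is not None:
        return False
    return today >= state.prod_placed_on + timedelta(days=PROD_WAIT_DAYS)


# Example: a single objection from any other editor stops the clock.
article = ProdState(creator="Alice")
place_prod(article, date(2008, 1, 1))
remove_prod(article, "Bob")
assert not may_delete(article, date(2008, 1, 10))
```

The point of the sketch is simply that one objection, from anyone other than the creator, is enough to avert this form of deletion.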

This is analogous to standard parliamentary procedure. Essentially, any editor may add an article to the encyclopedia, and Wikipedia assumes that editors are acting in good faith, so the default is that all articles are suitable for inclusion. However, if there is any objection to the article, the original creation of the article is treated as if it were a motion for inclusion. This motion, then, requires a second, or else it will fail. If there is a second, ordinarily, no motion may properly be refused without further process (including, if demanded, a vote). Large deliberative bodies deal with the problem of scale -- as the scale becomes large, it becomes easier to find a second for increasingly frivolous motions -- by reducing the set of those considering issues to some smaller body of members, such as a committee, or by using some form of representative democracy.

However, WP:PROD is really only a preliminary process, one which avoids discussion; it is like a motion for summary judgement in a court case: if no party objects, a judge would ordinarily grant it immediately, thus avoiding the inefficiency of unnecessary further process. Likewise, a judge will grant summary judgement when the result is plain from the evidence acknowledged by the defense (or asserted with evidence by the plaintiff but not controverted or denied, I think -- I'm not a lawyer).

WP:PROD serves to delete articles whose retention clearly cannot enjoy consensus; and a major safeguard exists: even if the article is deleted, policy is that it may be undeleted on proper request, and, in particular, upon any request that, if made prior to such deletion, would have averted it.

ARTICLES FOR DELETION (WP:AFD)

Here is where the problem lies. Any editor may propose a deletion discussion for any article. The process is a little more complex than a simple PROD tag, but still easy, and it can be automated, so some apparent deletionists may propose many articles for deletion, creating a large burden on editors who might wish to keep the articles, while attracting participation by others who dislike the topic, the content, or the presentation, or who sincerely believe that such articles inherently don't belong on Wikipedia.

Once the article has been proposed for deletion, the nominator is asked to state a reason or reasons for deletion. In theory, relatively few reasons for deletion are controversial (and if a deletion isn't potentially controversial, there is little need for AfD process). However, in actual practice, arguments for deletion are commonly advanced that should be illegitimate (and arguments for retention are likewise used that should be equally irrelevant). In particular, content issues, such as POV bias in the article or lack of proper sourcing, should not be grounds for deletion, but they are commonly asserted. The reason they should not be grounds for deletion is that such problems are generally better solved by removing POV bias or balancing it with additional content, by challenging or removing unsourced material, and by other ordinary edits, even to the extent of stubbing the article -- edits which leave history in place, so that no content of potential value is lost.

The common criteria for deletion that cause controversy include the following. (This list is not complete, because many criteria, in application, aren't controversial and judgement is ordinarily simple; further, the policy avoids stating a complete list of reasons for deletion, because of WP:IAR. An argument should not be rejected merely because it is not on a list of legitimate arguments; judgement should always be free of prior assumption.)

  • Advertising or other spam without relevant content (but not an article about an advertising related subject)
  • Content not suitable for an encyclopedia
  • Inappropriate user pages
  • Article information that cannot possibly be attributed to reliable sources
  • All attempts to find reliable sources in which article information can be verified have failed
  • Articles about newly coined words or terms (i.e. neologisms) not supported by reliable sources
  • Subject fails to meet the relevant notability guideline (WP:N, WP:BIO, WP:MUSIC, WP:CORP and so forth)

There is overlap among these criteria, and the criteria themselves use what might be considered undesirable in articles: weasel words -- though certainly some of them do, in fact, have more extensive explanations elsewhere. Content, to remain, must be "relevant," "suitable," "appropriate" -- in a word, "encyclopedic." "Encyclopedic" is obviously circular in itself, but it is more narrowly described in other places; yet there is apparently no clear consensus on what is "encyclopedic," or the deletion discussions would not rage on as they do. Asserting, in a deletion discussion, what is clearly contrary to policy should properly be, and sometimes is, considered disruption, particularly when repeated after warning.

To cut to the chase, the goal of Wikipedia, as stated by Jimmy Wales, is to present "the sum of all human knowledge." There are two aspects to this simple statement which are unresolved, and this is the core of the problem:

What is "knowledge"? What is the "sum" of it?

The study of NPOV helps us with "knowledge." It is possible to state any alleged fact in a way that makes the statement true beyond legitimate controversy. These statements of fact, accepted by consensus -- which excludes frivolous objection, again determined by general agreement -- are "knowledge." If we properly frame the information in *any* article, it can be made NPOV; the typical way of doing this is to refer to accepted fact (does anyone challenge that there was a Nazi leader named Adolf Hitler?) and to attribute to a source any other claim that is not accepted by consensus. The standard for NPOV that I was taught years ago (not under that acronym!) is that, when it has been attained, the adherents or proponents of all positions on an issue will say, "Yes, that is what we know or believe to be true." *Complete* NPOV is theoretically possible, but can become extraordinarily complex as the population of those involved increases. And here we come to the other term, which is where the rub lies:

The "sum" of knowledge is really short for "summary." The knowledge that is presented in an encyclopedia is not merely everything and the kitchen sink. It is only what is "notable."

And, in particular, "notable in an encyclopedia." What is "notable?"

I contend here that "notable" is not a characteristic of facts. Ultimately, all asserted facts are notable -- or, more accurately, we cannot tell in advance what facts may be important or relevant or necessary for some readers, and, if anyone has noted a fact, it is, at least on a literal level, "notable." One person's useless trivia may be another person's critical clue to an important understanding, or perhaps a matter of pleasurable interest. "Notable" is, rather, more properly the result of a process, part of the filtering of information by our collective intelligence. As the capacity for the depth of knowledge recorded and presented has expanded -- technically, we *could* include the kitchen sink, if anyone thinks it important enough to spend the time to write an article on it -- it does become possible to collect *all* knowledge, though this introduces new problems.

The major new problem is that of information overload. Any intelligent system, if provided with massive input, must have a means of filtering the input, or else the analytical process becomes overloaded and paralyzed. "Filtering" input, however, need not mean "censoring" or "deleting" it. Rather, it is about what information receives attention. It is about the categorization of knowledge, and the presentation of analysis or synthesis. This is what classical encyclopedias did on a smaller scale. They digested fields of knowledge and presented summary information about them, and any encyclopedia, to be useful, must do this in some way or other. However, the development of hypertext was long ago foreseen to provide for new possibilities, interconnected hierarchies of knowledge, where the reader may pursue details, perhaps reaching an end only when actually reaching the end of knowledge, what is known, up to and including what is immediately *becoming* known.

However, the deletionists, generally, do not see this as encyclopedic. Rather, they would limit the encyclopedia to a much more tightly defined notability, with notability defined differently in each field; and the boundaries are clearly quite artificial. These boundaries will inherently spark controversy, and, it can be predicted, those controversies will multiply, and the deletionists will come to think of themselves as holding back a huge tide of irrelevancy, as embattled defenders of the encyclopedia, valiantly deleting the trivia, original research, and quackery contributed by ever-increasing hordes of editors who imagine that what is important to them must be important for a "compendium of all knowledge."

We have here reduced "notable" to "what should be in the compendium." What is this? Again, it's a process, not a fact. In any deliberative body, there must be some means to determine which motions are heard and debated. The first and most basic requirement is that there must be some agreement from someone other than the original mover that the motion should be considered. If a motion cannot gain a second, Robert's Rules, except when a meeting is informal, prohibits discussing it. I've seen a lot of informal meetings where this has been disregarded, with resulting major ennui and burnout. The analogy here is that the topic of the article must be considered notable by more than one person, as a bare minimum. "Human knowledge" could mean *all* human knowledge, but an obvious trim is all *shared* knowledge, and not merely shared as a fact, but shared in an opinion of importance or notability. This standard would be quite sufficient if the body of editors were relatively small and collegial.

If a topic is sufficiently notable to justify holding a debate over its notability, however, it's notable, period, for "debate" *notices* the topic of the article. Once it is "noticed," it is obviously notable. We preserve AfD discussions which are sometimes substantially larger than the articles over which debate took place, and which clearly consumed resources more than sufficient, if applied, to remove POV and other content problems from the articles. I don't think it should be controversial to point out the oddity and inefficiency of this.

However, the scale of Wikipedia is not limited, and we can expect it to increase. As the scale increases, there will increasingly be people who will assert notability on the argument that *all* knowledge is notable, or, more narrowly, that all knowledge in their field, or in their own personal knowledge, is notable. Is the requirement for a "second" sufficient? I argue that it is, with a major caveat. The caveat is that the "second" be sincere and based on personal knowledge -- at least -- or, more stringently, on a Wikipedia-recognizable reliable source. Here, again, we get into process. How could we set up a process that would not be easily abused?

The AfD process suffers greatly from participation bias. I became aware of this whole issue through my involvement with voting system articles, when I noticed an editor who had registered an account used only for article deletion; many of the articles in question were clearly defective, and some were truly not notable even by the very liberal "Is there a second?" criterion, but others were known to me, personally, to be notable and even important to the field. Yet no knowledgeable editors showed up in the AfD to defend the articles (I was reviewing this after the fact, looking back). There would be the nominator, who would sometimes present a very brief argument -- it takes seconds to type "Original research." -- and then there would be a couple of editors, sometimes a very small number, who would vote "Delete," and the article was gone. If nobody who understands a field notices and participates in an AfD, the likelihood becomes substantial that the AfD will come up with a poor decision to delete, for what may be an obscure topic to someone not familiar with the field may be important to someone knowledgeable. Articles in the voting systems collection were often written, originally, by experts and advanced students who simply created a stub (or sometimes, in effect, a more involved essay that was not properly presented, but which was basically accurate though not yet sourced), and these experts often were not "Wikipedians": they did not log in frequently, and perhaps never checked, or did not even know about, watchlists -- I did not for some years -- so these articles were easy targets. And the pure deletionist I had found had a clear agenda: to remove from Wikipedia articles which could be used to criticise his political goals. He was creating a subtle POV bias through selective deletion.

There is inherent conflict in the current AfD process, and it can be predicted that it will increase; further, some of the possible steps which might be taken to avoid this conflict, if made in pursuit of a deletionist agenda, can also be predicted to increase public dissatisfaction with Wikipedia. I can say, already, that too many experts in my own field of interest hold a very negative opinion of Wikipedia. They have written articles which disappeared, they don't know how. They edit an article to insert expert opinion and the edit is changed by someone who clearly doesn't know the field. Some of this problem may be inevitable, but the *deletion* part is not. When an article is not objectionable (i.e., does not contain offensive or illegal content), there really isn't any strong reason to delete it *at all*; rather, it may instead be given some inferior position in a hierarchy of notability, or it might be stubbed or redirected, leaving it in the original editor's contribution history, so that he or she can then review what happened to it. I'm fairly sure that I created at least one article that was deleted, and I had no clue even where to begin looking for it until recently. If it was PROD or speedy-deleted, there might be no regularly accessible record, unless someone put a notice on my Talk page, and some nominators don't do that.

There is no inherent conflict between reliability and inclusion. As an example of a step which might establish reliability in the presence of high inclusion, all new articles could be automatically tagged as unconfirmed, not fact-checked. Essentially, new articles would be "proposed" articles, and they might stand that way even for years. Some of the Wikipedia competitors now appearing seem to be moving in that direction. With proper process, a "checked" article on Wikipedia should meet or exceed the standards of any peer-reviewed publication, and, if that were done, Wikipedia itself could become a reliable source -- quite possibly more reliable than present print publications, which often have review by a very limited and narrow set of specialists in a field, who miss errors in the article that would be discovered through a more general examination.
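
As a rough illustration of the "unconfirmed until checked" idea, here is a minimal sketch in Python, assuming a simple per-revision flag and a reader-side preference. None of the names below (ArticleRevision, mark_checked, visible_revisions) correspond to an existing MediaWiki feature; they are hypothetical.

```python
# A minimal sketch of the "proposed until fact-checked" idea above.  The
# two-state flag and the reader-side filter are hypothetical; nothing here
# corresponds to an existing MediaWiki feature.

from dataclasses import dataclass
from typing import List


@dataclass
class ArticleRevision:
    title: str
    text: str
    checked: bool = False  # every new article starts as unconfirmed


def mark_checked(revision: ArticleRevision) -> None:
    """Record that this revision has passed whatever fact-checking process applies."""
    revision.checked = True


def visible_revisions(checked_only: bool,
                      revisions: List[ArticleRevision]) -> List[ArticleRevision]:
    """Readers who opt for checked content see only checked revisions; others see all."""
    if checked_only:
        return [r for r in revisions if r.checked]
    return revisions
```

The design point is that nothing is deleted for being unchecked; the flag only controls what an opted-in reader is shown by default.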

How to actually accomplish this is beyond the scope of this essay, but I'll suggest that it is by following what nature did long ago in dealing with information overload: there is a network of connections between neurons, a fractal, self-assembled, which filters information efficiently, keeping "trivia" from consciousness while allowing all of it -- in pieces, not all at once -- to become available for consideration when needed. The hierarchy does not terminate in a single neuron or small oligarchy "in charge." Rather, it is a community, making decisions, when functioning in a healthy way, by consensus; fast decisions in the presence of "debate" are not made except in emergencies, or in play where error is without harm.

Without a clear *process* for determining standards for inclusion, increasing amounts of editor time will be consumed by debates over deletion; deletion is inherently controversial (except when the article creator consents). As I have written elsewhere, the normal assumption of good faith may be an error where deletion is concerned, for those who comment on deletion nominations frequently assume that the facts presented by the nominator are true unless controverted, when, clearly, there is at least one editor who, if we assume good faith, thought the topic was notable, and who proved its value -- to himself or herself -- by taking the time to write it. So there is already conflict in the very nomination, unless the standards are very clear. And they are not. As I noted above, most of them are circular; standards, where made more specific, vary from field to field with little apparent order; and exceptions abound, such as an article on a number above 200 that survived AfD because some editors put effort into showing notability. Other editors objected on the grounds that notability could similarly be shown for *every* number, which, while it may be an exaggeration given finite opportunity for research, is true. Why did "200" become the standard? It's totally arbitrary. (However, notable numbers don't necessarily need their own articles!)

There are at least two paths to take us around this mess. If there is to be a high bar for inclusion, there should be crystal-clear standards for notability and suitability; then, if someone disagrees, that person's effort can go into refining the standards and defining exceptions, a debate which will then be of wider application, rather than being required piecemeal, one article at a time, which is hugely inefficient. Further, actual article deletion, in many cases, may be inappropriate, particularly where notability or verifiability are marginal issues. No harm is done by redirection in those cases, or by stubbing to a bare minimum, since edit history is then preserved. Redirection or stubbing does not require debate; it is really only an ordinary edit, easily reversible, and if controversy appears, there is the whole dispute resolution process with its graduated layers of effort and increasing possibility of consequence for frivolously wasting the time of many editors.

The other path is inclusionist, with the information overload problems of high inclusion being addressed through better categorization and classification of information into hierarchies. The question for an article, then, becomes not whether it should be included, but whether or not it can be classified as reliable (which is properly a matter of consensus -- excepting top-level policy established and maintained by the trustees) and where it fits into a hierarchy of knowledge. Insisting on reliability of sources should not be exclusionist; it is always possible to make an article verifiable and NPOV, though this may require opening up new possibilities for sourcing (but always within the intention of verifiability). In particular, original research is *not* inherently unencyclopedic, properly presented. Unconfirmed original research is certainly problematic, but not when presented *as such*, and when it is categorized into an emergent class, not given prominence but only presented to those who are looking for exactly that level of detail. Sometimes a peer-reviewed article appears on a single piece of research that has not been confirmed; it is published because the reviewers considered the finding significant; the research itself, the results, are considered notable, but it will always be cited as unconfirmed, not as fact. Indeed, this publication is essential to the scientific process. Knowledge is a process, not a fact.

Increasingly, though, in certain fields, emergent research is not published in peer-reviewed journals, for that process takes far too long and other researchers need rapid access. We allow citation of original research on Wikipedia when it has been published through a peer-review process, and we also allow citation, sometimes, of original research self-published (in print or otherwise) by authors considered experts, as defined by prior peer-reviewed publication. What I've seen in my own field of interest, however, is that new and original research is presented through mailing lists, often by "amateurs" who are nevertheless experts, and recognised as such by the community, where these ideas are subjected to severe and thorough criticism by a large number of knowledgeable readers, including formal experts and others. What survives that process is at least as reliable as what is published in peer-reviewed journals; sometimes what gets published in the latter way would not survive one day on the mailing lists -- I've seen blatant errors. But the problem here is that determining what has "survived" is a matter of judgement, and, so far, the mailing lists I have in mind have no formal process for determining this; there is merely a general consensus which appears. This is why articles on topics which are well known and generally accepted in my field have sometimes had trouble avoiding deletion on the basis of lack of "reliable sources." Yet anyone familiar with the field knew the topic of the article, could define it from the name, and so on.

The inclusionist path, with categorization, is the only one I see being ultimately practical unless the encyclopedia is stuffed into a straitjacket. It is really an extension of NPOV. It *could* go to absolute inclusion, where only offensive or illegal material is removed, with other unsupported material sitting in a bottom layer below common attention, or to a somewhat higher standard of requiring a second; but the latter will ultimately require some filtering, as seconds *on principle* will increasingly appear -- that is, there will be those who will second *anything*, such as what so-and-so had for dinner on a certain date. You know, you never can tell.... We could start with "Menu choices of the rich and famous," and then proceed to "Menu choices of the poor and obscure." It's all knowledge, but certainly not all notable *in itself*.

My own preference: set up the processes for classification of knowledge, improve them, and make them more efficient. Set up a process for validating articles, so that reliability of sourcing becomes a non-issue *for deletion*. (Offensive or illegal articles, such as defamatory BLP, are a separate problem; anyone creating such an article should reasonably expect to see it deleted promptly.) And make the definition of notability clear. I would suggest a new layer in the hierarchy of editors: a class of editors from whom a second must be obtained for an article to be reasonably secure from deletion. This class can grow as needed to handle the volume; it would be a sub-community with developed standards for inclusion, and would ideally include editors knowledgeable in every field, and it could be made easy for new editors to find and ping an editor in their field. Where there is controversy, it may be possible to have many different POVs represented among these validators, but the standards for being and continuing to be a validator would include a knowledge of policy, a willingness to follow it, and an ability to work within, serve, and help form the community consensus.

The validation of such a privileged editor would be sufficient to avoid ordinary deletion, and the validating editors would be charged with developing and maintaining standards of notability, and with routinely enforcing their own community standards, as well as policy. Such editors might also be fact-checkers, and the general agreement of such editors with knowledge in the field would be sufficient to label an article version as validated. The user interface could be changed so that a user option would be "Only show validated articles," or, alternatively, there could be some clear difference in how unvalidated articles are shown. Cross-references to unvalidated articles would be limited or flagged, or held in an extended cross-reference subpage with only one reference from each article to "unvalidated articles." The identity of these editors would be known and available and categorized by field of interest, where appropriate, so that any ordinary editor could appeal to one of them, just as ordinary editors may now appeal to administrators and others for assistance. This would be a much broader group than the administrative group, without any powers that could be considered punitive, as the harm from inappropriate inclusion is generally small and easily reversible. (Spamming many validators with requests would be offensive; it should really only be appropriate to ask for validation from one validator at a time, and asking, serially, more than a certain number should be discouraged -- the total number might be as low as two or three.)
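
To make the proposal a bit more concrete, here is a minimal sketch, in Python, of how a validator registry and the "second" requirement might be recorded. Every name in it (Validator, find_validators, secure_from_ordinary_deletion, MAX_SERIAL_REQUESTS) is my own hypothetical illustration, not a proposal for specific MediaWiki changes.

```python
# An illustrative sketch of the validator idea above: a registry of validators
# tagged by field, a lookup so an ordinary editor can find one to ask, and a
# check for whether an article has the "second" that would shield it from
# ordinary deletion.  All names here are hypothetical.

from dataclasses import dataclass, field
from typing import List, Set

MAX_SERIAL_REQUESTS = 3  # asking more validators than this, serially, is discouraged


@dataclass
class Validator:
    name: str
    fields: Set[str]  # areas of knowledge, e.g. {"voting systems"}


@dataclass
class Article:
    title: str
    topic_field: str
    validated_by: List[str] = field(default_factory=list)
    requests_sent: int = 0


def find_validators(registry: List[Validator], topic_field: str) -> List[Validator]:
    """Let an ordinary editor locate validators listed under the article's field."""
    return [v for v in registry if topic_field in v.fields]


def request_validation(article: Article) -> bool:
    """Permit a request to one validator at a time, up to a small serial limit."""
    if article.requests_sent >= MAX_SERIAL_REQUESTS:
        return False
    article.requests_sent += 1
    return True


def secure_from_ordinary_deletion(article: Article) -> bool:
    """A single sincere validator 'second' is enough to avert ordinary deletion."""
    return len(article.validated_by) > 0
```

The design point is that a single recorded "second" from a knowledgeable validator replaces a per-article deletion debate, while the serial-request limit keeps validators from being spammed.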

It's a hierarchical solution, resembling in some aspects what natural selection discovered and implemented long ago. There are plenty of details which I have not described, such as how it would be determined that there is "general agreement of editors with knowledge in the field." To work, this must be efficient; it cannot be a matter of ordinary vote, which either suffers from participation bias or requires far too much attention from too many. In any case, such a system should not be implemented without broad discussion, and the details would come to light and be worked out during that process. I'm really only suggesting, here, the broad outlines, even though I do anticipate more of the details than I have described.