User:Filll/Controversial Article Project/Calculation

Ideas

episodic, quiescence, senescence, editor fatigue
centrality mean mode median others
T editor overturn temporal constant (rate of new editor appearance)
S sock puppets false positives, false negatives receiver operating curve
R article dependent temporal scale circadian and weekly rhythms, event driven secular changes modulated by hardware limitations, edit conflicts
relative expected values of I as a function of time
editor style, editor fatigue, editor orientation, gauging editor style

Metric proposals

Article metrics

There are many types of articles on Wikipedia. Some are short low quality stubs, and some are long but low quality pieces of text. Some are well referenced and some are not. Some are high quality featured articles, and others contain all kinds of informational and formatting infelicities. Some of these are easy to measure. For example, text length is easily computed, and the article rating can be used as a proxy for article quality. It is interesting to consider different methods for measuring how controversial a given article is as well.

There are also many types of editing on Wikipedia. For example, some users are involved in vandal patrol, others in deletion of inappropriate articles, others in salvaging articles inappropriately deleted, some in watching out for violations of the biographies of living persons policy, some in dealing with outside legal threats and complaints, some in undeleting deleted articles, some in administrative forums like the Arbitration Committee or a variety of noticeboards, and some in editing articles. There are even many types of editing that are associated with editing articles. At the crudest level, article editing can be divided into two classes: article building and article defense.

Article building involves edits that create article content. These edits can appear on both the article mainspace page and on the article talk page, but will appear on the mainspace page for the most part. Frequently some aspect of article construction will take place in sandboxes or offwiki, and it becomes tougher to capture the contribution of article building in sandboxes and in particular, offwiki article construction. For example, User:Silence drastically expanded the Objections to evolution article started by User:ScienceApologist with the incorporation of more than 70 kilobytes of material from his sandbox version in User:Silence/Evolution, which drew on other sandbox material produced mainly by User:Filll and User:Orangemarlin at Talk:Evolution/Objectionstoevolution.

Article defense includes things like vandalism patrol, and debating and discussing the article content on the talk page and other pages, such as user pages. The number of reverts to a mainspace page can provide information about antivandalism efforts as well as capturing edit warring, which is also associated with the "debate and discussion" activity on a page. The number of reverts to talk pages can also provide information about vandalism activity, particularly reverts of talk page blanking. Some reverts of small sections of talk page edits are indications of responses to tendentious editing.

A controversial article will tend to be characterized by a greater ratio of article defense activities to article building activities than a less controversial article. Let C, the controversy ratio, be defined as the ratio of article defense activity to article building activity. Articles that are locked for any appreciable amount of time will have correspondingly larger values of C, since less building activity is able to take place because of the editing restriction, while the defense, debate and discussion activity can continue unabated. Of course, if articles are locked for an extended period of time, interest can wane and the debate and discussion will probably eventually dissipate.

To roughly approximate the controversy ratio C, let

D= number of talk page edits per month

B= number of main page edits per month

and define

C0=D/B, which serves as a rough estimate of the controversy ratio C.

If Rv is the number of reverts to the mainspace page, then a better estimate of the controversy ratio C is

C1=(D+Rv)/(B-Rv)

An even more accurate estimate of C would be

C2=(D+Rv-S)/(B-Rv+S)

where S is the number of constructive suggestions for article improvements placed on the talk page. One way to estimate S might be to sample some talk pages of various kinds and estimate the ratios S/B and S/D, which initially can be assumed to be constant, but could be allowed to differ for different categories of article (for example, assuming a dependence of S on C).
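As a rough illustration, the three estimates above can be computed as follows. This is a minimal sketch: the function names are hypothetical, and the first call uses the monthly edit rates listed for Homeopathy elsewhere on this page (87.2 mainspace and 157.5 talk page edits per month). No values of Rv or S are given on this page, so the inputs to C1 and C2 are made-up numbers for illustration only.

```python
def c0(d, b):
    """Crude estimate: talk page edits divided by mainspace edits."""
    return d / b


def c1(d, b, rv):
    """Count mainspace reverts as defense rather than building."""
    return (d + rv) / (b - rv)


def c2(d, b, rv, s):
    """Also count constructive talk page suggestions as building."""
    return (d + rv - s) / (b - rv + s)


# Monthly rates for Homeopathy as listed on this page
print(round(c0(157.5, 87.2), 2))

# rv and s below are hypothetical values, not measured data
print(round(c1(157.5, 87.2, rv=20.0), 2))
print(round(c2(157.5, 87.2, rv=20.0, s=10.0), 2))
```

Note that C1 and C2 are undefined when the denominator is zero, for example when every mainspace edit in the period is a revert; a real implementation would need to decide how to treat that case.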

Obviously, considerably more sophisticated estimates of C are possible. And clearly, better methods for distinguishing controversial from noncontroversial articles are possible. However, a few examples demonstrate that C0 is a reasonable statistic for distinguishing controversial from less contentious articles. For example, consider the articles listed under "Articles to look at" and the associated values of C0.

Trying to build a controversial article is more difficult than trying to build a noncontroversial article because many of the edits are changed or reverted away from their original intent. These edits have to be repeated, or their exact form debated. Therefore, article building efforts on noncontroversial articles are expected to be more effective and productive than article building efforts on controversial articles. The ratio C can be used to get some indication of how productive an edit to build an article is. Define the efficiency E of a building edit as

Efficiency E=100%/(C+1)

Ideally, for a noncontroversial article, C=0 and E=100%. If C=1, then for every building edit, there is a defensive edit, and about half the activity on the article is expected to be productive. If C=1, the efficiency E is 50%. If C=2, the efficiency E is about 33.3% and if C=3, the efficiency is only 25%.
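The worked values above follow directly from the definition of E. A minimal sketch, with a hypothetical function name:

```python
def efficiency(c):
    """Efficiency of a building edit in percent, E = 100% / (C + 1)."""
    if c < 0:
        raise ValueError("the controversy ratio C must be non-negative")
    return 100.0 / (c + 1.0)


for c in (0, 1, 2, 3):
    print(c, round(efficiency(c), 1))
```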

Of course, the controversy ratio C is expected to be temporally dependent, and dependent on article quality as well. The article rating may be taken as a proxy for article quality. It is conjectured that the rate of administrative intervention on controversial articles is higher than on noncontroversial articles. Therefore, some measure of administrative intervention, including page locks, article content RfCs, content RfCs for the most active article participants, Arbcomm proceedings for article contributors and articles, mediation proceedings, Administrator's noticeboard threads and other noticeboard discussions should be correlated with the controversy ratio C.

Editor metrics

A given editor will make both building edits to an article, and defense edits to an article. Their defense edits are valuable as defense, but the value of their building edits is eroded by the amount of controversy associated with the article. A reasonable ansatz for an expression that captures this is to multiply the number of mainspace edits of an editor by the factor 1/(1+C). Reducing the number of mainspace edits by the factor 1/(1+C) attempts to capture the devaluation of mainspace edits in a controversial article. If the article is noncontroversial, C=0 and all mainspace edits are effective, so that productivity is high. As C increases, fewer of the mainspace edits are effective and productivity is lowered accordingly.

Editors put most of their time and energy into a few articles. There are of course articles that an editor adds one or two edits to, fixing a spelling mistake or a punctuation error, but these do not determine much about the orientation of the editor. The articles on which the editor spends most of his or her time and compiles the largest number of edits are more useful for determining editor orientation. A simple first step towards quantifying editor activity is to examine the average value of the controversy ratios of the articles a given editor has edited most often.

Editor article activity can take place both on the mainspace page and on the talk page, so in general the talk pages on which an editor is most active will differ from the mainspace pages on which that editor is most active. The average controversy ratios for greatest mainspace activity and greatest talk page activity can be useful information to show which articles the editor chooses to devote their resources to.

Knowing that an editor has put more effort into a given article does not capture the intensity of that effort. It is important to provide some measure of effort intensity to distinguish different editors with different styles. An expression that tries to describe the intensity of editor building activity, corrected for productivity, is

Ib=b*f/(1+C0)

where b is the monthly average mainspace edits to an article, and f is the fraction of mainspace edits due to the given editor. An expression that is directed towards capturing the intensity of editor defense activity on an article is

Id= d*f

where d is the monthly average talk page edits and f is, in this case, the fraction of talk page edits that the editor is responsible for.
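The two intensity expressions can be sketched as below. The function names are hypothetical. The inputs are taken from the raw data listed later on this page for User:Filll: Frère Jacques (19 mainspace edits per month, 403 of 699 mainspace edits, C0 of 0.26) and Talk:Black people (99.7 talk page edits per month, 635 of 6416 talk page edits).

```python
def building_intensity(b, f, c0):
    """Ib = b * f / (1 + C0): an editor's monthly mainspace edits, discounted for controversy."""
    return b * f / (1.0 + c0)


def defense_intensity(d, f):
    """Id = d * f: an editor's share of the monthly talk page edits."""
    return d * f


ib = building_intensity(b=19.0, f=403 / 699, c0=0.26)
id_ = defense_intensity(d=99.7, f=635 / 6416)
print(round(ib, 2), round(id_, 2))
```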

A vector valued function can be created for both these, with separate components for each article the editor contributes to. A potentially useful summary statistic is produced by averaging the values of Ib and Id for the editor over the articles they have devoted the most effort to. For example, some arbitrary limits for the number of mainspace pages and talk pages to consider can be imposed and a sum of Ib and Id over the most heavily edited pages for an editor formed, labelled IIb and IId respectively (for integrated intensity).

Editor value

A prime question that arises is what is the relative value of users of different characteristics, such as different levels of experience. Using these sorts of metrics, one can begin to start thinking about measures of relative value to the task of building the encyclopedia.

There are currently 6.9 million registered accounts on the English version of Wikipedia [citation needed]. The vast majority of these are idle, and only a few accounts are active. The chance that a given new editor will remain and be productive is small, but it can be estimated.

Consider a sample of N0 users at time t0. As time goes on, fewer and fewer of these users will remain on Wikipedia to contribute to either article building or article defense. Let N(t) be the editors from the original group that are active at time t. Activity can be defined in some arbitrary way, such as at least one edit in the preceding 3 months. N is a monotonically decreasing function of t, and N(t0)=N0. The probability that any given member of the original N0 users will remain at time t is P(t)=N(t)/N0. The probability that a user at time t1 will remain at time t2 is P(t1,t2)=N(t2)/N(t1). P(t)=P(t0,t)=N(t)/N(t0)=N(t)/N0.
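The retention probabilities defined above can be read straight off a table of cohort counts. The sketch below assumes a hypothetical list `counts` holding N(t) for t = 0, 1, 2, ...; the numbers are invented for illustration and are not Wikipedia data.

```python
def retention(counts, t1, t2):
    """P(t1, t2) = N(t2) / N(t1): chance that an editor active at t1 is still active at t2."""
    if t2 < t1:
        raise ValueError("t2 must not precede t1")
    return counts[t2] / counts[t1]


# Hypothetical cohort sizes N(0), N(1), N(2), N(3)
counts = [1000, 200, 120, 90]

print(retention(counts, 0, 1))  # P(1) = N(1) / N0
print(retention(counts, 1, 3))  # P(1, 3) = N(3) / N(1)
```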

The average article building activity of the remaining N(t) users at a time t can be estimated from an arithmetic average or other measure of centrality of IIb for the group. The average article defense activity of those remaining N(t) users can be estimated in a similar way from the values of IId for each editor in the category. Therefore, one can define functions IIb(t) and IId(t) that describe the evolution of estimated IIb with time, for editors that have been active for time t.

The expected value of IIb for a given editor in the initial group of N0 editors is

<IIb> = P(1)IIb(1) + P(2)IIb(2)+ P(3)IIb(3)+ ...

Similarly, the expected value of IId for a given editor in the initial group of N0 editors is

<IId> = P(1)IId(1) + P(2)IId(2) + P(3)IId(3)+ ...
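A minimal sketch of these expected-value sums, assuming the series is truncated at the last period for which data exist. Both input lists are hypothetical: `counts` holds the cohort sizes N(t) and `intensity` holds the group-average IIb(t) or IId(t) for each period.

```python
def expected_intensity(counts, intensity):
    """Sum of P(t) * II(t) over t >= 1, with P(t) = N(t) / N0."""
    n0 = counts[0]
    return sum(counts[t] / n0 * intensity[t] for t in range(1, len(counts)))


counts = [1000, 200, 120, 90]   # hypothetical N(t)
iib = [0.0, 2.0, 5.0, 8.0]      # hypothetical group-average IIb(t)

print(round(expected_intensity(counts, iib), 2))
```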

The expected value of IIb for a given editor in the remaining group of N(t) editors uses the conditional probability P(t,t+k) that an editor active at time t is still active at time t+k:

<IIb>_t = P(t,t+1)IIb(t+1) + P(t,t+2)IIb(t+2) + P(t,t+3)IIb(t+3) + ...

and the expected value of IId for an editor remaining active at time t is

<IId>_t = P(t,t+1)IId(t+1) + P(t,t+2)IId(t+2) + P(t,t+3)IId(t+3) + ...

When most attrition happens early, an editor remaining at time t is more valuable for building articles than one of the initial editors by a factor of

<IIb>_t / <IIb>

and more valuable for defending articles by a factor of

<IId>_t / <IId>

However, a better determination of relative editor value would recognize that a significant fraction of the N0 editors at time t0 will be involved in significant disruption of the project. For example, some of those N0 will only be involved in mainspace vandalism, which has to be reverted or otherwise corrected. This directly increases the value of Rv, which adds to the defense costs and reduces the amount of article building effort. Some of these N0 will only be involved in contentious talkspace discussions, consuming more defense resources and reducing the amount of article building effort. Some will be involved in both sorts of destructive activities. Others will generate a mix of productive edits and imposed defense costs.

There is also a multiplier effect involved with editors who engage in contentious talk page discussions, tendentiously advocating positions against the principles of Wikipedia. In some cases, a single experienced editor can handle the queries. In other cases, a persistent editor can consume the efforts of many experienced editors, particularly if administrative sanctions requiring testimony are sought, as in some Administrator's Noticeboard cases, conduct RfCs and Arbcomm proceedings.

These are the less obvious costs imposed on the project by permitting disruptive users to continue to edit unfettered. Therefore, it is useful to try to quantify this effect so that it can be recognized and measured and so that reasonable decisions can be taken to increase total productivity.

Other measures of intensity

For a talk page that receives a huge number of visits and edits, an editor that makes a massive contribution but then experiences editor fatigue and leaves the talk page might have his efforts diluted by the method suggested above. For example, if a talk page has had 10,000 edits over 5 years, and a given editor has made 500 edits over a 2 or 3 month period, which is enough to give them more talk page edits than any other editor, then the above method does not capture the intensity of the experience. Few editors will push past 500 edits in a prolonged contentious and repetitious discussion; typically they experience editor fatigue and move on to other topics, or leave Wikipedia entirely. Other methods might include using the rank of the editor in an edit contribution-ordered list of talk page editors, or a ratio of his number of edits to the number of edits of the editor with the most edits, or for an extremely nonstationary situation, just comparing his edits to the total number of edits over the time he was active.

A metric that expresses this might include the rank of a given editor's talk page contributions. If one editor was ranked first or second on 5 different controversial talk pages and another was ranked 200th on 5 different controversial talk pages, or first or second on 5 different noncontroversial article talk pages, different styles of article editing are suggested. Metrics can be designed which readily distinguish between these styles. For example,

average talk page rank is interesting, but does not make allowances for the differences between editing noncontroversial and controversial talk pages.
total number of talk page editors divided by talk page rank gives a measure corrected for controversy, since controversial pages have more editors, typically
controversy ratio C divided by talk page rank is another measure of talk page activity that makes allowances for the differences between controversial and noncontroversial articles.
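The two rank-based measures in the list above can be sketched as follows. The function name and dictionary keys are hypothetical. The first call uses figures listed on this page for User:Filll on Talk:Homeopathy (rank 4 among 530 talk page editors, C0 of 1.81); the second is an invented low-ranked editor on the same page, for contrast.

```python
def rank_metrics(rank, n_editors, c):
    """Rank-based talk page activity measures that make allowance for controversy."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    return {
        "editors_per_rank": n_editors / rank,
        "controversy_per_rank": c / rank,
    }


print(rank_metrics(rank=4, n_editors=530, c=1.81))
print(rank_metrics(rank=200, n_editors=530, c=1.81))  # hypothetical editor
```

Both measures grow as the editor moves up the ranking and as the page becomes more heavily contested, so a first-ranked editor on a noncontroversial page and a first-ranked editor on a controversial page are no longer scored alike.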

More metrics

There are several possible metrics for understanding editor behavior.

(1) Total controversy index of most edited mainspace pages and talk pages.

(2) Building intensity metric (bim) Ib and defense intensity metric (dim) Id for the most-edited mainspace pages and talk pages. Can also have controversy-weighted versions of the same metrics.

(3) Effectiveness

As a heuristic, suppose that the probability of any given edit being judged as suitable is

p= 1/(1+C)= Nm/(Nt+Nm)

where Nm is the number of mainspace page edits and Nt is the number of talk page edits

If the controversy ratio C=Nt/Nm=0, then p=1

The probability of a given edit being judged unsuitable is therefore q=1-p

Define effectiveness E(n) as the same as Q(n), the probability of surviving n edits.

Q(n)=1-pⁿ
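The heuristic probabilities p and q can be checked numerically. The sketch below covers only p and q (not Q(n)), uses a hypothetical function name, and takes the total mainspace and talk page edit counts listed on this page for Homeopathy as sample input. It also confirms that Nm/(Nt+Nm) agrees with 1/(1+C) when C=Nt/Nm.

```python
def suitability(n_main, n_talk):
    """Return (p, q) with p = Nm / (Nt + Nm) and q = 1 - p."""
    total = n_main + n_talk
    if total == 0:
        raise ValueError("no edits recorded")
    p = n_main / total
    return p, 1.0 - p


p, q = suitability(n_main=6548, n_talk=11823)
c = 11823 / 6548
print(round(p, 3), round(q, 3), round(1 / (1 + c), 3))
```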

(4) Productivity P

P=(Number of bytes attributed to editor remaining in document)/(Total number of bytes contributed by editor to document)

(5) Building and defense leadership; a measure showing what fraction of article mainspace pages and talk pages were created by a given editor.

Editor examples

This sort of analysis can be used to gain insight into an editor's editing style and activity on Wikipedia. For example, User:Filll and User:DGG started on Wikipedia at about the same time, in the fall of 2006 (October 2nd and September 6th, respectively). Both have around 30,000 edits in April of 2008 (Filll has 28945 edits as of this writing on April 21, 2008 and DGG has 33291 edits). However, User:DGG edits many more pages and contributes many fewer edits per page than User:Filll does (18829 pages for DGG and 2943 pages for Filll, while DGG contributes 1.77 edits per page and Filll makes 9.84 edits per page). What else can we determine about their editing styles using this analysis?

User:Filll has the most edits on the mainspace pages of Frère Jacques, Introduction to evolution, Translations of Frère Jacques, Level of support for evolution, and Hot spring (ordered by number of edits), which have estimated controversy ratios C0 of 0.26, 0.76, 0.18, 1.39 and 0.06 respectively. The sum of these 5 estimated controversy ratios is 2.65. Similarly, User:Filll has the most edits on the talk pages of Black people, Intelligent design, Evolution, Homeopathy, and Expelled: No Intelligence Allowed, again ordered by decreasing number of edits. The estimated controversy ratios of these articles are 1.02, 1.60, 1.19, 1.81 and 1.72 respectively. The sum of these estimated controversy ratios is 7.34.

The suggested measures of building intensity, corrected for productivity, for the five articles to which User:Filll has contributed the most mainspace edits are 8.69, 12.67, 15.67, 7.94, and 3.51 corrected edits per month, given in order of decreasing number of edits. The sum of these building intensity estimates is 48.48. The suggested measures of defense intensity for the five articles to which User:Filll has contributed the most talk space edits are 9.87, 8.36, 4.99, 6.97 and 66.99 scaled edits per month, given in order of decreasing number of edits. The sum of these defense intensity estimates is 97.18.

User:DGG has the most mainspace edits in the articles Open access, E-book, Printing press, Johannes Gutenberg, Movable type, and Phage therapy, in order of number of edits. The estimated controversy ratios C0 of these articles are 0.10, 0.05, 0.11, 0.15, 0.28 and 0.15 respectively. The sum of the controversy ratios for the first five of these articles is 0.69. User:DGG has the most talk page edits in the articles Johannes Gutenberg, Printing press, Joseph Schlessinger, Movable type, and E-book, in decreasing order of number of page edits. The estimated controversy ratios C0 for these articles are 0.15, 0.11, 0.42, 0.28 and 0.05, respectively. The sum of these controversy ratios for DGG's most edited talk pages is 1.01. Clearly, the articles that DGG edits most often and the talk pages that DGG edits most often are far less controversial by this measure than the articles that Filll edits.

The suggested measures of building intensity, corrected for productivity, for the six articles to which User:DGG has contributed the most mainspace edits are 1.13, 0.94, 0.61, 0.53, 0.32 and 0.52 corrected edits per month, given in order of decreasing number of edits. The sum of the first five building intensity estimates is 3.53. The suggested measures of defense intensity for the five articles to which User:DGG has contributed the most talk space edits are 0.76, 0.34, 3.92, 1.10 and 0.88 scaled edits per month, given in order of decreasing number of edits. The sum of these defense intensity estimates is 7.00. Again, the article building intensity and the article defense intensity for the articles and talk pages that DGG edits most often are far less than the corresponding summary statistics for User:Filll.

Articles to look at

Type C

Homeopathy 6548, 1246, 87.2 (11823, 530, 157.5)
- 1.81
Depleted Uranium 2080, 643, 27.5 (1483, 216, 22.9)
- 0.71

9/11 conspiracy theories 9009, 2392, 210.8 (5501, 624, 132.1)
- 0.61

Evolution 11116, 3128, 147.1 (13270, 1265, 177.2)
- 1.19

Intelligent Design 10693, 2542, 141.1 (17076, 746, 228.0)
- 1.60

Black people 6312, 1526, 84.3 (6416, 473, 99.7)
- 1.02

What the Bleep Do We Know!? 1953, 561, 43.1 (3987, 219, 87.5)
- 2.04

Electronic voice phenomenon 2949, 442, 63.8 (4207, 137, 91.6)
- 1.43

Type NC

Cardiology 369 edits, 200 editors, 4.5/month, (21, 12, 0.4)
- 0.06

Sunflower 1150, 647, 16.2 (69,47, 1.1)
- 0.06

Plastic 1990, 1077, 26.6 (98, 61, 1.9)
- 0.05

Saint Pierre and Miquelon 954, 416, 12.8 (140, 44, 2.6)
- 0.15

Isle of Wight 1642, 598, 21.9 (236, 65, 4.3)
- 0.14

Cattle 4661, 2282, 62.2 (526, 284, 7.7)
- 0.11
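The C0 value under each article can be reproduced from the raw counts. The sketch below assumes, following the labelled Cardiology entry, that the first number on each line is the total number of mainspace edits and the first number inside the parentheses is the total number of talk page edits. Only a subset of the articles is included, and the names `ARTICLES` and `c0_table` are hypothetical.

```python
# (mainspace edits, talk page edits) copied from the lists above
ARTICLES = {
    "Homeopathy": (6548, 11823),
    "Evolution": (11116, 13270),
    "Intelligent Design": (10693, 17076),
    "Cardiology": (369, 21),
    "Cattle": (4661, 526),
}


def c0_table(articles):
    """Map each article name to C0 = talk edits / mainspace edits, rounded to 2 places."""
    return {name: round(talk / main, 2) for name, (main, talk) in articles.items()}


table = c0_table(ARTICLES)
for name, value in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {value:.2f}")
```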

Data to collect

Total edits
Total months
Highest 6 months edits, 12 months

Users to study

Type C

Durova

- 697 Joan of Arc 67.6, 2050, 697/5117 (20.5, 294, 400/1406)
- 196 Cultural depictions of Joan of Arc 19.3, 184, 196/ 559 (1.8, 25, 14/50)
- 171 List of brain tumor patients 17.0, 151, 171/471 (2.6, 28, 26/73)
- 107 Do not give Hitler posthumous victories.
- 63 Navajo rug 8.8, 30, 63/124 (2.4, 14, 8/24)

- 400 Joan of Arc 67.6, 2050, 697/5117 (20.5, 294, 400/1406)

- Matt Sanchez 94.6, 165, 14/1298 (277.8, 163, 91/3322)
- 55 Muhammad/images/Archive 1 deleted
- Palestinian costumes 25.1, 41, 37/430 (18.9, 16, 46/218)

- Testosterone poisoning 4.4, 78, 33/180 (3.2, 18, 39/89)

- 28 Sweden Democrats 10.1, 167, 0/701 (18.8, 33, 28/588)

Raul654
JzG
MONGO

- 448 Retreat of glaciers since 1850 44.2, 208, 448/1221 (20.8, 60, 204/565)
- 435 Yellowstone National Park 28.7, 830, 435/2176 (3.9, 74, 40/193)
- 407 Shoshone National Forest 19.8, 131, 407/652 (3.3, 29, 37/93)
- 369 George W. Bush 519.9, 12782, 369/40329 (296.9, 3713, 766/18311)
- 351 Yellowstone fires of 1988 26.4, 41, 351/437(1.0, 7, 7/16)

- 766 George W. Bush 519.9, 12782, 369/40329 (296.9, 3713, 766/18311)
- 608 September 11, 2001 attacks 166.4, 4115, 338/12600 (262.1, 1590, 608/13609)
- 464 Collapse of the World Trade Center 56.5, 670, 276/2729 (72.3, 237, 464/3273)
- 204 Retreat of glaciers since 1850 44.2, 208, 448/1221 (20.8, 60, 204/565)

- 165 Allegations of state terrorism by the United States 166.7, 407, 38/3501 (579.2, 379, 165/12173)

Orangemarlin
Filll
- - Top 5 Mainspace
- Frère Jacques 19, 1, 403/699 (6.3, 1, 96/181)
- Introduction to evolution 135.6, 2, 372/2262 (104.1, 2, 274/1709)
- Translations of Frère Jacques 29.9, 1 355/574 (6, 1, 56/104)
- Level of support for evolution 47.7, 1, 296/735 (65.2, 1, 298/1025)
- Hot spring 9.2, 1, 259/640 (0.9, 1, 17/39)

- - Top 5 Talk page
- Black people 84.2, 28, 36/6308 (99.7, 1, 635/6416)
- Intelligent design 141.1, 30, 45/10693 (227.9, 5, 626/17073 )
- Evolution 147.1, 35, 37/11116 (177.2, 3, 565/13270)
- Homeopathy 87.2, 4, 219/6548 ( 157.5, 4, 523/11823 )
- Expelled: No Intelligence Allowed 239.1, 1, 224/1885 (468.8, 1, 462/3233)

Dave souza
Hrafn
Silence
User:KillerChihuahua
User:FeloniousMonk

Type NC

DGG
- Open access, 14 per month, ranked number 1, 66/747 (2 per month, ranked number 1, 19/71)
- E-book, 19.3 per month, ranked number 1, 61/1199 (2.5 per month, ranked number 1, 19/54)
- Printing press 21.7 per month, ranked number 2, 51/1627 (2.3 per month, ranked number 1, 27/184)
- Johannes Gutenberg 27.6 per month, ranked number 3, 46/2063 (4.2 ranked 2, 57/316)
- Movable type 5.3 per month, ranked number 3, 30/393 (6.4 ranked 4, 19/111)
- Phage therapy 6.2 per month, ranked number 2, 28/292 (1.8 per month, ranked number 1, 13/44)

- 42 Johannes Gutenberg 27.6 per month, ranked number 3, 46/2063 (4.2 ranked 2, 57/316)
- 27 Printing press 21.7 per month, ranked number 2, 51/1627 (2.3 per month, ranked number 1, 27/184)
- 23 Joseph Schlessinger 89.2, 4, 20/414 (29.5, 4, 23/173)
- Movable type 5.3 per month, ranked number 3, 30/393 (6.4 ranked 4, 19/111)

- E-book 19.3 per month, ranked number 1, 61/1199 (2.5 per month, ranked number 1, 19/54)

GTBacchus

- Cocaine 82, 3347, 115/6140 (10.6, 322, 6/609)
- The Beatles 176.6, 5128, 91/13229 (41.5, 524, 10/3107)
- Hipster (1940s subculture) 24.0, 513, 86/1144 (2.6, 9, 0/19)
- Lysergic acid diethylamide 61.8, 2399, 63/4627 (11.0, 279, 63/822)
- Christmas controversy 54.3, 527, 58/1575 (23.8, 170, 58/668)

- 143 Abortion 120.8, 2482, 143/9066 (104.7, 626, 143/7844)
- 63 Criticism of Wikipedia 83.5, 1388, 6/3389 (40.8, 430, 63/1654)
- 52 Abortion/First paragraph deleted.
- Christmas controversy 54.3, 527, 58/1575 (23.8, 170, 46/668)

- Iraq War 149.9, 2241, 10/9287 (151.6, 817, 40/5606)
- Pro-choice 20.1, 541, 23/1280 (11.0, 103, 33/477)

Tim Vickers

LaraLove
WillowW
- 682 Encyclopædia Britannica
- 511 Equipartition theorem
- 417 Action potential
- 322 Photon
- 320 X-ray crystallography

- 135 Photon
- 69 Introduction to general relativity
- 64 Harold Pinter
- 62 Action potential
- 56 Encyclopædia Britannica

Type S

Whig
EdChampion
- 36 Edmund the Martyr
- 8 Pope Gregory I
- 4 Augustine of Canterbury
- 3 Cuthbert of Lindisfarne
- 3 Michael (archangel)

- 97 Edmund the Martyr

Restepc
SargonXii

Data to collect

Most controversial articles edited
rank out of editors on these controversial articles

Single purpose accounts

A common phenomenon observed on Wikipedia is the existence of "Single purpose accounts". Single purpose accounts that edit only one or two kinds of article on one or two subjects (which can be determined by estimating graph theoretic distances in the category graph; [1]), can be productive if they are involved mainly in building activities. Single purpose accounts that are involved significantly in "debate and discussion", including reverting activity, on controversial articles probably are more disruptive than productive. Using these metrics it should be possible to distinguish "Single purpose accounts" from other accounts automatically and readily, and to distinguish productive from disruptive single purpose accounts.

Temporal dependence

Clearly, both article editing and editor activity exhibit episodic behavior and other temporal dependence. For example, there are obvious and expected circadian and hebdomadal patterns as well as seasonal fluctuations in editing. Holiday periods and early morning hours for the largest population of contributors are likely to exhibit different editing activity than other times. Some articles have distinct episodic editing patterns, like Translations of Frère Jacques, which experienced a burst of editing activity in its second month, and has exhibited relative quiescence since then for well over a year. Other articles, like Expelled: No Intelligence Allowed experience a secular increase in editing activity associated with some real world event, like the release of the associated film. Still other articles like Introduction to evolution experience sharp increases in editing activity associated with article creation and the efforts associated with ratings improvement (the two sharp jumps in editing in September of 2007 and December 2007/January 2008 were associated with a GA and an FA application, respectively). Of course, locking an article can also create strong temporal dependence in article editing data.

In the case of strongly episodic data, other measures of centrality such as the median and mode are probably more appropriate than the arithmetic mean. Depending on the distribution involved, even more exotic measures of centrality such as the harmonic and geometric means might be appropriate, as might nonparametric resampling techniques such as the jackknife and bootstrap.
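To see how much the choice of centrality measure matters for bursty data, the sketch below compares several of them on an invented series of monthly edit counts with one early burst followed by quiescence. The numbers are hypothetical; the functions come from Python's standard `statistics` module.

```python
from statistics import geometric_mean, harmonic_mean, mean, median

# Hypothetical monthly edit counts: one burst, then relative quiescence
monthly_edits = [400, 12, 8, 10, 9, 11, 7, 10]

summary = {
    "mean": mean(monthly_edits),
    "median": median(monthly_edits),
    "geometric_mean": geometric_mean(monthly_edits),
    "harmonic_mean": harmonic_mean(monthly_edits),
}

for name, value in summary.items():
    print(f"{name:15s} {value:8.2f}")
```

The arithmetic mean is dominated by the single burst month, while the median reflects the typical quiescent month.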

More fine-grained descriptions of editing activity can be useful for investigating other issues. For example, time series of edits with a daily sampling period, or even more frequent, would be valuable to see the effects on the article summary statistics such as the controversy ratio C, or building and defense rates b and d. These fine-grained statistics can be created using a moving average or other low pass filtered version of the raw data. Removal of harmonics to prewhiten the data can reduce spurious correlation bias.
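One simple low pass filter of the kind mentioned above is a trailing moving average. The sketch below smooths invented daily edit counts with a 7-day window (one way to suppress a weekly rhythm) and then forms a smoothed daily estimate of C as the ratio of the two smoothed series. All data and names are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average over full windows only (output is shorter than input)."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    smoothed = [running / window]
    for i in range(window, len(series)):
        # Slide the window: add the newest value, drop the oldest
        running += series[i] - series[i - window]
        smoothed.append(running / window)
    return smoothed


# Hypothetical daily edit counts for one article over ten days
daily_main = [4, 6, 5, 9, 12, 7, 3, 5, 8, 6]
daily_talk = [2, 3, 10, 25, 30, 12, 6, 4, 5, 3]

b_smooth = moving_average(daily_main, window=7)
d_smooth = moving_average(daily_talk, window=7)
c_series = [d / b for d, b in zip(d_smooth, b_smooth)]
print([round(c, 2) for c in c_series])
```

A real series would need a rule for days on which the smoothed mainspace rate is zero, since the ratio is undefined there.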

It has been claimed that some editors are sufficiently skilled at conflict resolution and consensus building that their talk page contributions are far more efficient than those of typical users. For example, anecdotal evidence suggests that User:Silence and User:KillerChihuahua are more efficient than many others in discussion of policy on article pages and dissuading editors from engaging in disruptive behaviors. Some assert that certain behavior changes enable them to quell disputes far more rapidly and efficiently and improve editing productivity. Hopefully mediators would also exhibit this purported skill. It could be conjectured that various administrative actions such as Wikipedia:Article probation or sanctioning of disruptive editors or achievement of GA or FA status can all affect the controversy ratio C. These can all be investigated using more frequently sampled editing data.

It is to be expected that the removal of disruptive editors, or a short term contribution by an editor with special consensus building skills, might decrease the controversy ratio C, temporally correlated with event onset, and that C might then experience a relaxation or reversion to the mean with a characteristic time constant. Hypothetically, an "editor overturn time constant" might depend in some simple way on the rate at which new editors appear on a page. If there are new editors that appear every day or two, something done to calm the waters a few days previously might be already forgotten and archived, and the effect of the intervention might not last long.
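One way to make the idea of a characteristic time constant concrete is to assume, purely for illustration, that C relaxes exponentially back to its long-run level after an intervention. The exponential form, the function name, and all parameter values below are assumptions, not something established on this page.

```python
import math


def relaxed_c(t, c_baseline, c_after, tau):
    """C at time t after an intervention, assuming exponential relaxation to baseline.

    c_baseline: long-run controversy ratio the article reverts to
    c_after:    controversy ratio immediately after the intervention (t = 0)
    tau:        hypothetical "editor overturn time constant", in the same units as t
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    return c_baseline + (c_after - c_baseline) * math.exp(-t / tau)


# Hypothetical: an intervention drops C from 1.8 to 1.0, with tau = 5 days
for day in (0, 5, 10, 30):
    print(day, round(relaxed_c(day, c_baseline=1.8, c_after=1.0, tau=5.0), 2))
```

Under this assumption a page with a fast turnover of new editors would correspond to a small tau, so the benefit of an intervention would fade within days.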

The relevant time scale is probably dependent on article editing activity. Articles and the associated talk pages receiving a low number of edits evolve very slowly, with a new edit every few weeks or months. More aggressively edited articles can exhibit 100 kilobytes or more of new talk page edits per day. The only limits in some cases are the limitations of the hardware, leading to edit conflicts.

Editors also exhibit nonstationary editing behavior. A common phenomenon is "editor fatigue" where an editor gets tired of editing, or tired of editing a certain kind of article, or retires from some or all tasks and activities on Wikipedia. Quite often this is associated with involvement with controversial articles, or administrative actions stemming from controversial articles. Quantifying this can serve a useful purpose in understanding editor fatigue, looking for patterns in editor fatigue, and developing methods for avoiding editor fatigue.

Editor overturn time constant

Clearly the rate of new editor appearance at an article is biased upwards by the presence of sock puppets and meat puppets. Therefore the raw data should be corrected for this effect. We use some techniques for distinguishing sock puppets, but of course this algorithm has a receiver operating characteristic curve and exhibits type I and type II errors, that is, false positives and false negatives. To correct for this effect, one needs to estimate something like the Blackstone ratio, so the probability of a given new editor being a sock puppet can be estimated.
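One standard way to correct an observed detection rate for false positives and false negatives is the Rogan-Gladen prevalence estimator, which is swapped in here as an illustration; the page does not say which correction is intended. The sketch assumes the detector's sensitivity and specificity are known, and every name and number is hypothetical.

```python
def corrected_sock_fraction(flagged_fraction, sensitivity, specificity):
    """Rogan-Gladen estimate of the true sock puppet fraction among new editors.

    flagged_fraction: share of new editors the detector flags as sock puppets
    sensitivity:      P(flagged | sock puppet)
    specificity:      P(not flagged | genuine new editor)
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("detector must perform better than chance")
    estimate = (flagged_fraction + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))  # clamp to a valid probability


def corrected_new_editor_rate(raw_rate, flagged_fraction, sensitivity, specificity):
    """New editor arrival rate with the estimated sock puppet share removed."""
    sock = corrected_sock_fraction(flagged_fraction, sensitivity, specificity)
    return raw_rate * (1.0 - sock)


# Hypothetical: 10 new editors per week, 30% flagged, sensitivity 0.8, specificity 0.9
print(round(corrected_new_editor_rate(10.0, 0.30, 0.8, 0.9), 2))
```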

A useful adjunct to this analysis would be the application of computational linguistics and discriminant analysis techniques to sock puppet detection.

Article production to administrative activity ratio

Another interesting statistic for measuring editor activity is the ratio of edits to the mainspace to the edits to noticeboards such as WP:AN, WP:AN/I, as well as other administrative pages such as RfAr, Arbitration enforcement and user conduct RfCs. Most activity on these sorts of pages is focused on some sort of dispute. Some editors have little to do with the creation of articles, or the defense of articles, but focus most of their time on Wikipedia on assorted disputes that are more removed from article-writing. Some of this activity shows up on the pages such as WP:AN, WP:AN/I, RfC pages and Arbcomm pages.
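A minimal sketch of this ratio follows. It assumes each edit is available as a (namespace number, page title) pair, with 0 for the mainspace and 4 for the project namespace as in MediaWiki; the list of administrative page prefixes is an illustrative assumption and would need to be matched to the actual page titles.

```python
# Hypothetical prefixes of dispute-oriented pages in the project namespace.
ADMIN_PREFIXES = (
    "Administrators' noticeboard",
    "Requests for arbitration",
    "Requests for comment",
)


def production_admin_ratio(edits, admin_prefixes=ADMIN_PREFIXES):
    """Ratio of mainspace edits to edits on administrative pages.

    edits: iterable of (namespace_number, title) pairs for one editor.
    Returns None when the editor has no administrative edits.
    """
    edits = list(edits)
    mainspace = sum(1 for ns, _ in edits if ns == 0)
    admin = sum(
        1 for ns, title in edits if ns == 4 and title.startswith(admin_prefixes)
    )
    return mainspace / admin if admin else None


example_edits = [
    (0, "Evolution"),
    (0, "Evolution"),
    (0, "Objections to evolution"),
    (1, "Evolution"),                            # talk page, counted in neither total
    (4, "Administrators' noticeboard/Incidents"),
]
ratio = production_admin_ratio(example_edits)
```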

Comparing two versions of an article

There are several ways in which one version of an article can differ from another:

text can be added
text can be removed
text can be rearranged
one section of text can be substituted for another section

Measuring an editor's productivity on an article

Assume that the final version of an article is the consensus version of the article. For each editor involved in editing this article, what was the value of each of that editor's edits to the article? How much of each editor's editing of the article contributed to building the final consensus version of the article, and how much of each editor's editing was involved in defense of the consensus version of the article?

Modeling and understanding a sequence of edits

We have a sequence of article versions, beginning with A₀ and ending with article version A_n:

A₀ $\longrightarrow$ A₁ $\longrightarrow$ A₂ $\longrightarrow$ ... $\longrightarrow$ A_n

Call the final article A_n the "consensus" version of the article, A_C.

Each edit is a transformation of the text of the article. The first transformation is T₁, the second transformation is T₂.

T₁(A₀) = A₁

T₂(A₁) = A₂

T₃(A₂) = A₃

.
.
.

T_n(A_{n-1}) = A_n = A_C

In general, T_m(A_{m-1}) = A_m for 1 $\leq$ m $\leq$ n.

There are several types of edit:

(i) addition of text

(ii) subtraction of text

(iii) rearrangement of text

For example, the substitution of one piece of text for another can be described as a combination of an addition of a piece of text, and a subtraction of another piece of text. The rearrangement of text can also be described as a composition of the addition and subtraction of text segments.

Let the braces { } describe the concatenation of strings of text. If $\alpha$ is a string of text and so is $\beta$ , then the string of text that is just $\alpha$ followed by $\beta$ is denoted by { $\alpha$ , $\beta$ }. Define the operator s( ) as the size of a string of text, in bytes. So s({ $\alpha$ , $\beta$ })=s( $\alpha$ )+s( $\beta$ ). Define the size of A_m to be s_m, so s₁=s(A₁), s₂=s(A₂), s_m=s(A_m), and s_C=s(A_C). Define the operator n( ) as the length of a string of text, in number of elements. So n({ $\alpha$ , $\beta$ })=n( $\alpha$ )+n( $\beta$ ). If A_m has n_m elements, then n₁=n(A₁), n₂=n(A₂), n_m=n(A_m), and n_C=n(A_C).

The addition of a string to an article can be symbolized by the transformation T⁺, so T⁺(A, $\alpha$ )={A, $\alpha$ }. The subtraction of a string of text from an article can be symbolized by the transformation T^-, so T^-({ $\alpha$ , $\beta$ }, $\beta$ )= $\alpha$ . The rearrangement of an article is a transformation T^R that permutes the elements of the article. For example, T^R({ $\alpha$ , $\beta$ })={ $\beta$ , $\alpha$ } is an example of a rearrangement of the strings in the article { $\alpha$ , $\beta$ } that might be observed in practice. Obviously, the transformations T^R will be members of S_N, the symmetric group on N elements, for an article of N elements. However, given spelling conventions and grammar and other restrictions, it is likely that the possible transformations T^R will not all be equally probable.
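The operators above can be sketched directly on Python strings. This is an illustration only: the function names are hypothetical, UTF-8 is an assumed encoding for the byte size s( ), and the subtraction relies on the assumption that the removed string occurs once in the article.

```python
def s(text):
    """Size of a string in bytes (UTF-8 is an assumed encoding)."""
    return len(text.encode("utf-8"))


def n(text):
    """Length of a string in elements (characters)."""
    return len(text)


def t_plus(article, alpha):
    """T+: append the string alpha to the article."""
    return article + alpha


def t_minus(article, beta):
    """T-: remove the string beta, assumed to occur once in the article."""
    if beta not in article:
        raise ValueError("segment not present in article")
    return article.replace(beta, "", 1)


def t_rearrange(segments, permutation):
    """T^R: permute a list of segments; permutation[i] is the source index."""
    return [segments[i] for i in permutation]


def substitute_tail(article, old, new):
    """Substitution of a trailing segment as a composition: subtract, then add."""
    return t_plus(t_minus(article, old), new)


swapped = t_rearrange(["alpha ", "beta "], [1, 0])
edited = substitute_tail("The cat sat.", "sat.", "ran.")
```

Note that s and n differ as soon as the text contains multi-byte characters: for the word "naïve", n gives 5 elements while s gives 6 bytes under UTF-8.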

A given article A is formed by a concatenation of text strings:

A = { $\alpha$ , $\beta$ , $\gamma$ , ...}

s( $\alpha$ )= s_$\alpha$

s( $\beta$ )= s_$\beta$

s( $\gamma$ )= s_$\gamma$

and so on.

s(A)= s_$\alpha$+ s_$\beta$+ s_$\gamma$+ ...

An indicator function $\mathbb {I}$ for the article A is a finite sequence of ones and zeros of length n(A). The indicator function can be used to choose which elements of the sequence A to select.

Suppose that 1_{n_a} = (1, 1, 1, ...1) is a vector of length n_a.

Let O_{n_b} = (0, 0, 0, ...0) be a vector of length n_b.

Let $\mathbb {I}$ = { 1_{n_a} , O_{n_b} } = ( 1, 1, 1, ...1, 0,0,0, ...0) be a vector of length n_a+n_b, with the first n_a elements taking the value unity and the last n_b elements being zero.

$\mathbb {I}$ • $\mathbb {I}$ = n_a

assuming that "•" is a vector dot product.

The indicator function selects those elements of the sequence A to include and those elements to elide.

Suppose A = {a, b}

A[ {1_{n_a}, 1_{n_b}}] = A = {a,b}

A[ {1_{n_a}, O_{n_b}}] = a

A[ {O_{n_a}, 1_{n_b}}] = b

A[ {O_{n_a}, O_{n_b}}] = $\varnothing$
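The selection rule can be illustrated with a short sketch. The helper names select and dot, and the example strings, are hypothetical and chosen only for illustration.

```python
def select(article, indicator):
    """Return the elements of the article where the indicator is 1."""
    if len(article) != len(indicator):
        raise ValueError("indicator must have length n(A)")
    return "".join(ch for ch, keep in zip(article, indicator) if keep)


def dot(u, v):
    """Vector dot product."""
    return sum(x * y for x, y in zip(u, v))


a, b = "Hello ", "world"
A = a + b
ones_a, zeros_a = [1] * len(a), [0] * len(a)
ones_b, zeros_b = [1] * len(b), [0] * len(b)

indicator = ones_a + zeros_b          # selects a, elides b
selected = select(A, indicator)       # the string a
n_a = dot(indicator, indicator)       # equals n(a)
```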

Initially, assume that there are no repeated text sequences within any article.

Define a product between two characters, $\alpha$ and $\beta$ , < $\alpha$ , $\beta$ > to be unity if $\alpha$ = $\beta$ and zero if $\alpha$ ≠ $\beta$ . This product can be used to create a coincidence matrix M for comparing two article versions. The properties of the coincidence matrix M allow the extraction of the transformations that take the article from one version to another.

The simplest variant of a coincidence matrix M between article A with elements A_i and article B with elements B_j has elements M_ij = <A_i,B_j>. More generally, two articles can be compared by examining them in longer chunks rather than one character at a time. Segments of text of length N can be compared using a coincidence matrix with elements M_ij = $\prod _{m=0}^{N-1}$ <A_{i+m}, B_{j+m}>, where N is the comparison segment length; the element is unity only if all N characters of the two segments coincide.

Adopt as a convention that the row index of M corresponds to the index of the older version of the article, and the column index of M corresponds to the index of the more recent version of the article. If the older version of the article is a string of length n₁ and the more recent version of the article is a sequence of length n₂, then the coincidence matrix M has dimension n₁ x n₂; that is, M has n₁ rows and n₂ columns.
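A minimal sketch of the coincidence matrix as a list of rows follows, with the older version indexing the rows and the newer version indexing the columns. The function name and the treatment of segments that run past the end of either article (set to zero here, so that M keeps n₁ rows and n₂ columns) are assumptions made for the illustration.

```python
def coincidence_matrix(old, new, seg_len=1):
    """M[i][j] = 1 if the seg_len elements starting at old[i] and new[j] coincide."""
    def match(i, j):
        # Assumed convention: a segment that overruns either article scores 0.
        if i + seg_len > len(old) or j + seg_len > len(new):
            return 0
        return int(old[i:i + seg_len] == new[j:j + seg_len])

    return [[match(i, j) for j in range(len(new))] for i in range(len(old))]


M1 = coincidence_matrix("ab", "abc")             # character by character
M2 = coincidence_matrix("ab", "abc", seg_len=2)  # two-character segments
```

For long articles the full matrix grows as n₁n₂, so a practical implementation would probably use a sparse representation or a standard diff algorithm instead.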

Addition of text

Suppose that A₁= {a,b} is a version of an article composed of the concatenation of strings a and b. Suppose that the next version of the article is A₂ = {a, b, c}, composed of the concatenation of strings a, b, and c. The coincidence matrix M in this case is

${\begin{pmatrix}I_{n_{a}}&O_{n_{a},n_{b}}&O_{n_{a},n_{c}}\\O_{n_{b},n_{a}}&I_{n_{b}}&O_{n_{b},n_{c}}\end{pmatrix}}$

where $I_{n_{a}}$ is an identity matrix with n_a rows and n_a columns and $O_{n_{a},n_{b}}$ is a zero matrix with n_a rows and n_b columns.

In general, M will be a matrix with block rectangular submatrices.

Define a function $\phi (x)$ where $\phi (x)$ =0 if x=0 and $\phi (x)$ =1 if x≠0

On vectors, ${\overrightarrow {v}}$ = ( v₁ , v₂, v₃, ... v_N ), $\phi$ ( ${\overrightarrow {v}}$ ) = ( $\phi$ ( v₁ ), $\phi$ ( v₂ ), $\phi$ ( v₃ ), ... $\phi$ ( v_N ) )

Define

$\sum _{j}M_{ij}=(\mathbb {I} ^{R})_{i}$

$\sum _{i}M_{ij}=(\mathbb {I} ^{C})_{j}$

$\mathbb {I}$ ^R is a vector of length n_a+n_b

$\mathbb {I}$ ^C is a vector of length n_a+n_b+n_c

Use $\mathbb {I}$ ^R and $\mathbb {I}$ ^C as indicator functions for the versions of the article A₁ and A₂.

A₁ [ $\mathbb {I}$ ^R] = A₁, material in A₁ that appears in A₂

A₁ [1- $\mathbb {I}$ ^R] = $\varnothing$ , subtracted material

A₂ [ $\mathbb {I}$ ^C] = A₁, material in A₂ that appears in A₁

A₂ [1- $\mathbb {I}$ ^C] = c, added material
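The addition case can be checked numerically. In the sketch below the strings a, b and c are hypothetical and contain no repeated characters, in line with the assumption made earlier. Applying $\phi$ to the row and column sums is itself an assumption of this sketch; under the no-repetition assumption the sums are already 0 or 1, so it changes nothing.

```python
def coincidence_matrix(old, new):
    return [[int(x == y) for y in new] for x in old]


def phi(x):
    return 0 if x == 0 else 1


def indicators(old, new):
    """Row indicator (over old) and column indicator (over new)."""
    m = coincidence_matrix(old, new)
    row = [phi(sum(r)) for r in m]
    col = [phi(sum(m[i][j] for i in range(len(old)))) for j in range(len(new))]
    return row, col


def select(seq, ind, keep=1):
    return "".join(x for x, flag in zip(seq, ind) if flag == keep)


def compare(old, new):
    row, col = indicators(old, new)
    return {
        "kept_old": select(old, row),       # A1[I^R]
        "subtracted": select(old, row, 0),  # A1[1 - I^R]
        "kept_new": select(new, col),       # A2[I^C]
        "added": select(new, col, 0),       # A2[1 - I^C]
    }


a, b, c = "ab", "cd", "ef"
result = compare(a + b, a + b + c)
```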

Subtraction of text

Suppose that A₁={a,b} and that A₂={a}. The coincidence matrix M in this case is

${\begin{pmatrix}I_{n_{a}}\\O_{n_{b},n_{a}}\end{pmatrix}}$

A₁ [ $\mathbb {I}$ ^R] = a, material in A₁ that appears in A₂

A₁ [1- $\mathbb {I}$ ^R] = b, subtracted material

A₂ [ $\mathbb {I}$ ^C] = a, material in A₂ that appears in A₁

A₂ [1- $\mathbb {I}$ ^C] = $\varnothing$ , added material

Subtraction and addition of material

Suppose that A₁={a,b} and A₂={a,c}. The coincidence matrix M in this case is

${\begin{pmatrix}I_{n_{a}}&O_{n_{a},n_{c}}\\O_{n_{b},n_{a}}&O_{n_{b},n_{c}}\end{pmatrix}}$

A₁ [ $\mathbb {I}$ ^R] = a, material in A₁ that appears in A₂

A₁ [1- $\mathbb {I}$ ^R] = b, subtracted material

A₂ [ $\mathbb {I}$ ^C] = a, material in A₂ that appears in A₁

A₂ [1- $\mathbb {I}$ ^C] = c, added material

Substitution of material

Suppose that A₁={a,b} and A₂={a',b} where a and a' are the same size, so s(a)=s(a')=s_a. In this case the coincidence matrix M is

${\begin{pmatrix}O_{n_{a},n_{a}}&O_{n_{a},n_{b}}\\O_{n_{b},n_{a}}&I_{n_{b}}\end{pmatrix}}$

In this case, M is a square matrix, with n₁=n₂= n_a + n_b. M is symmetric and idempotent, so M^T=M and M² = M.

$\mathbb {I}$ ^R= $\mathbb {I}$ ^C = diag(M)

A₁ [ $\mathbb {I}$ ^R] = b, material in A₁ that appears in A₂

A₁ [1- $\mathbb {I}$ ^R] = a, subtracted material

A₂ [ $\mathbb {I}$ ^C] = b, material in A₂ that appears in A₁

A₂ [1- $\mathbb {I}$ ^C] = a', added material
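The properties claimed for the substitution case can be verified on a small example. The strings below are hypothetical, have no repeated characters, and are chosen so that a and a' have the same length and share no characters.

```python
def coincidence_matrix(old, new):
    return [[int(x == y) for y in new] for x in old]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


a, a_prime, b = "xy", "pq", "CD"
M = coincidence_matrix(a + b, a_prime + b)

row_ind = [sum(r) for r in M]            # I^R
col_ind = [sum(c) for c in zip(*M)]      # I^C
diagonal = [M[i][i] for i in range(len(M))]

is_symmetric = M == transpose(M)
is_idempotent = matmul(M, M) == M
```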

Scoring transformations

If a transformation moves the article closer to the final, consensus version, A_C, then it is either a positive contribution to a "building" score $S_{B}$ or a positive contribution to a "protection" score $S_{P}$ . If the transformation results in the introduction of a segment of text for the first time, then classify it as a transformation that produces a positive contribution to $S_{B}$ . If the transformation is a repetition of a previous transformation, or results in the reintroduction of a segment of text that was previously deleted but appears in the consensus version, then it is classified as a transformation producing a positive contribution to $S_{P}$ . A transformation that results in an article version that is further away from the final version of the article A_C will contribute to a decrease of the protection score.

Suppose that article version A₁ is transformed into article version A₂. The article version A₂ will in general include some text that A₁ does not, and exclude some other material. As before, compute the row and column indicator functions $\mathbb {I} _{12}^{R}$ and $\mathbb {I} _{12}^{C}$ . The total material that differs between the two versions is then the concatenation of the subtracted and added text: $\Delta$ = { A₁ [1 - $\mathbb {I} _{12}^{R}$ ] , A₂ [1- $\mathbb {I} _{12}^{C}$ ] }. By forming a coincidence matrix that compares $\Delta$ with the consensus version of the article A_C, this string $\Delta$ can be apportioned into material that is not contained in the consensus article, $\Delta ^{-}$ = $\Delta$ [1- $\mathbb {I} ^{R}$ ], and material that is contained in the consensus article, $\Delta ^{+}$ = $\Delta$ [ $\mathbb {I} ^{R}$ ]. Designate these text differences between A₁ and A₂ as $\Delta _{2}^{-}$ and $\Delta _{2}^{+}$ , respectively. Similarly, there are text differences $\Delta _{1}^{-}$ and $\Delta _{1}^{+}$ between article versions A₀ and A₁, and in general there are text differences $\Delta _{k}^{-}$ and $\Delta _{k}^{+}$ between article versions A_{k-1} and A_k. Define $\Delta _{k}^{H+}$ as the concatenation of all the previous text differences $\Delta ^{+}$ , that is, { $\Delta _{1}^{+}$ , $\Delta _{2}^{+}$ , $\Delta _{3}^{+}$ , ... $\Delta _{k-1}^{+}$ }. By forming a coincidence matrix comparing $\Delta _{k}^{+}$ and $\Delta _{k}^{H+}$ , $\Delta _{k}^{+}$ can be partitioned into $\Delta _{k}^{B+}$ , material that does not appear in $\Delta _{k}^{H+}$ , and $\Delta _{k}^{P+}$ , material that does appear in $\Delta _{k}^{H+}$ .
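The partition described above can be sketched on a toy edit history. The sketch makes several assumptions that are not part of the text: the elements are whole words rather than characters, no word is repeated within a version, and membership tests are used as a shortcut for the indicator functions (under the no-repetition assumption, an element's row sum in the coincidence matrix is nonzero exactly when the element occurs in the other sequence). How signs are attached to the resulting pieces when forming the scores is left open here.

```python
def split(seq, reference):
    """Split seq into (elements found in reference, elements not found)."""
    ref = set(reference)
    inside = [x for x in seq if x in ref]
    outside = [x for x in seq if x not in ref]
    return inside, outside


def classify_edit(prev, curr, consensus, history_plus):
    """Return (delta_plus, delta_minus, building, protecting) for one edit."""
    _, subtracted = split(prev, curr)          # A_{k-1}[1 - I^R]
    _, added = split(curr, prev)               # A_k[1 - I^C]
    delta = subtracted + added
    delta_plus, delta_minus = split(delta, consensus)
    protecting, building = split(delta_plus, history_plus)
    return delta_plus, delta_minus, building, protecting


# Hypothetical history: text is written, damaged, then restored.
versions = [[], ["cats", "purr"], ["cats", "bark"], ["cats", "purr"]]
consensus = versions[-1]

history_plus, results = [], []
for prev, curr in zip(versions, versions[1:]):
    outcome = classify_edit(prev, curr, consensus, history_plus)
    results.append(outcome)
    history_plus = history_plus + outcome[0]   # extend Delta^{H+}
```

In this toy history the first edit is pure building, and the third edit, which restores "purr", is classified as protecting because that word already appears in the accumulated history.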

Each editor k has both a building score $S_{B}^{k}$ and a protection score $S_{P}^{k}$ . These scores satisfy the following relations:

$\sum _{k}S_{P}^{k}=0$

$\sum _{k}S_{B}^{k}=s(A_{C})$

Alignment score

Define a function $\rho$ on pairs of articles A₁ and A₂ as $\rho$ (A₁,A₂) = $\sum$ diag(<A₁, A₂>). If A₁ = A₂, then $\rho$ takes the value n₁, the number of elements in A₁. If A₁ is a rearranged version of A₂, and therefore A₁ has the same number of elements as A₂, then $\rho$ will be less than n₁, depending on how many elements are still aligned in the two versions of the article. Let $\rho ^{k}$ = $\rho$ (A_k, A_C) = $\sum$ diag(<A_k [ $\mathbb {I}$ ^R] , A_C [ $\mathbb {I}$ ^C]>), the number of elements in the kth version of the article that appear in the consensus version of the article in the same position. Define the alignment score of the kth version of the article, $S_{\rho }^{k}$ = $\rho ^{k}$ - $\rho ^{k-1}$ , with $\rho ^{0}$ =0 and $S_{\rho }^{0}$ =0. That is, the kth alignment score $S_{\rho }^{k}$ is the difference in the number of identically positioned elements appearing in the consensus version between the kth version of the article and the immediately preceding version of the article.

$\sum _{k}S_{\rho }^{k}$ = $\rho$ (A_C, A_C)=n(A_C)
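The telescoping sum can be checked on a small example. The sketch below assumes versions without repeated characters and uses membership tests in place of the indicator functions; the version strings and function names are hypothetical.

```python
def rho(a, b):
    """Number of positions at which the two sequences hold the same element."""
    return sum(1 for x, y in zip(a, b) if x == y)


def rho_to_consensus(version, consensus):
    in_consensus, in_version = set(consensus), set(version)
    common_v = [x for x in version if x in in_consensus]   # A_k[I^R]
    common_c = [x for x in consensus if x in in_version]   # A_C[I^C]
    return rho(common_v, common_c)


def alignment_scores(versions, consensus):
    """S_rho^k = rho^k - rho^(k-1), starting from rho^0 = 0."""
    scores, previous = [], 0
    for version in versions:
        current = rho_to_consensus(version, consensus)
        scores.append(current - previous)
        previous = current
    return scores


versions = ["ab", "bad", "abcd"]     # A_1, A_2, A_3 = A_C
scores = alignment_scores(versions, versions[-1])
total = sum(scores)
```

The second version rearranges the opening characters, so its alignment score is negative, but the scores still sum to n(A_C).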

Editor scores

An article has editors 1, 2, 3, ... n_E, and the jth edit is made by editor e_j. The total building score $S_{B}^{k}$ of the kth editor is $\sum _{j}S_{B}^{j}\delta _{e_{j}k}$ . The total protection score $S_{P}^{k}$ of the kth editor is $\sum _{j}S_{P}^{j}\delta _{e_{j}k}$ . The total alignment score $S_{\rho }^{k}$ of the kth editor is $\sum _{j}S_{\rho }^{j}\delta _{e_{j}k}$ .

The building fraction due to the kth editor is

$f_{B}^{k}={\frac {\sum _{j}S_{B}^{j}\delta _{e_{j}k}}{s(A_{C})}}$

The building efficiency of the kth editor is

$E_{B}^{k}={\frac {\sum _{j}S_{B}^{j}\delta _{e_{j}k}}{\sum _{j}\delta _{e_{j}k}}}$

This describes how many bytes are produced per edit of an editor. This can be skewed by the use of the sandbox to develop text, judicious use of the preview button and the time since the editing was done.

The protection contribution coefficient of the kth editor is

$C_{P}^{k}={\frac {\sum _{j}S_{P}^{j}\delta _{e_{j}k}}{s(A_{C})}}$
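The sums over the Kronecker delta amount to grouping the per-edit scores by editor. A minimal sketch follows, assuming the per-edit building and protection scores have already been computed; the editor names and score values are made up for illustration.

```python
from collections import defaultdict


def editor_scores(edits, consensus_size):
    """Aggregate per-edit scores into per-editor f_B, E_B and C_P.

    edits: list of (editor, building_score, protection_score), one per edit.
    consensus_size: s(A_C), the size of the consensus version in bytes.
    """
    build = defaultdict(float)
    protect = defaultdict(float)
    count = defaultdict(int)
    for editor, s_b, s_p in edits:
        build[editor] += s_b
        protect[editor] += s_p
        count[editor] += 1
    return {
        k: {
            "f_B": build[k] / consensus_size,    # building fraction
            "E_B": build[k] / count[k],          # bytes built per edit
            "C_P": protect[k] / consensus_size,  # protection contribution
        }
        for k in count
    }


# Hypothetical edit history for a 1000-byte consensus article.
edits = [("A", 600, 0), ("B", 0, -200), ("A", 0, 200), ("C", 400, 0)]
table = editor_scores(edits, consensus_size=1000)
```

In this example the building scores sum to s(A_C) and the protection scores sum to zero, matching the two relations given for the scores.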