User:Aymatth2/Expandstats

From Wikipedia, the free encyclopedia

This page gives statistics from a comparison of 27 articles flagged for expansion with 27 articles not flagged. It compares how much the two sets of articles were expanded between 1 January and 22 December 2010. Results are inconclusive because the samples are small and it is difficult to ensure that the two samples contain like-for-like articles. In summary:

  • For the flagged articles, average expansion (the mean of the per-article expansion percentages) was 12.17%
  • For the unflagged articles, average expansion (calculated the same way) was 17.45%

These results should be treated with great caution, for reasons given below. A larger and more carefully designed study may show quite different results.

Methodology

To build the list of flagged articles, the editor

To build the list of unflagged articles, the editor used Special:Random to find a random sample of 27 articles. Articles were filtered to remove:

  • Articles less than 2,500 characters long, since all but one of the flagged articles were larger than this
  • Articles created in 2010

Again, the editor recorded the size at 1 January 2010 and the size at 22 December 2010.
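
The Expand % column in the tables below is consistent with a simple relative change between the two recorded sizes. A minimal sketch of that calculation follows; the function name `expansion_percent` is illustrative only and does not come from the editor's own tooling.

```python
def expansion_percent(size_start: int, size_end: int) -> float:
    """Relative change in article size, as a percentage of the starting size."""
    if size_start <= 0:
        raise ValueError("starting size must be positive")
    return (size_end - size_start) / size_start * 100


# Sizes taken from the flagged-articles table.
print(f"{expansion_percent(7165, 8877):.2f}%")   # Economy of Liechtenstein: 23.89%
print(f"{expansion_percent(3253, 2139):.2f}%")   # Intergovernmentalism: -34.25%
```

A negative value means the article was smaller on 22 December 2010 than on 1 January 2010.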

Flagged articles

| Article | Date tagged | Size, 1 Jan 2010 | Size, 22 Dec 2010 | Expansion % |
|---|---|---|---|---|
| Economy of Liechtenstein | May-08 | 7,165 | 8,877 | 23.89% |
| Foreign relations of Kyrgyzstan | Jan-07 | 12,676 | 12,725 | 0.39% |
| Geography of Nigeria | Dec-08 | 5,639 | 16,605 | 194.47% |
| History of Latvia | May-09 | 52,015 | 48,007 | -7.71% |
| Intergovernmentalism | Jan-08 | 3,253 | 2,139 | -34.25% |
| International standard | Jan-07 | 4,300 | 4,399 | 2.30% |
| Katanga (province) | Nov-09 | 6,719 | 6,419 | -4.46% |
| List of maritime explorers | Feb-08 | 15,266 | 15,345 | 0.52% |
| Media studies | Feb-09 | 15,850 | 19,244 | 21.41% |
| Military of Kyrgyzstan | Nov-07 | 8,017 | 8,448 | 5.38% |
| National park | Jun-09 | 11,886 | 13,367 | 12.46% |
| Neo Geo CD | Dec-09 | 6,372 | 6,663 | 4.57% |
| Panarchism | Aug-09 | 3,171 | 3,428 | 8.10% |
| Peace | Dec-07 | 20,353 | 21,580 | 6.03% |
| Perimeter | Oct-08 | 3,859 | 4,364 | 13.09% |
| Phantom kangaroo | Oct-07 | 5,515 | 6,743 | 22.27% |
| Phonetics | Nov-08 | 9,018 | 10,455 | 15.93% |
| Pope Martin V | Jan-07 | 10,943 | 10,549 | -3.60% |
| Pre-existence | Jan-07 | 12,834 | 12,456 | -2.95% |
| Quiver | Jan-07 | 2,764 | 2,745 | -0.69% |
| Register transfer language | Dec-08 | 3,918 | 3,994 | 1.94% |
| RIPEMD | May-07 | 2,636 | 2,708 | 2.73% |
| Scatology | Jul-09 | 3,592 | 3,774 | 5.07% |
| Socialist law | Jul-07 | 9,590 | 10,913 | 13.80% |
| Telecommunications in Jordan | Jun-07 | 5,463 | 5,689 | 4.14% |
| Transport in Romania | Oct-07 | 9,179 | 10,528 | 14.70% |
| Vehicle | Dec-09 | 17,785 | 19,412 | 9.15% |

Unflagged articles

| Article | Size, 1 Jan 2010 | Size, 22 Dec 2010 | Expansion % |
|---|---|---|---|
| The Prisoner of Second Avenue | 3,477 | 3,701 | 6.44% |
| Stephen Crisp | 8,700 | 8,868 | 1.93% |
| Hyper-Graeco-Latin square design | 4,113 | 4,113 | 0.00% |
| KSUA | 10,382 | 10,472 | 0.87% |
| Hervé Yamguen | 3,179 | 3,478 | 9.41% |
| Gaviotas | 6,458 | 6,573 | 1.78% |
| Soyuz TMA-13 | 6,037 | 6,309 | 4.51% |
| Siege of Stralsund (1628) | 20,070 | 21,283 | 6.04% |
| Hendren, Wisconsin | 4,843 | 5,012 | 3.49% |
| Midnight (1939 film) | 5,056 | 8,082 | 59.85% |
| Business transaction management | 2,082 | 6,231 | 199.28% |
| Mike MacDonald | 7,248 | 7,619 | 5.12% |
| Hostess Madness: Unparched Nectar | 3,588 | 3,630 | 1.17% |
| Montgomery County, North Carolina | 6,404 | 6,605 | 3.14% |
| Ed Montague (umpire) | 4,026 | 4,510 | 12.02% |
| Parliament of Croatia | 12,097 | 12,272 | 1.45% |
| Vietnam Children's Fund | 5,108 | 4,673 | -8.52% |
| Rumney, Cardiff | 3,201 | 3,250 | 1.53% |
| Moose Cree First Nation | 2,884 | 3,695 | 28.12% |
| David Peachey | 9,864 | 10,697 | 8.44% |
| Dmytro Antonovych | 1,997 | 3,916 | 96.09% |
| Billy Breakenridge | 4,658 | 3,964 | -14.90% |
| Weston Reserve University | 5,345 | 6,002 | 12.29% |
| List of crossings of the Potomac River | 23,013 | 23,136 | 0.53% |
| Quapaw, Oklahoma | 6,015 | 7,246 | 20.47% |
| Virgil Villavicencio | 2,824 | 3,111 | 10.16% |
| 2005 University of Oklahoma bombing | 20,334 | 20,447 | 0.56% |
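
The summary figures can be rechecked from the Expand % columns of the two tables. The sketch below does that and also reports the median, which is less sensitive than the mean to a single heavily expanded article such as Geography of Nigeria (194.47%) or Business transaction management (199.28%). The helper name `summarize` is illustrative; the median is not part of the original comparison and is shown only as a robustness check.

```python
from statistics import mean, median

# Expand % values copied from the flagged and unflagged tables, in table order.
flagged = [23.89, 0.39, 194.47, -7.71, -34.25, 2.30, -4.46, 0.52, 21.41,
           5.38, 12.46, 4.57, 8.10, 6.03, 13.09, 22.27, 15.93, -3.60,
           -2.95, -0.69, 1.94, 2.73, 5.07, 13.80, 4.14, 14.70, 9.15]
unflagged = [6.44, 1.93, 0.00, 0.87, 9.41, 1.78, 4.51, 6.04, 3.49,
             59.85, 199.28, 5.12, 1.17, 3.14, 12.02, 1.45, -8.52, 1.53,
             28.12, 8.44, 96.09, -14.90, 12.29, 0.53, 20.47, 10.16, 0.56]


def summarize(values):
    """Return the sample size, mean and median, rounded to two decimals."""
    return {
        "n": len(values),
        "mean": round(mean(values), 2),
        "median": round(median(values), 2),
    }


flagged_summary = summarize(flagged)
unflagged_summary = summarize(unflagged)
print("flagged:  ", flagged_summary)
print("unflagged:", unflagged_summary)
```

Run as written, the means reproduce the 12.17% and 17.45% given in the summary, while the medians come out much closer together (5.07% for flagged, 4.51% for unflagged). This is consistent with the caution that the averages in such small samples are dominated by one or two articles.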

Comments

Both samples were very small, so any conclusions have to be treated with extreme caution. A mechanized approach with samples of 1,000 or more articles would give much more credible results. There were noticeable differences in the characteristics of the two samples. Specifically:

  • Tagged articles were generally much larger than untagged articles. This was not caused by tagging: the tagged articles were already generally larger than randomly selected articles on the date they were tagged.
  • Tagged articles had much more edit activity than the random articles. In many cases a tagged article had more than 50 edits in 2010, whereas none of the random articles did.

This suggests that articles are more likely to be tagged for expansion when they attract more editor interest than the average article. A more thorough study of the effects of tagging should adjust for this by comparing samples of tagged and untagged articles that have similar size and edit activity levels.
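
One way to see how little the small samples can support is a permutation test on the difference between the two means. The sketch below is not part of the original comparison: it pools the 54 Expand % values from the tables, repeatedly reshuffles them into two groups of 27, and counts how often a difference at least as large as the observed one arises by chance. The function name, the number of resamples and the fixed seed are all assumptions made for illustration, and the test does nothing to correct for the differences in article size and edit activity noted above.

```python
import random

# Expand % values copied from the flagged and unflagged tables, in table order.
flagged = [23.89, 0.39, 194.47, -7.71, -34.25, 2.30, -4.46, 0.52, 21.41,
           5.38, 12.46, 4.57, 8.10, 6.03, 13.09, 22.27, 15.93, -3.60,
           -2.95, -0.69, 1.94, 2.73, 5.07, 13.80, 4.14, 14.70, 9.15]
unflagged = [6.44, 1.93, 0.00, 0.87, 9.41, 1.78, 4.51, 6.04, 3.49,
             59.85, 199.28, 5.12, 1.17, 3.14, 12.02, 1.45, -8.52, 1.53,
             28.12, 8.44, 96.09, -14.90, 12.29, 0.53, 20.47, 10.16, 0.56]


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = abs(_mean(a) - _mean(b))
    pooled = list(a) + list(b)         # copy, so the inputs are not reordered
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:len(a)]) - _mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


p_value = permutation_p_value(flagged, unflagged)
print(f"two-sided permutation p-value: {p_value:.3f}")
```

A p-value well above conventional thresholds such as 0.05 would mean the observed gap between 12.17% and 17.45% is the kind of difference that random assignment of these articles to two groups readily produces.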