User talk:The Earwig/Archive 8
This is an archive of past discussions with User:The Earwig. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Archive 5 | Archive 6 | Archive 7 | Archive 8 | Archive 9 | Archive 10 | ... | Archive 15
The Signpost: 22 October 2012
- Special report: Examining adminship from the German perspective
- Arbitration report: Malleus Fatuorum accused of circumventing topic ban; motion to change "net four votes" rule
- Technology report: Wikivoyage migration: technical strategy announced
- Discussion report: Good articles on the main page?; reforming dispute resolution
- News and notes: Wikimedians get serious about women in science
- WikiProject report: Where in the world is Wikipedia?
- Featured content: Is RfA Kafkaesque?
The Signpost: 29 October 2012
- News and notes: First chickens come home to roost for FDC funding applicants; WMF board discusses governance issues and scope of programs
- WikiProject report: In recognition of... WikiProject Military History
- Technology report: Improved video support imminent and Wikidata.org live
- Featured content: On the road again
The Signpost: 05 November 2012
- Op-ed: 2012 WikiCup comes to an end
- News and notes: Wikimedian photographic talent on display in national submissions to Wiki Loves Monuments
- In the media: Was climate change a factor in Hurricane Sandy?
- Discussion report: Protected Page Editor right; Gibraltar hooks
- Featured content: Jack-O'-Lanterns and Toads
- Technology report: Hue, Sqoop, Oozie, Zookeeper, Hive, Pig and Kafka
- WikiProject report: Listening to WikiProject Songs
User notifications at WP:DRN
There seem to be some problems with the notification of users named in DRN cases:
Wikipedia talk:Dispute resolution noticeboard#Autonotification
--Guy Macon (talk) 01:18, 14 November 2012 (UTC)
- Replied as best I could on that thread. — Earwig talk 01:55, 14 November 2012 (UTC)
The Signpost: 12 November 2012
- News and notes: Court ruling complicates the paid-editing debate
- Featured content: The table has turned
- Technology report: MediaWiki 1.20 and the prospects for getting 1.21 code reviewed promptly
- WikiProject report: Land of parrots, palm trees, and the Holy Cross: WikiProject Brazil
The Signpost: 19 November 2012
- News and notes: FDC's financial muscle kicks in
- WikiProject report: No teenagers, mutants, or ninjas: WikiProject Turtles
- Technology report: Structural reorganisation "not a done deal"
- Featured content: Wikipedia hit by the Streisand effect
- Discussion report: GOOG, MSFT, WMT: the ticker symbol placement question
Hi
I thought that you deserved something a bit extra for all of the amazing work you've done for the project.
I've nominated you for a gift from the Wikimedia Foundation!
Legoktm (talk) 20:54, 14 November 2012 (UTC)
- Speaking of that ;)
It may take a few minutes from the time the email is sent for it to show up in your inbox. You can remove this notice at any time by removing the {{You've got mail}} or {{ygm}} template. — Jalexander--WMF 00:54, 22 November 2012 (UTC)
Template:AFC statistics
Your bot User:EarwigBot is doing a fine job of maintaining Template:AFC statistics, but there are a small number of entries there that IMO shouldn't be there, but that the bot adds nevertheless. I am not sure if the bot code needs tweaking, or if anything needs to be done about these pages instead. The pages I mean are the ones I manually removed here. This includes pages like Inch Parish, Wigtownshire, a redirect where neither the source nor the target seem to have any AfC categories. Can you take a look? It's not urgent, and there is obviously no reason to stop this bot task, but the lists are very long and removing a few entries that have no place there would make them a bit lighter. Fram (talk) 10:20, 13 November 2012 (UTC)
- Hi! It seems there have been some issues with the Toolserver's replication lately and the bot has missed the edits where these submissions were declined. I had planned in the past to make the bot periodically check that submissions in the chart are in fact still submissions, but I've been busy and haven't gotten around to that. For now, I've removed the ones you tried to remove earlier. Manually checking the Toolserver's database, it seems that its count of pending submissions is wrong (~766 members instead of the expected 495), seeming to indicate that its database is corrupt and I can't guarantee that the generated statistics will be accurate until this is fixed. The current situation is confusing to me and I can only try to work around it, but we'll see. Thanks. — Earwig talk 22:15, 13 November 2012 (UTC)
- Thanks! Fram (talk) 07:24, 14 November 2012 (UTC)
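For the record, the periodic sanity check Earwig mentions above could be as simple as re-validating chart entries against the pending-submissions category. A sketch only, with a hypothetical get_category_members helper; the real bot task would presumably query the database directly:

def prune_chart(chart_titles, get_category_members):
    """Drop chart entries that are no longer pending submissions."""
    # get_category_members is a hypothetical callable returning the page
    # titles in a category, e.g. via the MediaWiki API's list=categorymembers.
    pending = set(get_category_members("Category:Pending AfC submissions"))
    return [title for title in chart_titles if title in pending]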
Template:AFC statistics is no longer working; it exceeds the template limits and now displays only one of the four sections. Fram (talk) 16:16, 22 November 2012 (UTC)
- I removed the accepted and declined submissions to shorten it a bit, but there's really not much else I can do. This happens relatively often when the backlog goes above several hundred pages, and the best solution right now is to review as many as possible. — Earwig talk 04:17, 23 November 2012 (UTC)
The Signpost: 26 November 2012
- News and notes: Toolserver finance remains uncertain
- Recent research: Movie success predictions, readability, credentials and authority, geographical comparisons
- Featured content: Panoramic views, history, and a celestial constellation
- Technology report: Wikidata reaches 100,000 entries
- WikiProject report: Directing Discussion: WikiProject Deletion Sorting
Has The Earwig retired...
...from Category:Undated AfC submissions? — WylieCoyote 13:21, 28 November 2012 (UTC)
- Your heading scared me! Uh, well, the bot's been a bit broken lately, and I still haven't finished the task that cleans that category. It's on my to-do list and I'll get to it eventually. — Earwig talk 22:47, 28 November 2012 (UTC)
Articles for creation is desperately short of reviewers! We are looking for urgent help, from experienced editors, in reviewing submissions in the pending submissions queue. Currently there are 1840 submissions waiting to be reviewed and many help requests at our help desk.
If you are such an editor, please read the reviewing instructions and donate a little of your time to helping tackle the backlog. You might wish to add {{AFC status}} or {{AfC Defcon}} to your userpage, which will alert you to the number of open submissions.
Plus, reviewing is easy when you use our new semi-automated reviewing script!
Disambiguation link notification for November 30
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Digite, Inc., you added a link pointing to the disambiguation page Mountain View (check to confirm | fix with Dab solver). Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 10:55, 30 November 2012 (UTC)
Center for Hispanic Leadership
The Earwig: I was recently creating a page on the Center for Hispanic Leadership, and it was deleted today due to unambiguous advertising/promotion. I realize now that I had inappropriate information in the entry and completely agree with the deletion. That being said, I would like to create the page again, listing only information about the Center for Hispanic Leadership (CHL), the Hispanic Training Center, and the CHL Chapters. I plan to refer to the CHL founder, Glenn Llopis, who already has a page on Wikipedia, just once as a point of reference, and then list reputable articles and references at the end of the page.
Again, I wanted to reach out and acknowledge my misstep and would like to be afforded the opportunity to create a CHL page that is in-line with Wikipedia's guidelines.
Please let me know your feedback.
Kind regards, Marisa Salcines — Preceding unsigned comment added by Gabri7elle (talk • contribs) 04:25, 3 December 2012 (UTC)
- Hi Marisa,
- You're certainly welcome to restart work on the page provided that you follow our guidelines and write from a neutral point of view. First, I would ask you to read this short page (WP:42) carefully and ensure that the Center meets its requirements. If so, then an article should be possible. You can go through the article wizard again to create it. Please let me know when you've done so, or if you have any further questions.
- Thanks! — Earwig talk 05:43, 3 December 2012 (UTC)
Center for Hispanic Leadership
Hi, I have created a new entry for Center for Hispanic Leadership. Please let me know if it meets the standards.
Thanks 06:21, 4 December 2012 (UTC) — Preceding unsigned comment added by Gabri7elle (talk • contribs)
Review
Apparently you reviewed Template:Derry and passed it. I feel that you have called your competence as a reviewer into question through your failure to properly flag the blatant bias in the template whilst going on to add it to several articles. The overt nationalist/republican bias of the template is plain to see straight from the off, and until it is fixed it should not be added to articles. Mabuska (talk) 14:41, 4 December 2012 (UTC)
- Hi. Please forgive my ignorance here. Can you be more specific as to where exactly this "bias" is? I have no knowledge of Derry, nor the rest of Ireland. The template as a whole seems neutral to me (it is, after all, a template containing links to relevant pages; it does not contain any prose that could fail WP:NPOV). If you have a problem with specific links, point them out and then we can deal with those, but I see no reason why you claim the template as a whole is bad. Thanks. — Earwig talk 22:23, 4 December 2012 (UTC)
- Virtually every link is something to do with nationalist/republican areas, sports, history, etc. of the city. Its unionist history and many things not connected to nationalism have been entirely overlooked. If you wish, I could compile a very comprehensive list of the bias. Just look at the changes CanterburyTail and I have since made to the template to see some of the problems with it. It is not as biased now, as I've evened it out a bit, though more work can be done to it. Other than the bias, the formatting was wrong too in regard to displaying categories as table links.
- Things are never simple in regard to Ireland and Northern Ireland matters here. It's always best to notify the associated WikiProject to check things out. Mabuska (talk) 00:18, 5 December 2012 (UTC)
- ...right. Good catch on the category issue; that's the fault of the helper script, although I should have caught it. Again, I don't see what makes the template so horribly wrong. Looking over your changes, I can see what you're getting at with certain things being omitted, but once again, I don't think the "bias" is as widespread as you claim it to be, and I don't understand the trouble with simply inserting or removing links, as you've done. Remember that it is not the job of AFC reviewers to make submissions perfect, but rather to check that they don't fail some basic criteria; the rest is left up to users like you. NPOV is one of these criteria, of course, but the template appears to pass it without issue to someone without intimate knowledge of Northern Ireland's history and the various subtexts. Since I clearly have no idea what I'm doing, I'll gladly leave the template in the hands of users who do. At the very least, I hope that you make sure that it gets added to the articles it mentions – eventually. — Earwig talk 01:39, 5 December 2012 (UTC)
Copyvio detector
Could you please check if your copyvio detector is still working? I am getting the following error message:
Error message
Error! SiteNotFoundError: Site 'all' not found in the sitesdb.
...
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 159: raise SiteNotFoundError(error)
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 186: namespaces) = self._load_site_from_sitesdb(name)
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 135: site = self._make_site_object(name)
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 340: return self._get_site_object(name)
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 112: site = self._sitesdb.get_site(sitename)
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 149: self._update(sitename)
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 154: self.sync("all")
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/__init__.py, line 146: self._exclusions_db.sync(self.site.name)
./toolserver/copyvios/checker.py, line 28: result = page.copyvio_check(max_queries=10, max_time=45)
./toolserver/copyvios/__init__.py, line 23: page, result = get_results(bot, site, query)
pages/copyvios.mako, line 4: <% query, bot, all_langs, all_projects, page, result = main(environ) %>
/home/earwig/.local/solaris/lib/python2.7/site-packages/Mako-0.7.2-py2.7.egg/mako/runtime.py, line 817: callable_(context, *args, **kwargs)
By the way, thank you for creating the tool! It's very useful – I have been using it when reviewing AfC submissions. The Anonymouse (talk • contribs) 16:42, 5 December 2012 (UTC)
- Silly bug on my part; fixed it and will update the tool in a moment. Do note that I haven't technically "released" its current version yet (there are still a number of things I wanted to improve first), but you're free to use it at your own discretion. Thanks for the report, and I'm glad you find it useful. — Earwig talk 21:55, 5 December 2012 (UTC)
- Thanks! Now it works better than ever. The Anonymouse (talk • contribs) 01:12, 6 December 2012 (UTC)
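For readers curious about the traceback above: the failing frames show the exclusions database passing the special keyword "all" into a site lookup that only knows real wikis. A minimal sketch of the kind of guard that avoids this, with hypothetical helper names rather than EarwigBot's actual fix:

def sync_exclusions(sitesdb, sitename):
    """Refresh the copyvio exclusions (mirror) list for one site, or for "all"."""
    if sitename == "all":
        # "all" is a pseudo-site holding global mirror lists; resolving it
        # through the sites database raises SiteNotFoundError, as seen above.
        return update_global_exclusions()  # hypothetical helper
    site = sitesdb.get_site(sitename)
    return update_site_exclusions(site)  # hypothetical helper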
The Signpost: 03 December 2012
- News and notes: Wiki Loves Monuments announces 2012 winner
- Featured content: The play's the thing
- Discussion report: Concise Wikipedia; standardize version history tables
- Technology report: MediaWiki problems but good news for Toolserver stability
- WikiProject report: The White Rose: WikiProject Yorkshire
The Signpost: 10 December 2012
- News and notes: Wobbly start to ArbCom election, but turnout beats last year's
- Featured content: Wikipedia goes to Hell
- Technology report: The new Visual Editor gets a bit more visual
- WikiProject report: WikiProject Human Rights
Wtf??
Why did u decline my article?? — Preceding unsigned comment added by 66.87.97.103 (talk) 01:11, 12 December 2012 (UTC)
The WikiProject Articles for creation newsletter
The Signpost: 17 December 2012
- News and notes: Arbitrator election: stewards release the results
- WikiProject report: WikiProjekt Computerspiel: Covering Computer Games in Germany
- Discussion report: Concise Wikipedia; section headings for navboxes
- Op-ed: Finding truth in Sandy Hook
- Featured content: Wikipedia's cute ass
- Technology report: MediaWiki groups and why you might want to start snuggling newbie editors
The Signpost: 24 December 2012
- WikiProject report: A Song of Ice and Fire
- Featured content: Battlecruiser operational
- Technology report: Efforts to "normalise" Toolserver relations stepped up
The Signpost: 31 December 2012
- From the editor: Wikipedia, our Colosseum
- In the media: Is the Wikimedia movement too 'cash rich'?
- News and notes: Wikimedia Foundation fundraiser a success; Czech parliament releases photographs to chapter
- Technology report: Looking back on a year of incremental changes
- Discussion report: Image policy and guidelines; resysopping policy
- Featured content: Whoa Nelly! Featured content in review
- WikiProject report: New Year, New York
- Recent research: Wikipedia and Sandy Hook; SOPA blackout reexamined
The Signpost: 07 January 2013
- WikiProject report: Where Are They Now? Episode IV: A New Year
- News and notes: 2012—the big year
- Featured content: Featured content in review
- Technology report: Looking ahead to 2013
The Signpost: 14 January 2013
- Investigative report: Ship ahoy! New travel site finally afloat
- News and notes: Launch of annual picture competition, new grant scheme
- WikiProject report: Reach for the Stars: WikiProject Astronomy
- Discussion report: Flag Manual of Style; accessibility and equality
- Special report: Loss of an Internet genius
- Featured content: Featured articles: Quality of reviews, quality of writing in 2012
- Arbitration report: First arbitration case in almost six months
- Technology report: Intermittent outages planned, first Wikidata client deployment
DRN bot
Hi, so at WT:DRN there is a discussion on making subpages for each case (sort of a WP:SPI style). Would you modify the bot if the proposal succeeds (it's very likely to succeed)? If you would, how long would it take?
Copied from Steven Zhang to explain SPI style:
Essentially, how this works is: instead of each dispute being a thread on the one page, each dispute would have its own page that is created by the filer, very similar to the format of WP:SPI. When a dispute is closed (as resolved or otherwise), it's archived and can be easily referred back to if a dispute is filed again. Potential positives of the change are a more organised format and being able to look back on past discussions more easily. Negatives include the loss of all cases being easily viewable on a watchlist, [ ... ], and criticism of increased bureaucracy.
~~Ebe123~~ → report 14:43, 28 December 2012 (UTC)
- I can't give an ETA on that, but I would make an effort to fix the bot since it wouldn't be able to run otherwise. I'd expect it to take no more than a single day of work. For what it's worth, I definitely agree that having subpages would probably result in an easier system overall, and whether or not it is a lot of work for me shouldn't affect whether you are going to make the change. It would be nice if you could let me know when the proposal is about to close so I can have the bot ready by then, but other than that, I don't foresee any problems. — Earwig talk 19:30, 28 December 2012 (UTC)
- I would be closing the proposal on the 10th of January; 1 month after it was proposed. Thank you for your answer. ~~Ebe123~~ → report 22:04, 28 December 2012 (UTC)
- I've "closed" the discussion as a pass. You might want to see the bottom of the DRN page to see an example of the SPI style. ~~Ebe123~~ → report 22:43, 11 January 2013 (UTC)
- Alright. The bot's not ready yet, but I'll try to get the new code done by tonight (which will be around 05:00 UTC on Jan 14). Note that there will be a period where the bot will only recognize cases that are transcluded (new system), excluding ones that are directly on the page (old system). I won't update the bot with the new code until you are ready, which doesn't seem to be the case right now. — Earwig talk 19:17, 13 January 2013 (UTC)
- Code is written, but untested. I'm ready to update the bot when you are. — Earwig talk 03:39, 16 January 2013 (UTC)
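As an aside, recognizing cases under the new transcluded-subpage model is straightforward with mwparserfromhell, since a transcluded page parses as a "template" whose name is the full page title. A sketch only, assuming the subpage naming convention; this is not the bot's actual code:

import mwparserfromhell

def find_open_cases(drn_wikitext):
    """Return the titles of case subpages transcluded onto the noticeboard."""
    prefix = "Wikipedia:Dispute resolution noticeboard/"  # assumed convention
    code = mwparserfromhell.parse(drn_wikitext)
    return [str(tmpl.name).strip() for tmpl in code.filter_templates()
            if str(tmpl.name).strip().startswith(prefix)]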
- We will need to work on restructuring the page before we can go ahead with any changes (I'll get back to you on that). Part of the bot code notifies my WMF account when a thread has been open for more than ten days; can this be commented out of the code for now (made inactive, but not removed)? I don't need it now (and have finished with the WMF), but I may do something else with it later. Szhang (WMF) (talk) 13:56, 16 January 2013 (UTC)
- Done. Thanks. — Earwig talk 23:36, 16 January 2013 (UTC)
Copyvio detector - mirror found
I have found a mirror while using EarwigBot's copyvio detector: wpedia.goo.ne.jp. Can you add it to the ignore list, please? (Great tool, BTW!) Mdann52 (talk) 14:00, 17 January 2013 (UTC)
- Looks like it was already in one of the mirror lists, but the bot missed it due to a bug, which I've now fixed. Thanks for the kind words! — Earwig talk 22:38, 17 January 2013 (UTC)
False positive
So, I think there's something wrong with the copyvio detector. I ran [1] and got "Jatin Seth is a suspected violation of wiki.riteme.site/wiki/Jatin_Seth." I'm pretty sure that's not supposed to happen. —Darkwind (talk) 03:24, 19 January 2013 (UTC)
- Should be fixed now; actually a bug introduced by the thread above (gah, this is not my week). Sorry about that! — Earwig talk 04:53, 19 January 2013 (UTC)
AFC Stats bot is creating templates too large to transclude
See Wikipedia talk:WikiProject Articles for creation/2014 5#Template limit means backlog list doesn't show up in /Submissions for discussion. davidwr/(talk)/(contribs)/(e-mail) 22:37, 21 January 2013 (UTC)
- Hi. We've had this problem for years, and there is no real solution other than to review submissions. See 1, 2, 3, 4, 5, 6; there are probably others. — Earwig talk 02:10, 22 January 2013 (UTC)
- The special page Special:ExpandTemplates may be the solution. It basically turns wiki-code into plain old HTML. I ran a copy of the template through this special page and generated User:Davidwr/sandbox2/AFC statistics special expand templates. User:Davidwr/sandbox2 transcludes this page twice and it seems to work fine. Now the big question: Can you access this function programmatically? Credit to: Wikipedia:Template limits#Special:Expandtemplates. davidwr/(talk)/(contribs)/(e-mail) 04:16, 22 January 2013 (UTC)
- Well, there's a reason I use sub-templates in the statistics template, which is that it would be far too large otherwise. User:Davidwr/sandbox2/AFC statistics special expand templates is ~587 kB when based off of a ~152 kB page. Sometimes {{AFC statistics}} will reach over 400 kB, which translates to over 1,500 kB when expanded completely. I'm not even sure if MediaWiki will let us save a page that large. Granted, you do present a solution, but I'm not sure if it's a good idea to implement it. — Earwig talk 04:43, 22 January 2013 (UTC)
- Both User:Davidwr/sandbox2/AFC statistics special expand templates and the current (as of 06:25, 22 January 2013 (UTC)) version of {{AFC statistics}} deliver about 900KB of HTML to the browser. I tried saving a file that had 4 copies of User:Davidwr/sandbox2/AFC statistics special expand templates. It previewed okay except I got the following error during the preview:
- ERROR: The text you have submitted is 2,292 kilobytes long, which is longer than the maximum of 2,000 kilobytes. It cannot be saved.
- So, as long as things stay under 2,000 KB, you won't break the Wiki. davidwr/(talk)/(contribs)/(e-mail) 06:25, 22 January 2013 (UTC)
- But I will be using up a good deal more server space than I am currently, which is what bothers me. The template is still functional if you view it from {{AFC statistics}}; the problem only occurs if you try to transclude it. I recognize there is a loss in functionality if you have to physically go to the chart to view it, but is gaining that functionality back under certain circumstances worth the 3x disk space usage? — Earwig talk 06:40, 22 January 2013 (UTC)
- That is a question that depends on how "expensive" disk space is to the project. I've been on projects where "You want to use 4MB more just for THAT? We are short of disk space and our backup schedule is tight as it is! NO!" was a reality, and on projects where we had terabytes coming out the wazoo and backup and other costs for a 4MB file would be lost in the noise. OK, I'm exaggerating a bit in both cases but you get the idea. If the foundation is pressing you to conserve space, then this is a feature that can be sacrificed.
- On the other hand, the server operators will get the benefit of reduced CPU usage since the page won't have to be reparsed as much. davidwr/(talk)/(contribs)/(e-mail) 15:38, 22 January 2013 (UTC)
- All good points. I guess I'm nervous because I have experience with using too much disk space. We're looking at maybe 900 MB/month near the more extreme end of the spectrum. — Earwig talk 18:45, 22 January 2013 (UTC)
- Ouch. If the person who turned off the bot was concerned because of overall disk I/O then that may still be an issue today. If they were concerned because you were using 900MB at one time for logs, that won't be a problem here. davidwr/(talk)/(contribs)/(e-mail) 18:51, 22 January 2013 (UTC)
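On davidwr's earlier question about accessing Special:ExpandTemplates programmatically: the same expansion is exposed through the MediaWiki API as action=expandtemplates. A minimal sketch using the requests library; the response shape shown is that of the old (pre-1.24) API:

import requests

API = "https://en.wikipedia.org/w/api.php"

def expand_templates(wikitext, title="Template:AFC statistics"):
    """Ask MediaWiki to fully expand all templates in the given wikitext."""
    params = {
        "action": "expandtemplates",
        "title": title,  # page context used during expansion
        "text": wikitext,
        "format": "json",
    }
    resp = requests.get(API, params=params)
    resp.raise_for_status()
    return resp.json()["expandtemplates"]["*"]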
The Signpost: 21 January 2013
- News and notes: Requests for adminship reform moves forward
- WikiProject report: Say What? — WikiProject Linguistics
- Featured content: Wazzup, G? Delegates and featured topics in review
- Arbitration report: Doncram case continues
- Technology report: Data centre switchover a tentative success
SWMT tool
Hi. I know I have a global account, but your tool says otherwise. Also, if I fail to enter a username (which it claims is not required), it crashes completely. Just checking if you were aware of these issues (or if the tool is supposed to work at all!) Mdann52 (talk) 13:21, 29 January 2013 (UTC)
- Yeah, honestly, this isn't functional at the moment. I haven't touched it in something like two and a half years and it's pretty low-priority right now. You can tell because it's still using the old monobook interface that my newer tools don't/won't have. — Earwig talk 20:30, 29 January 2013 (UTC)
The Signpost: 28 January 2013
- In the media: Hoaxes draw media attention
- Recent research: Lessons from the research literature on open collaboration; clicks on featured articles; credibility heuristics
- WikiProject report: Checkmate! — WikiProject Chess
- Discussion report: Administrator conduct and requests
- News and notes: Khan Academy's Smarthistory and Wikipedia collaborate
- Featured content: Listing off progress from 2012
- Arbitration report: Doncram continues
- Technology report: Developers get ready for FOSDEM amid caching problems
The Signpost: 04 February 2013
- Special report: Examining the popularity of Wikipedia articles
- News and notes: Article Feedback Tool faces community resistance
- WikiProject report: Land of the Midnight Sun
- Featured content: Portal people on potent potables and portable potholes
- In the media: Star Trek Into Pedantry
- Technology report: Wikidata team targets English Wikipedia deployment
AFC Stats bot stopped?
I've noticed the bot hasn't updated the AFC statistics page in two days. Was wondering if this was done on purpose or it's just that it can't handle the backlog. Funny Pika! 03:11, 9 February 2013 (UTC)
- Hi, thanks for the note. It looks like something broke, actually. Not entirely sure what happened. I've restarted the bot, which should hopefully fix it. — Earwig talk 13:17, 9 February 2013 (UTC)
mwparserfromhell's README
Hi. I just wanted to drop you a quick note to say that I read mwparserfromhell's README the other day and found it to be some of the best documentation I'd ever read. Very nicely done. :-) --MZMcBride (talk) 04:31, 13 February 2013 (UTC)
- Hey, thanks! That's quite a compliment, especially coming from someone like yourself. — Earwig talk 04:41, 13 February 2013 (UTC)
The Signpost: 11 February 2013
- Featured content: A lousy week
- WikiProject report: Just the Facts
- In the media: Wikipedia mirroring life in island ownership dispute
- Discussion report: WebCite proposal
- Technology report: Wikidata client rollout stutters
Problem with Template:AFC statistics
Hello! In case you were unaware, the Template:AFC statistics/sandbox hasn't been updating for some time. The Template:AFC statistics seems broken; since they are connected, maybe it's all part of the same problem. Thanks for creating these lists; I find them very helpful. —Anne Delong (talk) 13:18, 18 February 2013 (UTC)
- I'm confused. I didn't know what Template:AFC statistics/sandbox was until right now, and it looks like the bot has never had a role in updating it. If you mean Template:AFC statistics is broken because it has a lot of broken templates listed at the top, that's simply because we're flooded by pending submissions right now and MediaWiki can't display all of them. However, it looks like the bot is still updating it fine. Best solution is to review as much as possible. — Earwig talk 16:05, 18 February 2013 (UTC)
The Signpost: 18 February 2013
- WikiProject report: Thank you for flying WikiProject Airlines
- Technology report: Better templates and 3D buildings
- News and notes: Wikimedia Foundation declares 'victory' in Wikivoyage lawsuit
- In the media: Sue Gardner interviewed by the Australian press
- Featured content: Featured content gets schooled
Bot adding collapse templates at dispute resolution noticeboard
I've just removed an extra template:DRN archive top from a closed thread at WP:DRN. This is the second one that's turned up recently. After a little look through the page history, I discovered the culprit: EarwigBot! ([2] and [3].) It seems to only do it when the 'do not archive until' comment isn't removed, so it can be prevented by always remembering to remove the comment when closing, but the bot clearly isn't working properly as no bottom template is added and even if it was, the template is being added to already collapsed threads, making it pointless. CarrieVS (talk) 15:04, 21 February 2013 (UTC)
- Found the culprit; should be fixed. — Earwig talk 23:21, 21 February 2013 (UTC)
- Thanks. CarrieVS (talk) 19:04, 22 February 2013 (UTC)
Thomas E. Emerson
Hi Earwig:
I am asking what I (actually my student) need to do to improve Thomas E. Emerson's Wikipedia page, which was declined recently. I see the request for independent references is one thing. http://wiki.riteme.site/wiki/Wikipedia_talk:Articles_for_creation/Thomas_E._Emerson
Dr. Emerson is one of the most famous archaeologists doing Eastern North American prehistory, the author of numerous well received books, edited volumes, and papers. I am more than willing to edit the page and organize it better, and am wondering how to proceed to meet your concerns about independent references. In our field we have book reviews that are published in academic journals, and some of those could be cited. The books themselves that he wrote are published, and could also be referenced.
FYI, I am a professor of archaeology at the University of Tennessee, and I have my students write articles for Wikipedia on a famous archaeologist, archaeological site, or archaeological project every time I teach an advanced undergraduate class in North American archaeology, provided there is no previously published article on the subject on Wikipedia.
I am not as proficient at Wikipedia as I could be, but believe it is a critically important reference tool, which is why I support it and try to build its intellectual content. My students have posted >100 articles over the past 5 years. I am always puzzled why some articles go up with little or no comment, while others have more problems. I appreciate what you all do, and just want to learn to do it better.
Feel free to email me back if you want... dander19@utk.edu
Thanks! 160.36.65.208 (talk) 21:16, 24 February 2013 (UTC) David
David G. Anderson, Ph.D., RPA Professor and Associate Head Department of Anthropology The University of Tennessee 250 South Stadium Hall Knoxville, Tennessee 37996-0720 dander19@utk.edu http://web.utk.edu/~anthrop/faculty/anderson.html http://pidba.tennessee.edu/ http://bellsbend.pidba.org/
- Hi. As you mention, citing book reviews from academic journals is a good way to prove notability. Think of it this way: you might say that he is "one of the most famous archaeologists doing Eastern North American prehistory", but how can readers believe that claim? Examples of him and his work being discussed by other individuals who are unconnected with him (independent, i.e., they do not have any personal incentive for him to be successful) are a great way to prove the notability of a subject. I have not given the article text itself a thorough review due to that issue, but it seems good from my cursory look. The "Selected Papers" section, however, could be shortened a lot to focus on the most significant papers. Web links, if anything is online, are also helpful, but not required. — Earwig talk 21:54, 24 February 2013 (UTC)
Wikiproject Articles for creation Needs You!
WikiProject AFC is holding a one month long Backlog Elimination Drive!
The goal of this drive is to eliminate the backlog of unreviewed articles. The drive is running from March 1st, 2013 – March 31st, 2013.
Awards will be given out for all reviewers participating in the drive in the form of barnstars at the end of the drive.
There is a backlog of over 2000 articles, so start reviewing articles! Visit the drive's page and help out!
Delivered by User:EdwardsBot on behalf of Wikiproject Articles for Creation at 13:54, 27 February 2013 (UTC)
The Signpost: 25 February 2013
- Recent research: Wikipedia not so novel after all, except to UK university lecturers
- News and notes: "Very lucky" Picture of the Year
- Discussion report: Wikivoyage links; overcategorization
- Featured content: Blue birds be bouncin'
- WikiProject report: How to measure a WikiProject's workload
- Technology report: Wikidata development to be continued indefinitely
Peter Tiboris
March 2, 2013
Hello Earwig,
I am writing to you because you have given me pointers about how to improve an article about the conductor and producer Peter Tiboris. I took your advice, reworked the article, and resubmitted it about three weeks ago. The article has been rejected again, by an admittedly inexperienced editor, due to lack of evidence of the subject's notability. This is the exact same rejection form used previously. Before resubmitting the article, I added numerous citations of articles and reviews that appeared in The New York Times, The New Yorker, and other well-known and verifiable sources.
When I click on the link to edit the rejected article, I am taken to an article about a rap singer. I don't quite understand this connection. Mr. Tiboris is well known in the music industry as a classical music conductor and a producer of concerts. He has conducted orchestras in 20 countries and produced 1200 concerts throughout the world, primarily in New York's Carnegie Hall, over a 30-year period.
I don't know what to do next which is why I am writing to you.
Many thanks for your suggestions about Article: Peter Tiboris.
Sincerely,
Dale Zeidman Dzeidman (talk) 01:03, 3 March 2013 (UTC)
The Signpost: 04 March 2013
- News and notes: Outing of editor causes firestorm
- Featured content: Slow week for featured content
- WikiProject report: WikiProject Television Stations
The Signpost: 11 March 2013
- From the editor: Signpost–Wikizine merger
- News and notes: Finance committee updates
- Featured content: Batman, three birds and a Mercedes
- Arbitration report: Doncram case closes; arbitrator resigns
- WikiProject report: Setting a precedent
- Technology report: Article Feedback reversal
mwparserfromhell doesn't seem to recognize ref tags?
Hello, you seem to be responsible for mwparserfromhell development, so I hope you don't mind me asking: I posted a question at WP:BON regarding parsing out ref tags with mwparserfromhell v. 0.1.1. It doesn't seem to be working for me; should I be expecting it to? Appreciate your input, cheers... Zad68
15:49, 13 March 2013 (UTC)
- Hi! Yes, support for <ref> tags is not in version 0.1.1, but it will be in 0.2 when that comes out. If you need it right now and have some knowledge of git, you can clone mwparserfromhell's feature/html_tags branch, which has support for tags like <ref> (although it's a bit buggy, so I wouldn't trust it). You would also need to explicitly use the Python tokenizer instead of the C extension, which is used by default if you install the library normally and are on Python 2. If you need further pointers doing any of those things, I can provide – just be more specific. As for an ETA on the finished code, I can't make any promises, but it should be done this month. — Earwig talk 21:49, 13 March 2013 (UTC)
- Thanks very much, both for the answer, and for your development of the libraries. Instead of me using the 0.2 libraries, in the short term, can you give me a re.compile() pattern for the ref tags? That's all I'm looking to extract at this point. I'd be happy to help you test or debug the 0.2 libraries if you're looking for such help, let me know. Cheers...
Zad68
00:15, 14 March 2013 (UTC)
- I'm not a huge fan of regex (the parser doesn't use it), and it's pretty awful for parsing wikitext in general, so your guess is as good as mine for that. Meanwhile, your help testing the library in the future would be very appreciated. — Earwig talk 00:38, 14 March 2013 (UTC)
- OK then, I'll bite: what do you use to help parse, if not regexes? I use them all the time and find them useful. OK, I'll try to figure out how to use git to get 0.2; there's no better way to figure out how useful something is than using it. I'm going to build a GA review tool and parse a bunch of articles. Thanks...
Zad68
01:02, 14 March 2013 (UTC)
- The parser works by using a tokenizer that converts the wikicode string into a series of tokens, and then builds the tokens into data structures that are easy to manipulate. So {{foo|bar}} gets converted into the tokens [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateClose()], which then get converted into a Template object with the data stored within it. The series of objects that make up the wikicode are then wrapped in a Wikicode object, which has methods like filter_templates(). The entire process is a lot more complex than just regex, because regex is prone to catastrophic failures when the input is not exactly what it expects, whereas a tokenizer can handle confusing cases, like nested templates and wikicode that looks like a template but actually isn't because there's an invalid character in the template's name. The regex necessary for that to work properly would be far too complex, and probably impossible. — Earwig talk 01:38, 14 March 2013 (UTC)
- I can see you've made two terrible errors here: 1) you know what you're doing, and 2) you allowed me to find that out by being responsive to my questions! Hope you don't mind if I keep hitting you up for answers here and there. Say, what might you know about the reporting side of the Wiki bot world? I'd love to find out more about how User:Mr.Z-man's project reports work, and he's so awfully busy that he doesn't often answer questions. I recently got my Toolserver account set up; one of the things on my "To Figure Out Eventually" list was how to duplicate and expand some of his project-level reporting options.
Zad68
02:37, 14 March 2013 (UTC)
- Heh, thanks. I'm happy to answer questions whenever I'm not swamped with other work. As for Mr.Z-man's project reports, I don't know what to say since I don't know anything about them. Your best bet would be to show me an example of what he has running and what you want to do with it. — Earwig talk 02:56, 14 March 2013 (UTC)
So after installing git, the Python dev libraries (needed for Python.h, because it looks like it's doing some C compiling), and doing a little reading, I got as far as getting the dev mwparserfromhell lib downloaded, installed locally, and built, and it looks like I'm using it, but the processing behavior is the same as before: the ref tags aren't parsed out:
$ git clone -b feature/html_tags git://github.com/earwig/mwparserfromhell.git
...
$ python setup.py install --user
running install
running bdist_egg
running egg_info
...
Adding mwparserfromhell 0.2.dev to easy-install.pth file
Installed /home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg
Processing dependencies for mwparserfromhell==0.2.dev
Finished processing dependencies for mwparserfromhell==0.2.dev
$ python
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mwparserfromhell
>>> mwparserfromhell.__version__
u'0.2.dev'
>>> mwparserfromhell.__file__
'/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/__init__.pyc'
>>> text = "I has a template!<ref>{{foo|bar|baz|eggs=spam}}</ref> See it?"
>>> wikicode = mwparserfromhell.parse(text)
>>> wikicode.filter_templates()
[u'{{foo|bar|baz|eggs=spam}}']
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_text()
[u'I has a template!<ref>', u'</ref> See it?']
Is there some development flag or something I have to enable to get it to parse the tags? Hope you don't mind me stinking up your talk page with this; if you'd rather do this somewhere else, let me know. Also, my email is enabled; that's good for me too. Any help/direction appreciated, cheers... Zad68
03:27, 14 March 2013 (UTC)
Maybe this has something to do with where you said "You would also need to explicitly use the Python tokenizer instead of the C extension", but I'm not sure how to do that. Zad68
03:34, 14 March 2013 (UTC)
- Yup, that's why it's not working. I just realized that I haven't given an easy way to force the Python tokenizer to be used, so try this (temporarily – I'll try to get on something easier tomorrow, since it's already midnight):
>>> import mwparserfromhell
>>> from mwparserfromhell.parser.tokenizer import Tokenizer
>>> from mwparserfromhell.parser.builder import Builder
>>> text = "I has a template!<ref>{{foo|bar|baz|eggs=spam}}</ref> See it?"
>>> wikicode = Builder().build(Tokenizer().tokenize(text))
>>> wikicode.filter_templates(recursive=True) # 'recursive' needed because template is nested inside tag
[u'{{foo|bar|baz|eggs=spam}}']
>>> wikicode.filter_tags()
[u'<ref>{{foo|bar|baz|eggs=spam}}</ref>']
>>> tag = wikicode.filter_tags()[0]
>>> tag.tag
u'ref'
>>> tag.type == tag.TAG_REF
True
>>> tag.contents
u'{{foo|bar|baz|eggs=spam}}'
>>> tag.contents.filter_templates()[0].name
u'foo'
>>> tag.contents.filter_templates()[0].params
[u'bar', u'baz', u'eggs=spam']
Hope that helps. Oh, and my talk page is fine for this sort of discussion. — Earwig talk 03:59, 14 March 2013 (UTC)
- Yeah baby that's the stuff! Works now, thanks, off to build my GA tool, cheers...
Zad68
12:56, 14 March 2013 (UTC)
- You should now be able to do just this (after git pull-ing the repository):
>>> import mwparserfromhell
>>> mwparserfromhell.parser.use_c = False
>>> wikicode = mwparserfromhell.parse(text)
- Works great, thanks! Zad68 03:34, 15 March 2013 (UTC)
mwparserfromhell 0.2-dev bug: doesn't parse ref tag name parameters with double-quotes and hyphen
Hey Earwig, try testing this:
wikicode = Builder().build(Tokenizer().tokenize('<ref name="a-b">'))
The error I get is in my local tokenizer.py, line 472, in _actually_close_tag_opening:
if isinstance(self._stack[-1], tokens.TagAttrStart)
IndexError: list index out of range
It only seems to occur when: 1) it's a ref tag, 2) the name parameter is specified and has a value with certain characters in it, like - (hyphen) or = (equals), and 3) the name parameter value is in double quotes. Bug? Zad68
14:10, 14 March 2013 (UTC)
- Yup, okay. I'll look into this. — Earwig talk 23:16, 14 March 2013 (UTC)
mwparserfromhell 0.2-dev bug: self-closing tags not handled properly
Self-closing tags don't seem to be handled properly:
$ python
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mwparserfromhell
>>> from mwparserfromhell.parser.tokenizer import Tokenizer
>>> from mwparserfromhell.parser.builder import Builder
>>>
>>> # Without a self-closing ref tag, it works:
>>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref>'))
>>> wikicode.filter_tags()
[u'<ref name=foo>{{bar}}</ref>']
>>> wikicode.filter_tags(recursive=True)
[u'<ref name=foo>{{bar}}</ref>']
>>> # With a self-closing tag, it doesn't work:
>>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref><ref name=baz/>'))
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_text()
[u'baz']
>>> wikicode.filter_tags(recursive=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 376, in filter_tags
    return list(self.ifilter_tags(recursive, matches, flags))
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 301, in ifilter
    for node in nodes:
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 82, in _get_all_nodes
    for child in self._get_children(node):
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 59, in _get_children
    for context, child in node.__iternodes__(self._get_all_nodes):
AttributeError: 'NoneType' object has no attribute '__iternodes__'
>>> # Edge case with a self-closing tag only:
>>> wikicode = Builder().build(Tokenizer().tokenize('<ref name=foo/>'))
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_text()
[u'foo']
>>> # If the tag isn't "ref", the behavior is different but still incorrect:
>>> # no stack trace, but it doesn't work either.
>>> wikicode = Builder().build(Tokenizer().tokenize('I has<bloop name=baz/> a template!'))
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_tags(recursive=True)
[]
Any questions let me know... Zad68
16:08, 14 March 2013 (UTC)
- I'll look into this one too. — Earwig talk 23:16, 14 March 2013 (UTC)
- Adding: Probably related (if not identical), but also look at the handling of this sort of embedded self-closing tag inside a tag:
>>> wikicode = Builder().build(Tokenizer().tokenize("==Epidemiology==\nFoo.<ref>hi<br />there</ref>"))
>>> # This looks OK:
>>> wikicode.filter_tags()
[u'<ref>hi<br />there</ref>']
>>> # But doing it recursively yields a slightly different stack trace:
>>> wikicode.filter_tags(recursive=True)
Traceback (most recent call last):
...
AttributeError: 'NoneType' object has no attribute 'nodes'
- Thanks... Zad68 03:13, 15 March 2013 (UTC)
mwparserfromhell 0.2-dev bug: doesn't seem to parse out sections
Check out:
>>> text = 'I has a template!\nfoo\n==bar==\n===baz===\nend'
>>> wikicode = Builder().build(Tokenizer().tokenize(text))
>>> wikicode.get_sections()
[u'I has a template!\nfoo\n', u'==bar==\n===baz===\nend', u'===baz===\nend']
Is that what I should be expecting? Cheers... Zad68
17:00, 14 March 2013 (UTC)
- This looks right; I'm not sure what you were expecting differently. The first is the lead section, the second is the level-two section starting with "bar" (including its child, "baz"), and the third is the level-three section starting with "baz". Check the documentation for get_sections() to see the various parameters. — Earwig talk 23:15, 14 March 2013 (UTC)
- OK, my misunderstanding then. What I'd like is a list of just the section names, not the text. Is that possible with the API?
Zad68
23:21, 14 March 2013 (UTC)
- Just use filter() with the forcetype parameter, like...
>>> text = 'I has a template!\nfoo\n==bar==\n===baz===\nend'
>>> wikicode = mwparserfromhell.parse(text)
>>> wikicode.filter(forcetype=mwparserfromhell.nodes.Heading)
[u'==bar==', u'===baz===']
filter_templates() (and friends) are just wrappers around, e.g., filter(forcetype=mwparserfromhell.nodes.Template). — Earwig talk 23:34, 14 March 2013 (UTC)
- Works like a champ! Zad68 03:13, 15 March 2013 (UTC)
Template:AFC statistics
The AFC statistics template the bot makes is so big that it is no longer usable.
In addition to the main template your bot generates, can you have the bot generate a version that is broken into parts, each no bigger than a few hundred entries? My personal preference would be one part for each section except "pending", and breaking "pending" up by day or week, with one part per day or week. For those who like to work on the biggest or smallest submissions, a separate page showing the 100 biggest and 100 smallest submissions would be useful.
The idea is that when the backlog is small, we can use the whole template, but when it is large, we can use the various parts. davidwr/(talk)/(contribs)/(e-mail) 02:47, 27 February 2013 (UTC)
- This is a fair point. I can't look at this right now, but I don't want to have the bot updating multiple pages since that would be difficult to manage and would produce more edits than I'd be happy with. A possible solution may be to limit its total output to the "top" X submissions – perhaps something like the 150 oldest, 150 newest, 150 smallest, and 150 largest? (600 should be a fair total amount, but I might want to go lower). I can also have a page on the Toolserver render out the entire chart if I don't save it on-wiki. — Earwig talk 04:03, 27 February 2013 (UTC)
- I definitely like the toolserver option to hold the whole list. I would recommend that the "top nnn" be only for those that are pending and awaiting review, and not include those that are in a draft state. 150 x 4 may be too many if you want to transclude it. Experiment, if 600 is too many, maybe 500 or 400 would work without "breaking the wiki." davidwr/(talk)/(contribs)/(e-mail) 04:39, 27 February 2013 (UTC)
- Note, the 150 oldest, 150 newest, 150 smallest, and 150 largest combined could come to any number between 300 and 600, since the groups can overlap. Martijn Hoekstra (talk) 07:06, 27 February 2013 (UTC)
- Yes. I'll look into this more carefully; it shouldn't be too hard, but I don't think I'll have anything final before next week. — Earwig talk 22:56, 27 February 2013 (UTC)
- I've created some new categories that split the backlog up by day. This should help the backlog drive while waiting on a usable list view. See the talk page for details. davidwr/(talk)/(contribs)/(e-mail) 01:09, 1 March 2013 (UTC)
- Can we split the template? JayJayWhat did I do? 16:53, 1 March 2013 (UTC)
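To make the "top X" idea above concrete, here is a sketch of the trimming step in Python; the submission field names ("title", "submitted_at", "size") are assumptions, not the bot's actual data model:

import heapq

def trim_chart(submissions, n=150):
    """Keep only the n oldest, newest, smallest, and largest submissions."""
    groups = [
        heapq.nsmallest(n, submissions, key=lambda s: s["submitted_at"]),  # oldest
        heapq.nlargest(n, submissions, key=lambda s: s["submitted_at"]),   # newest
        heapq.nsmallest(n, submissions, key=lambda s: s["size"]),          # smallest
        heapq.nlargest(n, submissions, key=lambda s: s["size"]),           # largest
    ]
    # Deduplicate by title, since one page can fall into several groups.
    keep = {s["title"]: s for group in groups for s in group}
    return list(keep.values())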
Hi Earwig, I've converted some of the AFC statistics data to Lua templates, and it seems to work. A few caveats:
- It doesn't do everything yet, notably the diffs on the timestamps.
- Some misplaced articles come out broken (easily fixed)
- The page (1.9 MB) takes 33 seconds to load on my machine. This could be considered unacceptable.
On the plus side, it does render the entire page, without the brokenness the current template displays. Might this be some way forward? You can find my test case at User:Martijn_Hoekstra/templatetest (which might, as I said, take some time to load). (Note to lurkers: this was created manually and will NOT be updated as new articles are added/reviewed.) Martijn Hoekstra (talk) 11:10, 18 March 2013 (UTC)
- Ooooh, this I like a lot. I had completely forgotten about Lua. The rendering time seems to be fine on my machine, and it looks like a very simple code adjustment for the bot. I'm assuming you're going to work further on this and iron out the missing things? I'd like to work on making this live. — Earwig talk 23:50, 18 March 2013 (UTC)
- If you want you can switch right away. The short title parameter is no longer required, the template calculates that itself, which makes up for the slightly longer template invocation. It might exhibit slight brokenness from time to time while I work on it, but that should be no more than an hour of brokenness a day, for a maximum of 5 days (but a total of two hours of brokenness, and not broken for longer than 10 minutes at a time is closer to my expectation). I haven't tested replacing the content of {{AfC statistics}} with a call to the Lua template, but I think that would run right into the same problems. The misplaces articles come out broken bug is fixed by the way. Martijn Hoekstra (talk) 10:17, 19 March 2013 (UTC)
Show you what I'm working on with the libraries...
Hello Earwig, just wanted to show you what my (first) goal is in using the mwparserfromhell libraries. The intent of the bot I'm building is to assist me, and any other Wikipedia editor who finds it useful, in getting a jump-start on doing GA reviews, and especially GA reviews of medical articles. As an example of what it'd look like, I ran my bot on Alzheimer's disease (after some massaging to work around the few bugs mentioned above), and the output looks like this. It pulls the Level-2 and Level-3 section headings because I like to make GA review notes section-by-section as I go through the article. It also uses the ref-tag processing to pull all the refs in the article into a Sources table for review. (I like to actually go through every source, verify it's WP:RS, and do a lot of checking that the source is used properly.) As an additional helper, it uses the template processing to identify all the [v]cite journal templates, pulls the PMID for each one, and then goes to PubMed to pull the article's type and put it in the table - for medical articles we really insist on secondary sources like review articles and meta-analyses. The bot even handles the case where a single ref is bundled and has multiple journal templates with PMIDs. Just wanted to share, maybe solicit suggestions, and ... well, it's be great to get the issues fixed; when they are, I'm celebrating the pony way. Appreciate all you're doing with the libraries... Zad68
03:51, 19 March 2013 (UTC)
Template:AFC statistics
The AFC statistics template the bot makes is so big that it is no longer usable.
In addition to the main template your bot generates, can you have the bot generate a version that is broken into parts, each no bigger than a few hundred entries? My personal preference would be one part for each section except "pending", with "pending" broken up by day or week, one part per day or week. To satisfy those who like to work on the biggest or smallest submissions, a separate page showing the 100 biggest and 100 smallest submissions would be useful.
The idea is that when the backlog is small, we can use the whole template, but when it is large, we can use the various parts. davidwr/(talk)/(contribs)/(e-mail) 02:47, 27 February 2013 (UTC)
- This is a fair point. I can't look at this right now, but I don't want to have the bot updating multiple pages since that would be difficult to manage and would produce more edits than I'd be happy with. A possible solution may be to limit its total output to the "top" X submissions – perhaps something like the 150 oldest, 150 newest, 150 smallest, and 150 largest? (600 should be a fair total amount, but I might want to go lower). I can also have a page on the Toolserver render out the entire chart if I don't save it on-wiki. — Earwig talk 04:03, 27 February 2013 (UTC)
- I definitely like the toolserver option to hold the whole list. I would recommend that the "top nnn" cover only those that are pending and awaiting review, and not include those that are in a draft state. 150 x 4 may be too many if you want to transclude it. Experiment; if 600 is too many, maybe 500 or 400 would work without "breaking the wiki." davidwr/(talk)/(contribs)/(e-mail) 04:39, 27 February 2013 (UTC)
- Note: the combined list of the 150 oldest, 150 newest, 150 smallest, and 150 largest could be anywhere between 300 and 600 entries, depending on overlap. Martijn Hoekstra (talk) 07:06, 27 February 2013 (UTC)
- Yes. I'll look into this more carefully; it shouldn't be too hard, but I don't think I'll have anything final before next week. — Earwig talk 22:56, 27 February 2013 (UTC)
- I've created some new categories that split the backlog up by day. This should help the backlog drive while waiting on a usable list view. See the talk page for details. davidwr/(talk)/(contribs)/(e-mail) 01:09, 1 March 2013 (UTC)
- Can we split the template? JayJayWhat did I do? 16:53, 1 March 2013 (UTC)
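(For the curious, a quick sketch of the "top X" selection idea above, illustrating Martijn's point that the combined list holds anywhere from 300 to 600 distinct entries depending on overlap. The record layout is hypothetical, not the bot's actual data model.)
 import random

 # Hypothetical submissions: (title, age_in_days, size_in_bytes)
 subs = [("Draft %d" % i, random.randint(0, 90), random.randint(200, 50000))
         for i in range(2000)]

 N = 150
 oldest   = sorted(subs, key=lambda s: s[1], reverse=True)[:N]
 newest   = sorted(subs, key=lambda s: s[1])[:N]
 largest  = sorted(subs, key=lambda s: s[2], reverse=True)[:N]
 smallest = sorted(subs, key=lambda s: s[2])[:N]

 # The union is at most 600 entries; it shrinks toward 300 as the same
 # submissions appear in more than one of the four views.
 combined = {s[0] for view in (oldest, newest, largest, smallest) for s in view}
 print(len(combined))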
Hi Earwig, I've converted some of the AFC statistics data to Lua templates, and it seems to work. A few caveats:
- It doesn't do everything yet, notably the diffs on the timestamps.
- Some misplaced articles come out broken (easy to fix)
- The page (1.9 MB) takes 33 seconds to load on my machine. This could be considered unacceptable.
On the plus side, it does render the entire page, without the brokenness the current template displays. Might this be some way forward? You can find my testcase on User:Martijn_Hoekstra/templatetest (which might, as I said, take some time to load). (note to lurkers, this was created manually, and will NOT be updated as new articles are added/reviewed) Martijn Hoekstra (talk) 11:10, 18 March 2013 (UTC)
- Ooooh, this I like a lot. I had completely forgotten about Lua. The rendering time seems to be fine on my machine, and it looks like a very simple code adjustment for the bot. I'm assuming you're going to work further on this and iron out the missing things? I'd like to work on making this live. — Earwig talk 23:50, 18 March 2013 (UTC)
- If you want you can switch right away. The short title parameter is no longer required – the template calculates that itself, which makes up for the slightly longer template invocation. It might exhibit slight brokenness from time to time while I work on it, but that should be no more than an hour of brokenness a day, for a maximum of 5 days (a total of two hours of brokenness, with no single stretch longer than 10 minutes, is closer to my expectation). I haven't tested replacing the content of {{AfC statistics}} with a call to the Lua template, but I think that would run right into the same problems. The "misplaced articles come out broken" bug is fixed, by the way. Martijn Hoekstra (talk) 10:17, 19 March 2013 (UTC)
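(Aside, for anyone reimplementing this: one plausible way the short title can be derived from the full page title. The prefix below is an assumption for illustration, not taken from Martijn's actual module.)
 # Illustrative sketch only: derive a "short title" by stripping the
 # AfC page prefix (the exact prefix the real module uses may differ).
 AFC_PREFIX = "Wikipedia talk:Articles for creation/"

 def short_title(page_title):
     if page_title.startswith(AFC_PREFIX):
         return page_title[len(AFC_PREFIX):]
     return page_title

 print(short_title("Wikipedia talk:Articles for creation/Example Draft"))
 # -> Example Draft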
Show you what I'm working on with the libraries...
Hello Earwig, just wanted to show you what my (first) goal is in using the mwparserfromhell libraries. The intent of the bot I'm building is to assist me, and any other Wikipedia editor who finds it useful, in getting a jump-start on doing GA reviews, and especially GA reviews of medical articles. As an example of what it'd look like, I ran my bot on Alzheimer's disease (after some massaging to work around the few bugs mentioned above), and the output looks like this. It pulls the Level-2 and Level-3 section headings because I like to make GA review notes section-by-section as I go through the article. It also uses the ref-tag processing to pull all the refs in the article into a Sources table for review. (I like to actually go through every source, verify it's WP:RS, and do a lot of checking that the source is used properly.) As an additional helper, it uses the template processing to identify all the [v]cite journal templates, pulls the PMID for each one, and then goes to PubMed to pull the article's type and put it in the table – for medical articles we really insist on secondary sources like review articles and meta-analyses. The bot even handles the case where a single ref is bundled and has multiple journal templates with PMIDs. Just wanted to share, maybe solicit suggestions, and ... well, it'd be great to get the issues fixed; when they are, I'm celebrating the pony way. Appreciate all you're doing with the libraries... Zad68 03:51, 19 March 2013 (UTC)
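(For readers curious about the mechanics: a minimal sketch of the kind of heading/PMID extraction Zad68 describes, using the public mwparserfromhell API. The PubMed lookup is omitted, this is an illustration rather than the bot's actual code, and it assumes a parser version that fully handles ref-tag contents – which is exactly what the bugs above are about.)
 import mwparserfromhell

 def outline_and_pmids(text):
     """Collect level-2/3 headings and the PMIDs of [v]cite journal templates."""
     code = mwparserfromhell.parse(text)
     headings = [str(h.title).strip() for h in code.filter_headings()
                 if h.level in (2, 3)]
     pmids = []
     for tmpl in code.filter_templates():
         name = str(tmpl.name).strip().lower()
         if name in ("cite journal", "vcite journal") and tmpl.has("pmid"):
             pmids.append(str(tmpl.get("pmid").value).strip())
     return headings, pmids

 heads, pmids = outline_and_pmids("== History ==\n{{cite journal|pmid=123}}")
 print(heads, pmids)  # ['History'] ['123']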
EarwigBot Task 2?
I am writing because EarwigBot has not run task 2 (Articles for Creation statistics page/dashboard) in 3 days (last run: 14:22, 18 March 2013). AfC is currently in a backlog elimination drive, and having to sort through already-reviewed submissions that are not coming off the list is exceptionally annoying. If you could do a manual run of task 2 and verify that it completes successfully, that would be greatly appreciated. Hasteur (talk) 13:30, 21 March 2013 (UTC)
- Hi. There's a bit of bad lag affecting the Toolserver right now, so automatic updating has been disabled. I'm running a manual update now, which I'll try to do semi-periodically, but obviously the list will be a bit behind until the databases catch up. There's nothing I can do about that, other than wait. — Earwig talk 01:31, 22 March 2013 (UTC)
The Signpost: 18 March 2013
- News and notes: Resigning arbitrator slams Committee
- WikiProject report: Making music
- Featured content: Wikipedia stays warm
- Arbitration report: Richard case closes
- Technology report: Visual Editor "on schedule"
Human vulnerability to climate change in the Caribbean
Good morning!
I need to post a page and am concerned that a page I am working on is about to be deleted. Any suggestions? Avewiki (talk) 13:39, 27 March 2013 (UTC)
- (talk page stalker) I've offered some suggestions at User talk:Avewiki which, if followed, will help no end. Fiddle Faddle (talk) 13:59, 27 March 2013 (UTC)
The Signpost: 25 March 2013
- WikiProject report: The 'Burgh: WikiProject Pittsburgh
- Featured content: One and a half soursops
- Arbitration report: Two open cases
- News and notes: Sue Gardner to leave WMF; German Wikipedians spearhead another effort to close Wikinews
- Technology report: The Visual Editor: Where are we now, and where are we headed?
AFC Backlog
Articles for Creation urgently needs YOUR help!
Articles for Creation is desperately short of reviewers! We are looking for urgent help from experienced editors in reviewing submissions in the pending queue. Currently there are 1840 submissions waiting to be reviewed, and many help requests at our Help Desk.
If you are able to help, please read the reviewing instructions and donate a little of your time to tackling the backlog. You might wish to add {{AFC status}} or {{AfC Defcon}} to your userpage, which will alert you to the number of open submissions.
We would greatly appreciate your help. Currently, only a small handful of users are reviewing articles. Any help, even if it's just 2 or 3 reviews, would be extremely beneficial. |
(comment to make MiszaBot archive this) — Earwig talk 03:18, 4 April 2013 (UTC)
The Signpost: 01 April 2013
- Special report: Who reads which Wikipedia?
- WikiProject report: Special: FAQs
- Featured content: What the ?
- Arbitration report: Three open cases
- Technology report: Wikidata phase 2 deployment timetable in doubt
The Signpost: 08 April 2013
- Wikizine: WMF scales back feature after outcry
- WikiProject report: Earthshattering WikiProject Earthquakes
- News and notes: French intelligence agents threaten Wikimedia volunteer
- Arbitration report: Subject experts needed for Argentine History
- Featured content: Wikipedia loves poetry
- Technology report: Testing week
The Signpost: 15 April 2013
- WikiProject report: Unity in Diversity: South Africa
- News and notes: Another admin reform attempt flops
- Featured content: The featured process swings into high gear
The Signpost: 22 April 2013
- WikiProject report: WikiProject Editor Retention
- News and notes: Milan conference a mixed bag
- Featured content: Batfish in the Red Sea
- Arbitration report: Sexology case nears closure after stalling over topic ban
- Technology report: A flurry of deployments
Renaming templates
Hi Earwig--is it possible to rename a template using mwparserfromhell? I thought one could just use template.name('newname'), but apparently that's not the case. —Theopolisme (talk) 11:19, 29 April 2013 (UTC)
- .name is not a function, it's an attribute, so you set it instead of calling it. Try template.name = 'newname'. — Earwig talk 21:38, 29 April 2013 (UTC)
- Facepalm . Theopolisme (talk) 21:48, 29 April 2013 (UTC)
- Heh, glad I could help! No worries. — Earwig talk 21:54, 29 April 2013 (UTC)
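(For later readers, the fix in full: a self-contained example of the attribute assignment. This is standard mwparserfromhell usage, nothing project-specific.)
 import mwparserfromhell

 code = mwparserfromhell.parse("{{Oldname|foo=bar}}")
 template = code.filter_templates()[0]
 template.name = "Newname"   # .name is an attribute: assign it, don't call it
 print(code)                 # {{Newname|foo=bar}}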
The Signpost: 29 April 2013
- News and notes: Chapter furore over FDC knockbacks; First DC GLAM boot-camp
- In the media: Wikipedia's sexism; Yuri Gadyukin hoax
- Featured content: Wiki loves video games
- WikiProject report: Japanese WikiProject Baseball
- Traffic report: Most popular Wikipedia articles
- Arbitration report: Sexology closed; two open cases
- Recent research: Sentiment monitoring; UNESCO and systemic bias; and more
- Technology report: New notifications system deployed across Wikipedia
EarwigBot: Template:DRN case status
Hi! Could you look at my edits to Template:DRN case status and how EarwigBot undoes them? Am I doing something wrong?
BTW, my first attempt was to change the title on the main DRN page. In the past the bot has picked up on the change and updated the template, but that isn't happening either. --Guy Macon (talk) 06:19, 2 May 2013 (UTC)
- I'll look into this. — Earwig talk 23:54, 2 May 2013 (UTC)
- Okay, should be fixed now for the future. As an aside, I'm pretty sure the bot didn't pick up on any previous changes (it doesn't read the chart before updating it – it replaces it entirely). What probably happened was that you updated the chart and the case title at the same time, so it updated the chart based on the latter, not the former. Thanks. — Earwig talk 03:16, 4 May 2013 (UTC)
- Thanks! --Guy Macon (talk) 04:17, 5 May 2013 (UTC)
- Okay, should be fixed now for the future. As an aside, I'm pretty sure the bot didn't pick up on any previous changes (it doesn't read the chart before updating it – it replaces it entirely). What probably happened was that you updated the chart and the case title at the same time, so it updated the chart based on the latter, not the former. Thanks. — Earwig talk 03:16, 4 May 2013 (UTC)
Copyvio detector too shy
In three cases that I identified as heavy copyvios by hand/eye (now hidden from the reader, but still present in the source), the copyvio detector reports between 40 and 50% confidence, and therefore claims "No violations detected" in a happy green box.[4][5][6] (see especially the details). I suggest using a yellow box reading "there are hints of copying" for scores as high as those. Also, calculating confidence at the sentence or paragraph level would help produce a more distinct score. Would it be a problem if the tool were used automatically to scan pages for copyvios (sequentially, of course)? (The response times let me guess that it might be resource-intensive.) --Mopskatze (talk) 17:26, 4 May 2013 (UTC)
- Yeah, I'm hesitant to support mass-reviewing (at least through the webpage) because each detection can take upwards of 30 seconds. I hadn't thought of a yellow "possible" box; this is a good idea that I will work on. I also haven't done enough research on the confidence percentages of confirmed copyvios, but you've given a good indication that the violation threshold might be too high right now (or the three-level system could fix that, I guess – we'll see). As for confidence on a sentence/paragraph level, I'm not completely sure what you mean. What would that look like? — Earwig talk 19:00, 4 May 2013 (UTC)
- Sentence/paragraph level: to identify sections that have been copied as a whole, even if they constitute only a smaller part of the article (as in my examples), it would be useful to check whether a common block (common between the WP article and the possible origin or copy) has a certain length in words (possibly ignoring single words and number representations), and to give it more weight in the total score if it does. On the other hand, parts marked as literal quotes ("" etc.) could receive a lower weight (even though the attribution may still be wrong). --Mopskatze (talk) 01:49, 5 May 2013 (UTC)
- Ah, okay. That sounds useful, but I'm not sure how I could implement it (although I have some possible ideas). I'll make note of it for the future. Tweaking the confidence threshold should come soon. — Earwig talk 04:31, 5 May 2013 (UTC)
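(A toy sketch of the common-block weighting and three-level verdict discussed above: simple word tokenization plus a longest-common-run check. Purely illustrative; the thresholds are made up, and this is not the detector's code.)
 import re

 def words(text):
     # Lowercased word tokens; single letters and bare numbers are ignored,
     # per the suggestion above.
     return [w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 1]

 def longest_common_run(a, b):
     """Length of the longest run of consecutive shared words."""
     best, prev = 0, [0] * (len(b) + 1)
     for i in range(1, len(a) + 1):
         cur = [0] * (len(b) + 1)
         for j in range(1, len(b) + 1):
             if a[i - 1] == b[j - 1]:
                 cur[j] = prev[j - 1] + 1
                 best = max(best, cur[j])
         prev = cur
     return best

 def verdict(article, source, block=20):
     run = longest_common_run(words(article), words(source))
     if run >= 2 * block:
         return "violation likely (red)"
     if run >= block:
         return "possible violation (yellow)"
     return "no violation detected (green)"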
mwparserfromhell - one more ref name character not handled properly
Hi Earwig, remember this? One more character to add to the list of problematic characters in parsing "ref name=...": Ampersand.
Any progress here? I'm still very interested in using mwparserfromhell to make a ref-processing bot; if there's anything I can do to help you debug or test, please let me know. Zad68 14:09, 3 May 2013 (UTC)
- Ah, man, sorry I forgot about this. The good news is that I'm almost done with unit tests (this weekend, or at least by the middle of next week), and after that I can focus completely on tags. I've added the two problems you mentioned earlier as issues on GitHub, so I won't forget them this time (30, 31). — Earwig talk 22:39, 3 May 2013 (UTC)
- Excellent, thanks. I'm currently doing GA reviews or offering source review tables using mwparserfromhell, but I have to do a lot of hand-massaging of the source Wikicode to make it parse. As soon as the parser works reliably I plan to offer it as a public tool on the Toolserver. One more to look at: parsing <div class="references-small"> fails as well, same reason I guess – because the attribute value has a hyphen in it, so that issue is not just limited to ref tag names. Thanks... Zad68 20:41, 6 May 2013 (UTC)
- Yep, that's expected. I'll get on it soon. — Earwig talk 20:42, 6 May 2013 (UTC)
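(A minimal way to probe how a given mwparserfromhell build handles these inputs – illustrative only; at the time of this thread, tag/attribute parsing was still incomplete, which is the whole point.)
 import mwparserfromhell

 samples = [
     '<ref name=Smith&Jones>text</ref>',          # ampersand in an unquoted name
     '<div class="references-small">refs</div>',  # hyphen in an attribute value
 ]
 for s in samples:
     tags = mwparserfromhell.parse(s).filter_tags()
     # An empty list means the parser fell back to treating the markup as text.
     print(repr(s), "->", [str(t.tag) for t in tags])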