Wikipedia:Bots/Requests for approval/mmbot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Request Expired.
Operator: Mmovchin (talk · contribs)
Time filed: 23:42, Monday February 20, 2012 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: Standard pywikipedia
Function overview: Adds missing reference sections (<references />)
Links to relevant discussions (where appropriate): Xqbot Mobius_Bot UsbBot Prombot
Edit period(s): Every 10 minutes
Estimated number of pages affected: 2-5 per edit
Exclusion compliant (Y/N): No, because every site where references are given needs a <references />-tag.
Yes, implemented.
Already has a bot flag (Y/N): no
Function details: Just adds missing reference sections (<references />). It gets its pages from WP:NOREFLIST. Examples: 1, 2, 3, 4.
Discussion
- Comment - The requested functionality duplicates that of User:Xqbot, which already does a fine job with only one or two passes per day. -- WikHead (talk) 00:33, 21 February 2012 (UTC)
- My personal view is that duplicated functionality is a good thing. Josh Parris 04:10, 21 February 2012 (UTC)
It seems that {{reflist}} is the preferred option nowadays; is it possible to use that instead? Josh Parris 04:10, 21 February 2012 (UTC)
- @Josh Parris: Yes, of course. This is not a problem, see examples 5 and 6.--mmovchin Talk 10:55, 21 February 2012 (UTC)
- Also, please don't edit via bot account in mainspace until BAG gives you a trial. — HELLKNOWZ ▎TALK 10:59, 21 February 2012 (UTC)
- What happens if the page already contains a "References" or similarly named section?
- How does the bot determine where to insert the section if needed?
- Given references tag is often removed by vandalism (or occasionally error), how long does the bot wait before performing the edit on the article? How long for IPs and editors with very few edits?
- Exclusion compliant: "No, because every site where references are given needs a <references />-tag." -- that is not a reason to make the bot non-exclusion compliant. Every approved bot task has consensus, therefore everyone could argue that there is no reason the bot should be disallowed. What if a certain page used a syntax you do not anticipate and cannot easily fix, therefore requiring the page to be blacklisted, for example? — HELLKNOWZ ▎TALK 10:58, 21 February 2012 (UTC)
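As a rough illustration of the exclusion-compliance point above, a bot can check a page for the {{nobots}} and {{bots}} templates before editing. The sketch below is a simplified reading of that convention, not the check shipped with any framework; the function name `bot_may_edit` is hypothetical and it only looks at the first {{bots}} template on the page.

```python
import re


def bot_may_edit(wikitext: str, bot_name: str) -> bool:
    """Simplified {{bots}}/{{nobots}} check (illustrative sketch only)."""
    # {{nobots}} excludes every bot.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return False

    match = re.search(r"\{\{\s*bots\s*\|([^}]*)\}\}", wikitext, re.IGNORECASE)
    if not match:
        return True  # no exclusion template present

    # Parse parameters such as allow=A,B or deny=all.
    params = {}
    for part in match.group(1).split("|"):
        key, _, value = part.partition("=")
        params[key.strip().lower()] = [v.strip().lower() for v in value.split(",")]

    name = bot_name.lower()
    if "allow" in params:
        allowed = params["allow"]
        return "all" in allowed or name in allowed
    if "deny" in params:
        denied = params["deny"]
        if "none" in denied:
            return True
        return not ("all" in denied or name in denied)
    return True
```

With such a check in place, a page that uses unanticipated syntax can be excluded by tagging it with {{bots|deny=mmbot}} rather than by changing the bot's code.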
mmovchin, have you read and understood WP:BOTPOL? Josh Parris 11:16, 21 February 2012 (UTC)
Some people might say that with less than 400 edits under your belt, you may not be ready to operate a bot. How would you respond to that? Josh Parris 11:16, 21 February 2012 (UTC)
- He's got 3298 on dewiki. — HELLKNOWZ ▎TALK 11:17, 21 February 2012 (UTC)
- And he's a Huggle developer, and has a computer science background. Scrub that. Josh Parris 11:40, 21 February 2012 (UTC)
- HELLKNOWZ:
- 1) When an article already contains a section named "References", "Footnotes" or "Notes", it checks whether there is already a references-tag or any reflist-template. If not, it simply adds one (see example 5).
- 2) References sections are usually placed before further reading / external link sections. For example, on this wiki, the script would place the "References" section in front of the "Further reading" section, if that existed. Otherwise, it would try to put it in front of the "External links" section, or if that fails, the "See also" section, etc.
- 3) If a references-tag or reflist-template gets removed while there are still references given in the article, it simply adds the reflist again. It checks WP:NOREFLIST every 10 minutes.
- 4) I have implemented this possibility now.
- @Josh: Yes, I have read and understood WP:BOTPOL.
- @Josh and HELLKNOWZ: See here. I have 3298 edits on dewiki and 359 edits on enwiki. Or do you want to see some of my programming skills?--mmovchin Talk 11:51, 21 February 2012 (UTC)
- I don't have any issues with that. — HELLKNOWZ ▎TALK 11:59, 21 February 2012 (UTC)
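The placement order described in point 2 of the operator's reply can be sketched as follows. This is a minimal illustration, not the bot's actual code: the function name is hypothetical, only level-2 headings are considered, and appending at the very end when none of the anchor sections exists is an assumption (a real page usually ends with categories and similar markup that should stay last).

```python
import re

# Preference order taken from the reply above.
PLACE_BEFORE = ["Further reading", "External links", "See also"]


def insert_references_section(wikitext: str) -> str:
    """Insert a References section before the first preferred anchor heading."""
    new_section = "==References==\n{{Reflist}}\n\n"
    for heading in PLACE_BEFORE:
        pattern = re.compile(
            r"^==\s*" + re.escape(heading) + r"\s*==\s*$",
            re.IGNORECASE | re.MULTILINE,
        )
        match = pattern.search(wikitext)
        if match:
            start = match.start()
            return wikitext[:start] + new_section + wikitext[start:]
    # Assumed fallback: append at the end of the page text.
    return wikitext.rstrip("\n") + "\n\n==References==\n{{Reflist}}\n"
```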
1) What about any other match, like "References and external links" [1]? Those three are definitely not the only possible ones. "References and notes" is often used.
1.A) What are the markup syntaxes and templates the bot considers to be references? We have quite a few last I checked.
2) And when it fails to find any of those hardcoded section names?
3) That's not quite what I meant. Example: IP comes and vandalizes the page, removing {{reflist}}. 10 seconds later the bot is running a scheduled run. Bot restores the {{reflist}}, but doesn't account for vandalism. Editors see bot edit and don't verify if previous version had vandalism. This happens a lot, more so with bots that fix markup issues that are caused by vandalism and such. — HELLKNOWZ ▎TALK 11:59, 21 February 2012 (UTC)
- 1) Thanks, I added "References and notes" and "References and external links". The bot is only able to detect such sections if it knows how they could be named.
- 1A) The following templates, plus the references-tag:
'en': [u'Reflist', u'Refs', u'FootnotesSmall', u'Reference', u'Ref-list', u'Reference list', u'References-small', u'Reflink', u'Footnotes', u'FootnotesSmall']
- 2) When it finds neither a reference section nor one of the tags and templates in 1A, it creates its own section named "References" containing the Reflist-template.
- 3) I could make the bot wait until 10 minutes after the last edit to an article.--mmovchin Talk 12:39, 21 February 2012 (UTC)
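The detection described in 1A can be illustrated with a small sketch that treats a page as needing a reference list when it contains <ref> tags but neither a references-tag nor one of the listed templates. The template names come from the reply above; the function names and the regular expressions are assumptions made for illustration, and the patterns do not handle comments, nowiki blocks or template redirects.

```python
import re

# Template names quoted in the reply above (duplicate entry dropped).
REFLIST_TEMPLATES = [
    "Reflist", "Refs", "FootnotesSmall", "Reference", "Ref-list",
    "Reference list", "References-small", "Reflink", "Footnotes",
]


def has_reference_list(wikitext: str) -> bool:
    """True if a <references> tag or a known reference-list template is present."""
    if re.search(r"<\s*references\b[^>]*>", wikitext, re.IGNORECASE):
        return True
    for name in REFLIST_TEMPLATES:
        first, rest = name[0], name[1:]
        # Template names are case-insensitive in the first letter only;
        # spaces and underscores are interchangeable.
        body = "[ _]+".join(re.escape(part) for part in rest.split(" "))
        pattern = r"\{\{\s*[" + first.upper() + first.lower() + "]" + body + r"\s*(\||\}\})"
        if re.search(pattern, wikitext):
            return True
    return False


def needs_reference_list(wikitext: str) -> bool:
    """True if the page uses <ref> tags but has nothing to display them."""
    uses_refs = re.search(r"<\s*ref\b[^>/]*>", wikitext, re.IGNORECASE) is not None
    return uses_refs and not has_reference_list(wikitext)
```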
- 1) Those are not the only examples though. There are lots of others, like "Works cited", "Bibliography", "Published works", etc. I think you should take care to detect those words mentioned in any sections. I know the chances are low and I'm sorta nitpicking, but we need to account for false positives since this is an automated process. For example, it could be "References, notes".
- 1A) Does it account for {{Refbegin}}/{{Refend}} or {{Notelist}}?
- 1B) What happens when there are several sections, like both "References" and "Citations"?
- 3) At least. — HELLKNOWZ ▎TALK 20:54, 21 February 2012 (UTC)
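The waiting period discussed in point 3 could look roughly like the sketch below, which refuses to edit until the page has been quiet for a minimum time. The 10-minute figure is the one offered in the discussion; the longer delay for IPs and very new editors, the constant names and the function name are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_QUIET = timedelta(minutes=10)  # figure offered in the discussion
UNTRUSTED_QUIET = timedelta(hours=1)   # hypothetical value, not from the discussion


def safe_to_edit(last_edit: datetime, now: datetime,
                 last_editor_is_new: bool = False) -> bool:
    """True once the page has been untouched for the required quiet period."""
    required = UNTRUSTED_QUIET if last_editor_is_new else DEFAULT_QUIET
    return now - last_edit >= required
```

A delay of this kind gives patrollers a chance to revert vandalism first, so the bot does not stack a tidy-looking edit on top of a damaged revision.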
It may be valuable to look at the source code for AWB's module AddMissingReflist. Josh Parris 21:21, 21 February 2012 (UTC)
How are things going? Josh Parris 21:53, 25 February 2012 (UTC)
- Oh, sorry. This page hasn't appeared on my watchlist.
- 1) If the bot simply checks keywords, there will be many false insertions. I think we do not have to place the reference list in sections such as "Works cited", "Bibliography", "Published works", etc. If references are given in an article but there is no reference list, the bot adds a section with such a list. So we need to add every name such a section could have; otherwise the bot adds its own.
- 1A) Yes.
- 1B) Also yes, it does. "Citations" is not a section in which to place references. "References" is preferred, followed by "Footnotes" and "Notes".--mmovchin Talk 13:30, 26 February 2012 (UTC)
- 1) But that's the problem. The bot could be inserting a "References" section when there already is a section for references but it's named (slightly) differently. So if there is "Referenced material" the bot would insert another "References" section. If the bot had a blacklist of words when to leave the article for manual checking, it would not make that edit, because "reference" would trigger a skip. I'm afraid references are a touchy subject on English Wikipedia (notably, WP:CITEVAR) and bots should exercise care when making assumptions about what others may or may not have done. AWB can make mistakes, automated bots shouldn't.
- 1B) I asked " What happens when there are several sections, like both "References" and "Citations"?"; you replied "Also yes, it does." Sorry, but I'm not sure what you meant. This is a typical case to avoid editing by bot and leaving it for manual checking.
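The skip-list idea raised in this exchange can be sketched as a three-way decision: reuse a single recognised section, create a new one, or leave the page for manual checking. The recognised headings are those named earlier in the discussion; the trigger words, the function names and the return convention are assumptions chosen for illustration.

```python
import re

# Headings the operator said the bot recognises.
KNOWN_HEADINGS = {
    "references", "footnotes", "notes",
    "references and notes", "references and external links",
}
# Hypothetical trigger words: any other heading containing one is left alone.
SKIP_WORDS = ("reference", "citation", "footnote", "note",
              "bibliography", "works cited", "source")


def section_headings(wikitext: str) -> list:
    """Return the text of every == heading == on the page."""
    pattern = r"^==+\s*(.*?)\s*==+\s*$"
    return [m.group(1).strip() for m in re.finditer(pattern, wikitext, re.MULTILINE)]


def decide(wikitext: str) -> tuple:
    """Return ('existing', heading), ('create', None) or ('skip', None)."""
    known, suspicious = [], []
    for heading in section_headings(wikitext):
        lowered = heading.lower()
        if lowered in KNOWN_HEADINGS:
            known.append(heading)
        elif any(word in lowered for word in SKIP_WORDS):
            suspicious.append(heading)
    # Ambiguous layouts are left for a human editor.
    if suspicious or len(known) > 1:
        return ("skip", None)
    if known:
        return ("existing", known[0])
    return ("create", None)
```

Under these assumptions a page with both "References" and "Citations", or with "Referenced material", is skipped rather than edited, which trades some coverage for fewer false positives.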
Trial
Anyway, Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's see how it runs. — HELLKNOWZ ▎TALK 13:47, 26 February 2012 (UTC)
- Oh, I answered 1A) two times. I've corrected it now.--mmovchin Talk 13:53, 26 February 2012 (UTC)
- Okay, so what about "Reference works" and "Reference list" in the same article? My point is that no matter how many section headlines you hardcode, I'll make up a few new ones and the bot would in principle have false positives. :) Which is a good reason to avoid editing articles you are not 99% sure will be correct. — HELLKNOWZ ▎TALK 13:56, 26 February 2012 (UTC)
- Is it better to analyze whether a section-name starts with "Reference"? But let's see how it works. There were some problems with the crontab on toolserver but now everything should work correctly. I will stop the cron when 50 edits are made.--mmovchin Talk 15:03, 26 February 2012 (UTC)
Due to lack of time, it is not possible for me to solve the problem before 02/05/2012, because I am not at home this week. Once I'm home, I will fix the problem on the toolserver.--mmovchin Talk 22:46, 27 February 2012 (UTC)
- I'm going to presume you mean 2012-03-05, next Monday. Noted. Josh Parris 00:13, 28 February 2012 (UTC)
{{OperatorAssistanceNeeded|D}}
Any news? MBisanz talk 02:45, 14 March 2012 (UTC)
- Sorry, but this request is gonna have to be marked as expired. The 50 edit trial has not been completed (only 20 edits completed) and the bot stopped editing a month ago. No one has replied to the OperatorAssistanceNeeded tag above; the case seems to have gone cold. Now if I was BAG... Rcsprinter (rap) 16:33, 28 March 2012 (UTC)
Request Expired. for now. Feel free to reopen if you get back. — HELLKNOWZ ▎TALK 16:40, 28 March 2012 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.