Wikipedia:Bots/Requests for approval/GreenC bot 16
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: GreenC
Time filed: 02:53, Monday, May 6, 2019 (UTC)
Automatic, Supervised, or Manual: supervised
Programming language(s): GNU Awk and WP:BOTWIKIAWK libraries
Source code available: Yes
Function overview: A system to request 1 to 20 new {{Cleanup bare URLs}} tags on demand, provided the tracking category has 5 or fewer members.
Links to relevant discussions (where appropriate):
- Wikipedia:Bot_requests (main discussion)
- Template talk:Cleanup bare URLs
Edit period(s): On-demand
Estimated number of pages affected: 1-20 per request
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details: The bot control page User:GreenC bot/Job 16/bot documents how it works. The control page will be restricted to extended confirmed users and moved to Template:Cleanup bare URLs/bot when approved.
Discussion
@Steel1943, Derek R Bullamore, MarnetteD, and Meatsgains: Notification of BRFA. -- GreenC 02:56, 6 May 2019 (UTC)
- Looks good GreenC. I particularly like limiting it to 20 articles. That will be a big help in not overwhelming the category and those of us who work on it. One question: is the bot designed to detect only bare urls in a ref tag? For instance, when reflinks adds a dead link tag, as it has done here, the url is still bare but there is a dl template (and a set of single brackets) within the ref tag. I guess my question is: will your bot re-add the bare url template the next time it examines this article? If this question is confusing, my apologies; I can try to reword it if necessary. Thanks for your work on this. — Preceding unsigned comment added by MarnetteD (talk • contribs)
- MarnetteD, right now it's set up to find bare links anywhere, not limited to inside refs. Should it limit checking to references only? My understanding of a bare link did not include links inside square brackets. Should they also be included, and if so, only those that have no title? Also, in regards to {{dead link}}, should it ignore bare links that are tagged as dead? I've never worked on fixing bare links so this is all new. -- GreenC 21:56, 6 May 2019 (UTC)
- Hmmm, bare links anywhere is going to be interesting. It will catch pages where external links (those that have no ref tags at all) have been added to the body of the article. That will be helpful. Along with the marked dead link refs that I have mentioned, there are citations (mostly in science articles) that are only in brackets ([]); they have a bare URL along with all sorts of specific info as to where they came from. These are allowed per Wikipedia:Citing sources#Links and ID numbers. Until I can find a better example there are several here: Warhammer 40,000: Dawn of War III#References. At times they contain more info than can be placed in a cite template. At the present time I just remove the tag and leave an edit summary like this. If the bot continually re-adds a bare url tag to that kind of page that will slow things down a bit. If there is any way to limit the bot to tagging only bare urls, whether in ref tags or out, that will be great. My apologies if this winds up making things more difficult for you. Also, this is coming solely from my experience in working with bare urls. Others that work with them may have different reactions and thoughts. Regards. MarnetteD|Talk 22:28, 6 May 2019 (UTC)
- MarnetteD, the bot has a list of all articles (~5.7 million) and works through them start to end. Then it generates a new list and starts over. To reach the end could take a long time since it doesn't have to search long to find 20 bare link articles. To keep the bot off a page one could use {{bots|deny=tagbot}}, or modify the citations to be more standard. The only way to avoid bare links with nearby cite data is to skip all references that contain bare links plus any other text, which can be done, but it will mean false negatives. Maybe that is for the best. -- GreenC 00:01, 7 May 2019 (UTC)
- Thanks for the explanation GreenC. Between the time frame you mention and the ability to use the bots deny template I think that addresses my concerns. I hope you get other responses since, no matter how much editing one does, none of us have seen everything that can happen around here :-) Thanks again for taking the time to explain. MarnetteD|Talk 00:59, 7 May 2019 (UTC)
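The rule GreenC proposes above (skip any reference that holds a bare link plus other text) and the {{bots|deny=tagbot}} opt-out can be sketched roughly as follows. This is only an illustration in Python, not the bot's GNU Awk code: the regular expressions and the function names `has_bare_ref` and `is_denied` are assumptions made up for the example, and the sketch looks only inside ref tags. One side effect of the "URL and nothing else" test is that a bare URL followed by a {{dead link}} tag is also skipped; the discussion does not say whether the real bot behaves that way.

```python
import re

# Hypothetical patterns for illustration; the bot's real rules are not shown here.
URL = r"https?://[^\s<>\[\]|{}]+"
REF_RE = re.compile(r"<ref\b[^>/]*>(.*?)</ref>", re.IGNORECASE | re.DOTALL)
BARE_ONLY_RE = re.compile(rf"\s*{URL}\s*")
DENY_RE = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^|}]*)", re.IGNORECASE)


def has_bare_ref(wikitext: str) -> bool:
    """Return True if at least one <ref> contains a URL and nothing else."""
    return any(BARE_ONLY_RE.fullmatch(m.group(1)) for m in REF_RE.finditer(wikitext))


def is_denied(wikitext: str, bot_name: str = "tagbot") -> bool:
    """Return True if a {{bots|deny=...}} list names this bot (or 'all')."""
    m = DENY_RE.search(wikitext)
    if not m:
        return False
    names = {n.strip().lower() for n in m.group(1).split(",")}
    return "all" in names or bot_name.lower() in names


if __name__ == "__main__":
    page = "Text.<ref>http://example.com/page</ref>"
    print(has_bare_ref(page), is_denied(page))
```

A reference such as `<ref>Smith 2001, p. 5. http://example.com</ref>` is not flagged by this test, which is the kind of false negative mentioned in the comment above.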
- Approved for trial (88 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. (4 runs) — xaosflux Talk 01:59, 8 May 2019 (UTC)
- Thanks. I need to code the above changes before it is ready to test. -- GreenC 02:51, 8 May 2019 (UTC)
@Steel1943, Derek R Bullamore, MarnetteD, Meatsgains, and DannyS712: the bot is ready. For testing purposes, set User:GreenC bot/Job 16/bot to RUN and it will tag 2 pages (normally it will be more like 20). It will work with up to 50 pages in the tracking category (normally fewer). After 80 pages have been tagged it will stop working (or if there are more than 50 in the tracking category). -- GreenC 19:57, 8 May 2019 (UTC)
- GreenC, it tagged about a dozen (guessing) and all seemed to go okay. One little fiddly request: can the bot be designed to add the date when it places the tag? As you can see here, AnomieBOT comes along and adds it. Every so often I will fix the refs, then get an edit conflict because the date has been added to the tag, and then I will have to fix them again. Now this rarely happens, so if it is a hassle for you don't worry about it. I'll let you know if anything odd happens in future runs. MarnetteD|Talk 21:32, 8 May 2019 (UTC)
- MarnetteD, I will leave it to you and others to make the trial edits. You are in command of the bot! I don't know if it makes a bad tag, so I depend on your feedback. During the trial period, each "RUN" request you make will tag 2 pages (not 20) to avoid any catastrophic problems. But I can increase it if you want. When it reaches the trial limit of about 80 edits total, it will stop working and we can let xaosflux know the trial is complete. Forgot the date, now added, thanks. -- GreenC 00:41, 9 May 2019 (UTC)
- @MarnetteD:, the bot has a new feature, you can now specify how many pages to tag from 1-20. The previous restriction of 2 per request is lifted since you can now set the number. It will default to 5 if no number is given. The docs are updated, issue "purge" if not visible. -- GreenC 14:14, 9 May 2019 (UTC)
- Thanks for the update GreenC. That is an excellent addition. I also appreciate the fact that it won't run if there are more than five articles already in the category. All of your work and refinements are much appreciated. MarnetteD|Talk 15:13, 9 May 2019 (UTC)
- @MarnetteD: No problem. This has been a fun bot! Glad it is of use. (the last run of "5" was slow to respond as there was a script typo) -- GreenC 16:28, 9 May 2019 (UTC)
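The request rules described in this thread (1 to 20 pages per request, a default of 5 when no number is given, and no run at all when the tracking category already holds more than 5 articles) can be summarised in a short sketch. It is written in Python for illustration and is not the bot's GNU Awk code. The function name `pages_to_tag` is hypothetical, and clamping an out-of-range number into 1-20 is an assumption, since the discussion does not say how the bot treats such requests.

```python
MIN_PAGES, MAX_PAGES, DEFAULT_PAGES = 1, 20, 5
MAX_CATEGORY_MEMBERS = 5  # the bot refuses to run above this backlog


def pages_to_tag(requested, category_size):
    """Return how many pages one request should tag, or 0 if the bot should not run.

    requested: the number given by the requester, or None if omitted.
    category_size: current number of articles in the tracking category.
    """
    if category_size > MAX_CATEGORY_MEMBERS:
        return 0  # backlog not yet cleared, so do nothing
    if requested is None:
        return DEFAULT_PAGES
    # Assumption: out-of-range requests are clamped rather than rejected.
    return max(MIN_PAGES, min(MAX_PAGES, requested))


if __name__ == "__main__":
    print(pages_to_tag(None, 0))
    print(pages_to_tag(10, 6))
```

Keeping the category-size check first means a request can never add to a backlog that editors have not yet cleared, which is the property MarnetteD highlights above.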
Trial complete. - MarnetteD has been using the bot, making requests of around 5 pages each time. It has now reached the trial limit of about 88 edits; each was checked. MarnetteD, did you see any problems? -- GreenC 23:47, 11 May 2019 (UTC)
- I did not encounter any problems. The bot seems to have worked as designed and has been a help in finding articles with bare urls. MarnetteD|Talk 01:09, 12 May 2019 (UTC)
- @Xaosflux:, I think MarnetteD would like to continue using this tool as it is working well; he was using it daily. But it cannot be run because of the trial limit. Any chance we can expedite the BRFA, or set a high trial limit so it can continue to work for a while? -- GreenC 15:14, 13 May 2019 (UTC)
- @GreenC: so, there are a few tweaks I'd like to discuss having seen the run:
- Here is a sample edit made by the bot.
- You seem to be adding a parameter to Template:Cleanup bare URLs on the pages that isn't valid for the template, and at the least isn't documented in that template's documentation, so perhaps updating that parameter and including it in the template doc to better reflect what is going on would help (note the user isn't "a bot", so maybe something like requestedby=?).
- In the edit summary, (Add {{Cleanup bare URLs}} (User:MarnetteD via tagbot)), can you add a little more, perhaps "Triggered by User:..." or "Initiated by User:...", to make it clear that that person isn't the bot operator?
- — xaosflux Talk 16:05, 13 May 2019 (UTC)
- @Xaosflux: Not sure why the sample edit isn't a valid use of the tag? Footnotes #1-4 and #8 are bare urls according to the template definition. Yes, will change from 'bot=' to 'requester=' or something. And will add a 'requested/initiated by' notice in the edit summary. -- GreenC 16:52, 13 May 2019 (UTC)
- Actually I misread, sorry. I understand what you're saying; yes, agree the template documentation should be updated to include the |bot= or whatever it will be called. -- GreenC 16:57, 13 May 2019 (UTC)
- Approved for extended trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. This is for you to make final adjustments and tests. — xaosflux Talk 17:19, 13 May 2019 (UTC)
- Process-wise question: as far as "normal operations" go, is the workflow designed to be "Editor A" asks for this run, it runs, then "Editor A" goes and cleans things up? Because in that case, I can't see why the bot would actually need to edit the articles when it could just update a single page like User:GreenC bot/Job 16/TODO. — xaosflux Talk 16:05, 13 May 2019 (UTC)
- This is what was requested by the editors during the initial discussion. Since they are using the tracking category to organize workflow it didn't make sense to have a separate page/list. It integrates into the existing system rather than creating new systems to track and maintain. There are encouragements to keep the tracking category cleared through other systems (this and this), and the bot encourages re-filling when empty, to keep going - part of the overall system based around the tag and tracking category. Someone might make a request for 10 and only fix 5 because one took a long time, then leave for dinner, then someone else arrives and can fix the remainder since it's listed in the tracking category, which is what everyone monitors. It is partly a nudge theory system as it nudges editors to keep the tracking category cleared. There is also the satisfaction of clearing the tag, which is a further encouragement to work. There wouldn't be much danger of tags going unfixed as the bot will not run if there are more than 5 unfixed. -- GreenC 16:52, 13 May 2019 (UTC)
- Seems fine. — xaosflux Talk 17:19, 13 May 2019 (UTC)
- I only just returned to the 'pedia after a busy morning. @Xaosflux:, GreenC is correct about how the category is cleared by those of us who work on cleaning up bare urls. At the moment I am the only one using the bot to add articles to the category but that may change as we progress. I know I would like to have the bot continue to add them to the existing category rather than have to check a different one. On some of my runs there would be bot-tagged articles combined with editor-tagged ones and it was convenient to fix both kinds at the same time. Thanks to you both for your efforts in this. MarnetteD|Talk 18:25, 13 May 2019 (UTC)
- @MarnetteD: thanks for the note, this aspect is fine. There are a few fairly minor tweaks for the operator above and then this will be ready to go live. — xaosflux Talk 19:23, 13 May 2019 (UTC)
- @Xaosflux: How does this look? -- GreenC 19:46, 13 May 2019 (UTC)
- @GreenC: "operator" has special meaning (in this case it means YOU, the bot operator) so maybe "requester"? As for "wp:tagbot", that doesn't go anywhere due to casing, but WP:TAGBOT would work (or make another redirect if you love lower case :D ). Finally, Template:Cleanup bare URLs/doc at the least should be updated so someone unfamiliar with why these parameters are showing in the template source code can figure out what they mean. — xaosflux Talk 20:09, 13 May 2019 (UTC)
- "requester" will work. Example. Another possibility is "proxy", which means acting on someone's behalf, but it also might be unnecessarily obscure compared to requester. Looks like lower-case wp:tagbot works automatically. I added some docs about the bot in the template documentation. -- GreenC 00:50, 14 May 2019 (UTC)
- Umm, it's a red link right above there - so it must be low on mana... — xaosflux Talk 03:41, 15 May 2019 (UTC)
- @Xaosflux: Ah, didn't notice; it works when typed into the search box. Figured that was the way to access it since it won't render as a template arg. Can make a redirect but will leave it red so you can check out the strange discrepancy in the search vs render engines. -- GreenC 14:22, 15 May 2019 (UTC)
- I created the redirect at Wikipedia:Tagbot; when you move the page, please update both that redirect and the one at Wikipedia:TAGBOT. — xaosflux Talk 03:28, 17 May 2019 (UTC)
- "Requester" looks good. Go ahead and move your pages, get the protection set up (feel free to ping me on the new page talks), update the redirects, and finalize that part of the code, do 20 to 100 edits, then ping me here for final approval and close out of this when you get the time. — xaosflux Talk 03:30, 17 May 2019 (UTC)
- Ok, everything is switched over. The bot is tracking the number of edits and will stop working once it reaches 100. Will ping you when it does. -- GreenC 14:04, 17 May 2019 (UTC)
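The pieces agreed in this thread (a dated tag carrying a requester parameter, an edit summary that names the requester rather than the bot operator, and an edit counter that halts the bot at the trial limit) might fit together as in the sketch below. It is a Python illustration only, not the bot's GNU Awk code. The parameter name `requester` and the limit of 100 come from the discussion; the exact tag layout, the "Month Year" date format, the summary wording, and the names `build_tag`, `build_summary`, and `EditBudget` are assumptions.

```python
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def build_tag(requester, year, month):
    """Build a dated tag; the exact parameter layout is assumed."""
    return "{{Cleanup bare URLs|date=%s %d|requester=%s}}" % (
        MONTHS[month - 1], year, requester)


def build_summary(requester):
    """Build an edit summary; the wording is hypothetical, not the bot's real text."""
    return "Add {{Cleanup bare URLs}} (requested by User:%s via [[WP:TAGBOT]])" % requester


class EditBudget:
    """Count edits and refuse further ones once the trial limit is reached."""

    def __init__(self, limit=100):
        self.limit = limit
        self.used = 0

    def try_spend(self):
        if self.used >= self.limit:
            return False  # limit reached, so the bot stops editing
        self.used += 1
        return True


if __name__ == "__main__":
    budget = EditBudget(limit=100)
    if budget.try_spend():
        print(build_tag("MarnetteD", 2019, 5))
        print(build_summary("MarnetteD"))
```

Adding the date at tagging time also avoids the follow-up dating edit that MarnetteD reported causing edit conflicts earlier in the trial.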
- Looking copacetic to me also! That should be a bot name. Thanks for your work clearing all of these URLs and running/testing the bot. -- GreenC 13:39, 25 May 2019 (UTC)
- Approved. — xaosflux Talk 13:55, 25 May 2019 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.