Wikipedia:Bots/Requests for approval/Bot24
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Negative24
Time filed: 04:19, Thursday, December 10, 2015 (UTC)
Automatic, Supervised, or Manual: Supervised
Programming language(s): Python (PWB)
Source code available: [1]
Function overview: (this is only for task 3 at the moment) Renames Mexican TV station callsign redirects and then modifies all links from the old redirect to the new redirect.
Links to relevant discussions (where appropriate): User:Bot24/Tasks/Task 3#Consensus and User talk:Negative24/Archives/2016/January#Bot update?
Edit period(s): Multiple batch one time runs
Estimated number of pages affected: 120 to 200 pages on first batch
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): No
Function details: Script renames a batch of redirects specified in a Python file with tuples corresponding to original (currently existing) redirects and new redirects. Script then makes sure the new redirect points to the target the old redirect pointed to. It then finds all references to the old redirect, finds the links, and then if the link points to the old redirect, it modifies the link to point to the new redirect, keeping the original section, label, and any other arguments (it only targets the title to begin with).
Redirects to move are provided by User:Raymie when the TV stations make the change.
Please refer to the tests done here.
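The link rewriting described under "Function details" can be sketched roughly as follows. This is an illustration only, not the bot's actual source (which is linked above): the names REDIRECTS, LINK_RE, and retarget_links are hypothetical, only the standard library is used, and MediaWiki title normalization (first-letter case, underscores) is ignored for brevity.

```python
import re

# Hypothetical batch format: (old redirect, new redirect) tuples,
# as the function details describe.
REDIRECTS = [("XHFEC-TV", "XHFEC-TDT")]

# Matches [[Title]], [[Title#Section]], [[Title|label]], [[Title#Section|label]].
LINK_RE = re.compile(
    r"\[\[(?P<title>[^\[\]|#]+)"
    r"(?P<section>#[^\[\]|]*)?"
    r"(?P<label>\|[^\[\]]*)?\]\]"
)


def retarget_links(text, old, new):
    """Point links at `old` to `new`, keeping any #section and |label."""
    def swap(match):
        # Only links whose title is exactly the old redirect are touched;
        # categories, files, and links to the target page never match.
        if match.group("title").strip() != old:
            return match.group(0)
        section = match.group("section") or ""
        label = match.group("label") or ""
        return "[[" + new + section + label + "]]"

    return LINK_RE.sub(swap, text)
```

In this sketch a link such as [[Azteca Trece|XHFEC-TV]] is left alone, because its title is the target page rather than the old redirect.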
Discussion
- I'm going to mention a few more specifics of why this is needed. The Mexican digital television transition means that every single television station in Mexico is changing its callsign and will end in -TDT instead of -TV. Most Mexican station pages are redirects, as they do not originate programming and are little more than satellite-fed transmitters. A good example is the page XHFEC-TV, for a TV station in the small town of San Felipe, Baja California. Its incoming links are largely from lists of Mexican TV stations, so all the incoming links can be changed without breaking issues. It redirects to Azteca Trece, which is the network it carries.
- There will be five batches, timed to match the analog shutoffs as decreed by the Federal Telecommunications Institute, on December 11, 16, 17, 22, and 31. The first four will consist of 27, 61, 27‡, and 51 stations, respectively (the third batch contains Mexico City and has many full station articles). The 12/31 batch will include any television station in Mexico known to be on the air in digital and/or bearing the corresponding IFT technical authorization to broadcast in digital; by then there may be additional documents about the 12/31 shutoff date.
- ‡There are actually 28 stations in the third batch, but XHPUE-TDT went digital-only in March despite what the IFT keeps saying. It was moved at the time. Raymie (t • c) 04:49, 10 December 2015 (UTC)[reply]
- So it's not really moving the redirects, right? Just creating new redirects and updating links that refer to the old ones? First part should be fine, but updating links presents a few problem areas:
- You'd need to check to ensure historical mentions aren't changed.
- I'm not sold on updating link titles that are piped (WP:NOTBROKEN comes to mind). I suspect there aren't many of those.
- The bot's also not addressing instances like [[Azteca Trece|XHFEC-TV]].
- Would be a good idea to batch link updates, so that the bot doesn't have to make a few dozen edits to pages like Television stations in Chihuahua.
- Can you give a source to show these callsigns are really changing? I'm having trouble finding one that's clear. — Earwig talk 22:22, 10 December 2015 (UTC)[reply]
- @The Earwig: No, it isn't actually "moving" the redirect. It just makes the new redirect if needed and changes links to the new redirect.
- The bot will only be editing in the main and template namespaces. Any historical mentions such as conversations on talk pages wouldn't be changed. But it would be a problem if the link had to be preserved in article content for historical reasons (e.g. "Jimmy Whales started his first radio station with the callsign 'XHFEC-TV'.") @Raymie: Do you have any examples of this?
- I don't fully understand what you mean but the script doesn't touch (or in Regex, target) any section or pipe/label (it also skips categories and files). It only targets the link title if it has brackets around it in a link syntax.
- Like above, the script wouldn't even see this link since it isn't referencing the old redirect. The script doesn't care about links to the target page, only links to the old redirect.
- The script replaces all link occurrences and then saves the page. So, unless the page has been changed to have more links to the old redirect and the script is run again with the same old redirect, the script should only edit each page once.
- @Raymie: question for you (sources of callsign changes). -24Talk 01:00, 11 December 2015 (UTC)[reply]
- It's kind of hard to quantify, but there are lists of all the stations that are changing for the 11th, 16th, 17th and 22nd. For the early shutoffs: 12/11, 12/16, 12/17, and 12/22. Anything with "Complementaria" is not changing as those are repeaters of other stations that may or may not be located in areas shutting off.
- Also worth noting that in the RPC, the sort-of Mexican version of the FCC Query, the callsigns do change. For instance, XEPM-TDT changed over in July. Searching "XEPM-TV" will not pull up the entry corresponding to XEPM now. Additionally, when new digital stations are authorized (example), their callsigns are listed as ending in -TDT (see page 6). Raymie (t • c) 01:20, 11 December 2015 (UTC)[reply]
- Yes, I'm referring to historical mentions of the old callsigns in articles, as in the example you gave.
- Updating linked titles that are piped – for example, if you have [[XHFEC-TV|XHFEC]] somewhere, the bot would change it to [[XHFEC-TDT|XHFEC]], and I'm not sure that's really necessary, as it has no visual effect on the page.
- The problem is that this leads to an inconsistency that, I think, may require human intervention. Consider how the bot would handle the text "A Mexican TV station, [[XHFEC-TV]], [some other text]. XHFEC-TV is [some more text]". The first mention would be changed, but not the second.
- Based on my understanding of the code, if the bot were to do ten redirect renames (and each one is found in Television stations in Chihuahua), it would make ten separate edits to that page. You should be able to fix this by waiting until the end of the run to edit the pages, keeping track of the necessary changes along the way.
- To restate my main concern: bots that change article text are often prone to issues (see WP:CONTEXTBOT), unless you're manually checking each edit. But I've noticed that nearly all of the uses of these callsigns are in tables or lists, which are predictable. If you can have the bot ensure it only edits inside of lists or tables (or templates), I think we're less likely to have problems. — Earwig talk 01:28, 11 December 2015 (UTC)[reply]
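The batching suggested above (track the necessary changes during the run, then edit each page once at the end) might look something like the sketch below. It is a minimal illustration under stated assumptions: plan_edits and apply_plan are hypothetical names, the backlink data is passed in as a plain dict rather than fetched from the wiki, the callsigns XHAAA and XHBBB are made up, and only bare [[Title]] links are replaced to keep the example short.

```python
from collections import defaultdict


def plan_edits(backlinks, renames):
    """Group renames by page so that each page needs only one save.

    backlinks: dict mapping an old redirect title to the page titles
               that link to it (assumed to be gathered beforehand).
    renames:   list of (old, new) title tuples.
    """
    per_page = defaultdict(list)
    for old, new in renames:
        for page in backlinks.get(old, ()):
            per_page[page].append((old, new))
    return dict(per_page)


def apply_plan(texts, plan):
    """Apply every planned rename to each page's text in a single pass."""
    updated = {}
    for page, pairs in plan.items():
        text = texts[page]
        for old, new in pairs:
            text = text.replace("[[" + old + "]]", "[[" + new + "]]")
        # Only pages whose text actually changed would be saved.
        if text != texts[page]:
            updated[page] = text
    return updated
```

With this shape, ten renames that all appear on one list page produce a single entry in the result, and therefore a single edit.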
- @The Earwig:
- See below.
- Yes, it would change those links and it wouldn't be necessary but it wouldn't hurt (and would make the links somewhat consistent).
- Yes, that is an issue. See below.
- An explanation for that part of the script: Every page that refers to the old redirect is looked through ([2]), it searches through for every link and changes the ones that match the old redirect in the text variable ([3]), then after all the links have been changed, the script saves the page with the new text ([4]). The text variable isn't saved every time, only when the page.save function is called.
- I would be fine with only changing lists, tables, and templates. I could also have it compile a list of pages that still reference the old redirect so that they can be looked at manually. @Raymie: Would this be an acceptable solution? -24Talk 02:36, 11 December 2015 (UTC)[reply]
- Sorry... I think you still don't understand me. If you run the bot to move X distinct redirects that all occur on the same page, it'll make X separate edits to that page over the course of the run. Right? Since a lot of these occurrences tend to be on lists of many redirects, it should be possible to dramatically reduce the number of edits necessary. Also, just to clarify one further point, will it do anything to non-redirects like XHDEH-TV? — Earwig talk 02:55, 11 December 2015 (UTC)[reply]
- Ah, yes. I see what you mean now. Let me check out a few solutions... -24Talk 03:25, 11 December 2015 (UTC)[reply]
- The number of non-redirects is rather low as a share of all of the moves that need to be made. The Mexico City 12/17 batch will have a lot of these, but there aren't as many in the other batches. Should I go ahead and manually move the ones in the 12/11 batch, like XHLEG-TV, XHRCG-TV, etc.? Raymie (t • c) 05:17, 11 December 2015 (UTC)[reply]
- @Raymie: I can have a fix out (restricting to tables and lists and delay saves until the end) by mid-afternoon and if we get approved we still might be able to get the first batch running by evening. -24Talk 18:23, 11 December 2015 (UTC)[reply]
- Alright, that's a good plan. Raymie (t • c) 19:17, 11 December 2015 (UTC)[reply]
- @The Earwig: Would checking for an asterisk at the beginning of the line be sufficient for checking if a link is in a list? To answer your question regarding if the script would act on a non-redirect, no, if a non-redirect page was specified in the redirect file the script would exit with an error ([5]). -24Talk 01:58, 12 December 2015 (UTC)[reply]
- @The Earwig: Or perhaps it would be better to process the links only if the page title starts with "List of". -24Talk 00:08, 13 December 2015 (UTC)[reply]
- Television stations in Chihuahua doesn't, so not sure if that'll work. The asterisk idea seems fine. Keep a log of what it skips, and after the trial we'll review it to make sure everything is working as expected. — Earwig talk 00:10, 13 December 2015 (UTC)[reply]
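The asterisk check and skip log discussed here could be sketched as below. This is an assumption-laden illustration rather than the bot's code: update_list_lines is a hypothetical helper, it handles only bare [[Title]] links, and it treats any line starting with "*" as a bulleted list item.

```python
def update_list_lines(text, old, new):
    """Rewrite [[old]] only on bulleted lines and log the lines skipped.

    Returns (new_text, skipped), where skipped holds the 1-based numbers
    of lines that still link to `old` and need human review.
    """
    old_link = "[[" + old + "]]"
    new_link = "[[" + new + "]]"
    kept, skipped = [], []
    for number, line in enumerate(text.split("\n"), start=1):
        if old_link in line:
            if line.startswith("*"):
                line = line.replace(old_link, new_link)
            else:
                # Running prose may be a historical mention; leave it alone.
                skipped.append(number)
        kept.append(line)
    return "\n".join(kept), skipped
```

The skipped line numbers are what a post-trial review would look at to confirm nothing contextual was changed.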
- Here we go, top of page 13 in the original 2004 Cofetel determination to use ATSC: Con objeto de facilitar la relación entre las estaciones analógicas y sus correspondientes equipos de canales adicionales digitales, se utilizará el mismo distintivo de llamada pero con la terminación “TDT”. (To facilitate the relationship between analog stations and their corresponding digital counterparts, the same callsign will be used with the ending "TDT".) That particular policy was abrogated when a new one was brought in last year, but the practice is the same. Raymie (t • c) 05:43, 13 December 2015 (UTC)[reply]
- Also, none of the 32 state lists begin with "List of", which differs from most other cases (e.g. Television stations in Chihuahua vs. List of television stations in Arizona). Should they be moved as part of this project to facilitate the use of "List of" as a cue to edit once? Raymie (t • c) 05:46, 13 December 2015 (UTC)[reply]
- That sounds reasonable to me (per convention, and since they're lists of television stations, not discussion about television stations in the region), but outside of the scope of this BRFA. The PDF seems to support your point; I've raised a clarification request to be sure. — Earwig talk 06:26, 13 December 2015 (UTC)[reply]
- Wikipedia talk:WikiProject Television Stations is probably closer to what you're looking for in terms of a forum for feedback. WP:TV seems to deal more with TV series. Thanks for dealing with me — I understand this is a strange request, as one person has the technical expertise and the other is trying to coordinate it as part of what honestly is a rather convoluted project. Raymie (t • c) 06:37, 13 December 2015 (UTC)[reply]
- Y'know, you're completely right on that, but I passed on them since they seem fairly inactive and I wouldn't want the post to languish forever. — Earwig talk 06:39, 13 December 2015 (UTC)[reply]
- I made one post about the transition last month and got some feedback from a couple users, which I never saw until now, pretty favorable to my current approach. One did like the idea of creating new redirects instead of the old ones. I'll ping them so they see this: @Mrschimpf: @Mlaffs: Raymie (t • c) 07:03, 13 December 2015 (UTC)[reply]
- You rang? I'll weigh in with a few thoughts. 1) This digital conversion is definitely happening; it's already occurred in many areas of Mexico, similar to what's occurred in both Canada and the U.S. 2) I'd vote in favour of moving the state pages to "List of...", particularly if it makes this task easier. It's always seemed a bit inconsistent to me. 3) The concern about context is absolutely legitimate. I've just been in the process of cleaning up behind some of the changes that have already happened manually over the last few months. There have often been instances - perhaps a third of the total - where the reference is historical. 4) Notwithstanding WP:NOTBROKEN, it's a relatively common practice to change redirect links when dealing with radio and television stations in the U.S. Stations change their call sign and then the old call sign gets reused. I've had to fix literally thousands of links over the years that now point to the wrong place, so editing the redirects up front (and piping where necessary for context) is part of my standard practice when moving articles due to call sign changes. I don't think there's quite this same concern with Mexico, but I think there's still value in making sure the correct call sign is used in list articles, given how much they act as a pure reference source.
- To sum that all up, I think it's absolutely safe and helpful for the bot to create the new redirects, and to make automated updates in list articles (to both pipes and bare links). I'd be a little more concerned with updating regular article references, as they need review.
- As I think I did on that WT:TVS post Raymie mentioned, I'm happy to help out, whether with renaming the lists beforehand or doing the cleanup of anything the bot doesn't manage. Exactly the sort of gnome-like work that's my raison d'être around here. Mlaffs (talk) 14:26, 13 December 2015 (UTC)[reply]
So realistically, this needs to be sorta two-phase instead:
1. The redirects are all created. This involves creating, e.g., CALLSIGN-TDT to point to wherever CALLSIGN-TV is currently redirecting or, if CALLSIGN-TV is not a redirect, moving CALLSIGN-TV (the article) to CALLSIGN-TDT with the resulting auto-generated redirect left behind.
2. A decision is made as to what to do with existing links and/or the text, if anything.
For #1, I see no major problem.
For #2, however, there are likely to be context issues that require human oversight. For example, if an article has text along the lines of
The first of these stations was CALLSIGN-TV
We don't want to change that to
The first of these stations was CALLSIGN-TDT
...as that's factually wrong in-context. Similarly, we need to ensure that
The current callsign is CALLSIGN-TV
eventually is changed to
The current callsign is CALLSIGN-TDT
...or left alone until someone eventually gets around to manually updating them. Either way, pipes will also need to be left alone or updated based on context. So, like Earwig was saying, #2, if anything is even to be automated as far as it goes, needs to be limited to changes directly approved by a human OR a clear set of pages that we know are safe to bulk-update because they have no context (e.g., all instances will predictably be "The current callsign is CALLSIGN-TDT").
Furthermore, we need to ensure the bot only edits any given page, at maximum, once or twice. It can't be making 10-20 edits to a page just to update link text, for example. --slakr\ talk / 07:26, 13 December 2015 (UTC)[reply]
- I should also note that it's okay if the bot doesn't do all of #1; the page moves, for example, can be done manually. The core takeaway is that if a bot is doing something, we want it to be doing it correctly. What it can't do can be filled in by humans where needed. --slakr\ talk / 07:36, 13 December 2015 (UTC)[reply]
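For the first phase, creating the new redirect so that it points wherever the old one does, a minimal sketch follows. It assumes the old page's wikitext is already in hand; REDIRECT_RE and new_redirect_text are hypothetical names, and raising an error for a non-redirect mirrors the behavior the operator describes earlier in this discussion rather than quoting the real script.

```python
import re

# Matches the standard "#REDIRECT [[Target]]" form, case-insensitively.
REDIRECT_RE = re.compile(
    r"^\s*#REDIRECT\s*:?\s*\[\[([^\[\]|]+?)(?:\|[^\[\]]*)?\]\]",
    re.IGNORECASE,
)


def new_redirect_text(old_text):
    """Build wikitext for the new redirect from the old redirect's wikitext."""
    match = REDIRECT_RE.match(old_text)
    if match is None:
        # Full articles need a page move, which is left to a human here.
        raise ValueError("page is not a redirect")
    return "#REDIRECT [[" + match.group(1).strip() + "]]"
```

For example, if XHFEC-TV contains "#REDIRECT [[Azteca Trece]]", the same text would be written to XHFEC-TDT.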
- At this point in time this is what's planned:
- The bot will only make new redirects for the old redirects specified in the file (it will not move any callsign articles)
- The bot will only fix links that are in a wikitable or in a bulleted list (it will not search for 'List of' in the title since that is outside of the scope of this BRFA)
- The bot will compile a list of pages with remaining links to the old redirect
- Let me know if something needs to be added or if any of this is wrong. I'm currently working on getting the parsing of wikitables and lists working. It's trickier than I thought it would be. Thanks, -24Talk 17:55, 13 December 2015 (UTC)[reply]
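The third item in the plan, compiling a list of pages that still link to the old redirect, could be approximated as below. This is a sketch only: pages_with_remaining_links is a hypothetical helper, and the page texts are supplied as a dict instead of being read from the wiki.

```python
import re


def pages_with_remaining_links(texts, old):
    """Return sorted titles of pages whose text still links to `old`.

    texts: dict mapping page title to current wikitext.
    """
    # Matches [[old]], [[old#Section]], and [[old|label]].
    pattern = re.compile(
        r"\[\[\s*" + re.escape(old) + r"\s*(?:[#|][^\[\]]*)?\]\]"
    )
    return sorted(title for title, text in texts.items() if pattern.search(text))
```

Plain-text mentions that are not links are deliberately ignored here, since the plan only covers links.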
- I have moved 31 of the 32 lists to begin with "List of", as it was probably time for that to happen anyway and it will make the bot's work easier. The exception has an edit history reaching back to 2003 and needs a histmerge, which has been requested. I'd also like to ask that we also do the "Channel XX TV stations in Mexico" series as part of this too — basically everything in this template. Raymie (t • c) 03:53, 14 December 2015 (UTC)[reply]
- If we're moving the "Channel XX" series for Mexico, we should move the similar pages for Canada and the U.S. as well. They were all created at the same time to listify categories due to a series of CfDs. Mlaffs (talk) 23:17, 14 December 2015 (UTC)[reply]
- Would it be better to go back to the "List of" qualification for lists? That would make this a bit easier. -24Talk 01:25, 15 December 2015 (UTC)[reply]
- I didn't mean moving the lists, just running the bot on those. You should be able to use the "List of" qualification on the state lists now except for Q. Roo which requires a histmerge. Raymie (t • c) 02:32, 15 December 2015 (UTC)[reply]
- Exactly what I meant. I was asking if the bot should operate on "List of" pages instead of just working on links with an asterisk. -24Talk 04:33, 15 December 2015 (UTC)[reply]
- It should operate on pages beginning with "List of" and "Channel". Raymie (t • c) 05:17, 15 December 2015 (UTC)[reply]
- I have added more robust edit conflict and page creation conflict support and have added support for "List of" and "Channel" pages. For the table support would it be alright to check in between {{Mexico TV station table/top}} or {{Mexico TV station table/top2}} and |}? -24Talk 16:13, 15 December 2015 (UTC)[reply]
- Yes, it should be. I need to rebuild the Televisa networks' station lists but have been waiting until after December 31 to do them. Also, if you do testing, you now need to add the 12/16 batch as well. Raymie (t • c) 20:15, 15 December 2015 (UTC)[reply]
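The two scoping rules agreed on here, page titles beginning with "List of" or "Channel" and table content between the station-table template and the closing |}, can be sketched as follows. The helper names are hypothetical, the template is assumed to be called without parameters, and only bare [[Title]] links are handled.

```python
import re

# Region from {{Mexico TV station table/top}} (or /top2) to the closing "|}".
TABLE_RE = re.compile(
    r"\{\{Mexico TV station table/top2?\}\}.*?^\|\}",
    re.DOTALL | re.MULTILINE,
)


def in_scope_title(title):
    """True for pages the bot may treat as context-free lists."""
    return title.startswith(("List of", "Channel"))


def update_tables(text, old, new):
    """Replace [[old]] with [[new]] only inside station tables."""
    old_link = "[[" + old + "]]"
    new_link = "[[" + new + "]]"

    def fix(match):
        return match.group(0).replace(old_link, new_link)

    return TABLE_RE.sub(fix, text)
```

Links outside the matched table region are untouched, so prose on other pages stays available for manual review.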
- How many edits in the first batch given the change we made to combine edits ("120 to 200 pages" was said above)? I usually don't like to go above 50 for first trials since that makes them extremely tedious to review (and more work to clean up if something goes wrong). — Earwig talk 20:29, 15 December 2015 (UTC)[reply]
- The more we wait, the more edits we need — this is why we want to test as soon as possible. The first four batches will consist of 27, 61, 27, and 51 stations, respectively. I don't know how many edits that adds up to. Batch 1 was needed for 12/11. Batch 2 goes tomorrow, Batch 3 on Thursday, and Batch 4 on 12/22. There's also a batch 5, for 12/31, that will have 99 stations in it. Raymie (t • c) 20:55, 15 December 2015 (UTC)[reply]
- WP:NODEADLINE and all that—it's okay if we miss a couple days. Approved for trial (first batch, up to 100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — Earwig talk 21:25, 15 December 2015 (UTC)[reply]
- Trial complete. Bot24 just completed almost all of the first batch (we left off 8 redirects to make sure we were under the 100 edit limit). 73 edits were made, no errors reported. See Special:Contributions/Bot24. The link log is posted here and says that 6 links were skipped. -24Talk 21:04, 18 December 2015 (UTC)[reply]
- Think there were a few more than that skipped (non-article links), as I just ran through and cleaned up a few. @Raymie:, there are a few considerations I think we missed in this. For one, the network articles sometimes still need the digital channel number to be added. Second, the Channel XX page for the digital channel often has a notation that it's digital channel vs. the analog one, and that can be removed once the call sign changes. Finally, the entry for the station on its analog Channel XX page has to be removed. I've gone through and done that on the redirects created through this trial, but that'll all have to be done going forward. If there's a list kept each time of which stations the bot handles, as there was here, that'd be helpful for that cleanup. Mlaffs (talk) 00:14, 19 December 2015 (UTC)[reply]
- @Mlaffs:, the first concern should be temporary as I plan on doing a rebuild of the remaining network lists after 12/31. I did the Azteca ones first because I had more complete digital information. Televisa networks and Canal Once I have not done yet. As for the last two items, that will have to be done, but the existing lists should make that easier. Raymie (t • c) 00:34, 19 December 2015 (UTC)[reply]
- In fact, I just rebuilt the Canal Once list from the new list style, using much more updated information and all -TDT suffixes. (This leads to a few red links now, but it shouldn't in the long run.) Raymie (t • c) 02:05, 19 December 2015 (UTC)[reply]
- @The Earwig: Was the trial acceptable? -24Talk 00:23, 21 December 2015 (UTC)[reply]
- I will try to take a look tonight. — Earwig talk 00:26, 21 December 2015 (UTC)[reply]
- @The Earwig: Any further progress? (I'm sure you're busy, I'm just reminding you.) Due to time, the next batch we do will probably be very large and will include 387 stations, by my count — 9 stations not done from 12/11, 61 from 12/16, 27 from 12/17, 51 from 12/22 and 240 from 12/31. Raymie (t • c) 18:36, 26 December 2015 (UTC)[reply]
- Don't see any real issues here, and this is supervised, so if problems come up they can be easily dealt with. Approved. — Earwig talk 04:18, 29 December 2015 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.