Wikipedia:Bots/Requests for approval/CircularRedirectsBot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Request Expired.
Operator: Magnus Manske
Time filed: 11:17, Wednesday, November 30, 2022 (UTC)
Function overview: The bot finds pages with links to a redirect page that links back to the original page:
- [[Page A]] links to [[Page B]] which redirects to [[Page A]]
The bot will try to replace the link in question with plain text.
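The condition described above can be sketched as follows, assuming the link table and the redirect table have already been loaded into in-memory dictionaries. The function name `find_circular_links` and the input shapes are assumptions made for illustration; this is not the bot's PHP script, which is linked under "Source code available".

```python
def find_circular_links(links, redirects):
    """Return (page, redirect) pairs where page links to a redirect back to itself.

    links:     hypothetical dict mapping a page title to the set of titles it links to
    redirects: hypothetical dict mapping a redirect title to its target title
    """
    hits = []
    for page, targets in links.items():
        for target in sorted(targets):
            # Circular case: [[Page A]] links to [[Page B]], and B redirects to A.
            if redirects.get(target) == page:
                hits.append((page, target))
    return hits


if __name__ == "__main__":
    links = {"Page A": {"Page B", "Page C"}, "Page C": {"Page A"}}
    redirects = {"Page B": "Page A"}
    print(find_circular_links(links, redirects))
```

Only the direct A to B to A case is covered; redirects that point at a section of the original page would need separate handling.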
Automatic, Supervised, or Manual: Automatic
Source code available: https://bitbucket.org/magnusmanske/magnustools/src/master/scripts/circular_redirects/circular_redirects.php
Links to relevant discussions (where appropriate): Diff from a recent circular redirect discussion
Edit period(s): Daily or weekly
Estimated number of pages affected: There are ~300K pages that have circular redirect links, but only ~10% (rough estimate) have a "simple" case that can be addressed by the bot as it is now. Capabilities to solve more complex cases might be added in the future.
Namespace(s): Main
Adminbot: No
Function details: Example edit, all test edits.
Discussion
- Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT⚡ 11:23, 30 November 2022 (UTC)
- Magnus, please do not run the bot again until it has approval to edit. Primefac (talk) 11:45, 30 November 2022 (UTC)
- Could you please point to a discussion where this is seen as a Good Thing? I seem to recall discussions in the past where circular redirects were usually acceptable as they indicated an {{r with potential}} type situation. Primefac (talk) 11:45, 30 November 2022 (UTC)
- I think that would depend on who you are discussing the matter with. (I'm actually responsible for prompting Magnus about this problem.) I think that circular redirects are worse than useless. For a reader who clicks on one, there is frustration, just as bad as a page self-link. They probably click again, using the servers uselessly. Where the circular redirect is created from a redlink, rather than a stub being created, WP loses a growth point. I do not buy the argument that {{r with potential}} is any sort of substitute for a redlink, in terms of getting articles created.
- Talking to people who've considered the issue solely from a technical point of view, it seems this is an "old chestnut" - no obvious fix. Looking at it socially, there is indeed no fix that does not undo some good-faith edits. But there is a large backlog, now affecting 4% of all articles I believe.
- If the backlog can be cleared, I hope we can move on to a more sensible approach. By that I mean this issue is too large to be referred to Redirects for Discussion in each case. There should be some triage, because some of the redirects created are not that useful, as some of the (red)links introduced are unhelpful. But there has to be an initial clearance. Charles Matthews (talk) 15:57, 30 November 2022 (UTC)
- As a small data point, I'll add that WP:XFDC unlinks circular redirects when you close a RfD as retarget. Legoktm (talk) 22:11, 30 November 2022 (UTC)
- Why isn't it better to leave a redlink than to remove the link completely? Mike Christie (talk - contribs - library) 12:16, 1 December 2022 (UTC)
- A redlink to what? A => B => A, removing link A => B, leaving plain text behind. Magnus Manske (talk) 16:10, 1 December 2022 (UTC)
- I was thinking that since a circular redirect isn't red and hence appears to not require an article to be created, it would be better to make it into a red link. Of course that's nothing to do with the wikitext in the article with a redirect, it's a function of whether there's a page (redirect or not) at the target of the link. The bot would have to delete redirect pages, not edit links, to make this happen, and I understand that is not what this bot is designed to do. Mike Christie (talk - contribs - library) 22:03, 2 December 2022 (UTC)
- For the avoidance of doubt, this bot is not for removing redirects. Charles Matthews (talk) 21:29, 2 December 2022 (UTC)
- What about pages that link to a page which itself links to a sub-section on the original page? ProcrastinatingReader (talk) 21:34, 13 December 2022 (UTC)
- Noting that I've mass rollback'd the test edits, as several of them contained errors (where links contained a pipe, the replacement did not remove the pipe). ProcrastinatingReader (talk) 16:58, 16 December 2022 (UTC)
- @Magnus Manske: I've got a couple of random comments:
- I'm generally opposed to using regex to parse wikitext. It's always tempting, but it's usually more complicated than it appears at first, and I strongly suspect wikitext is theoretically impossible to parse correctly in the general case with a regex. The kinds of errors spotted by ProcrastinatingReader will keep cropping up. This kind of thing should be done by a real wiki parser. I don't know what parsing tools are available in PHP, but Parsoid is always an option.
- I'm not familiar with the history, but it sounds like this is something which has been considered before and rejected. Perhaps a slightly different take would be useful, however. Use the same code to detect when this happens, only on recent edits. Then have the bot drop a note on the talk page of the person who created the cycle: "This recent edit of yours <include link to diff> created a circular redirect. That's not always a problem, but it can be. Please take a look and see if the link you added is correct". Adjust the wording as appropriate. Keep track of how many of those alerts result in the link being removed, and come back with statistics which will tell us if this is actually useful or not. Or perhaps expose some deeper pattern which can be used to filter which cycles are OK and which are not. -- RoySmith (talk) 13:59, 28 January 2023 (UTC)
- This might be helpful as a multi-fold approach, à la the status quo for disambiguation pages:
- Talk page notifications: à la DPL bot 2
- Editing UI notifications: à la phab:T285508
- Link coloring: à la linkclassifier
- Change tags: à la phab:T287549
- Database report: à la 1, 2, 3
- Potentially: automated removal (no counterpart for disambigs)
- Best, EpicPupper (talk) 04:59, 5 February 2023 (UTC)
- Most cases on enwiki are ordinary wikilinks. These seem straightforward to handle with regex, since page titles cannot contain pipes, square brackets, or number signs. For example, the redirect page title Jupiter (planet) can be linked from any name matching \s*[Jj]upiter \(planet\)\s* (the variations contain leading or trailing spaces and/or a lowercase initial letter). Hypothetically, we would remove links from the target (Jupiter) using the following regexes:
\[\[(\s*[Jj]upiter \(planet\)\s*)\]\] → $1
\[\[\s*[Jj]upiter \(planet\)\s*\|(.*?)\]\] → $1
- The tester's mistake was in forgetting to remove the pipe when delinking a piped link.
- I agree that a wikitext parser will be less error-prone, but all of the examples in the test edits could be handled using the above regex. If a circular link remover for such cases is working correctly, then the page User:LaundryPizza03/CircularRedirectTest should become identical to User:LaundryPizza03/CircularRedirectTest/expected result. –LaundryPizza03 (dc̄) 06:56, 2 March 2023 (UTC)
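As a rough illustration of the two substitutions proposed above, the following Python sketch builds both patterns from a redirect title and applies them in turn. It is not the bot's code; the names `build_patterns` and `delink` are hypothetical. It deliberately ignores cases the regexes above do not address, such as underscores used in place of spaces, section links, and links inside templates or nowiki tags.

```python
import re


def build_patterns(redirect_title):
    """Build (plain, piped) patterns for links to redirect_title.

    The first letter is matched in either case and surrounding whitespace
    is allowed, mirroring \\s*[Jj]upiter \\(planet\\)\\s* from the discussion.
    """
    first, rest = redirect_title[0], redirect_title[1:]
    if first.upper() != first.lower():
        head = "[" + re.escape(first.upper()) + re.escape(first.lower()) + "]"
    else:
        head = re.escape(first)
    name = r"\s*" + head + re.escape(rest) + r"\s*"
    plain = re.compile(r"\[\[(" + name + r")\]\]")        # [[Title]]
    piped = re.compile(r"\[\[" + name + r"\|(.*?)\]\]")   # [[Title|label]]
    return plain, piped


def delink(wikitext, redirect_title):
    """Replace links to redirect_title with their visible text."""
    plain, piped = build_patterns(redirect_title)
    # Unpiped link: keep the title text as written.
    wikitext = plain.sub(r"\1", wikitext)
    # Piped link: keep only the label, dropping the target and the pipe.
    return piped.sub(r"\1", wikitext)


if __name__ == "__main__":
    sample = "[[Jupiter (planet)]] and [[jupiter (planet)|the planet]]"
    print(delink(sample, "Jupiter (planet)"))
```

The second substitution is the step that drops the pipe, which is the part the test edits missed. Links with a section anchor, such as [[Jupiter (planet)#Moons]], match neither pattern and are left untouched.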
- A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) Still awaiting comments from the bot owner, before we can begin a second trial. –LaundryPizza03 (dc̄) 01:51, 27 May 2023 (UTC)
Request Expired. BotOp has not posted here since December (and only made 4 edits in total), and thus the discussion has pretty much stalled out. Primefac (talk) 08:50, 7 June 2023 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.