Wikipedia:Bots/Requests for approval/BunnysBot 2

Operator: Bunnypranav (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 12:59, Saturday, November 23, 2024 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Remove userpages from content categories listed at Wikipedia:Database reports/Polluted categories

Links to relevant discussions (where appropriate):

Edit period(s): Manual runs every week or so

Estimated number of pages affected: ~300 per run

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: Removes user pages from content categories (birth year categories, for example) listed in the database report at Wikipedia:Database reports/Polluted categories. I may run my own database updates in my user space using the open-source code. Of course, it cannot be exclusion compliant, as the category should not be used in that namespace at all.
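
As a rough illustration of the selection step (not the operator's actual AWB workflow), the sketch below keeps only pages in the User and User talk namespaces (namespace numbers 2 and 3) whose category is on a hand-approved list of content categories, as described in the discussion below. The `CategoryMember` record, the `pages_to_fix` function and the sample titles are hypothetical names introduced for this example; the real input would come from the database report or a Quarry/PetScan query.

```python
from dataclasses import dataclass

# MediaWiki namespace numbers for "User" and "User talk".
USER_NAMESPACES = {2, 3}


@dataclass(frozen=True)
class CategoryMember:
    """One (page, category) row, e.g. from a database query (hypothetical shape)."""
    title: str
    namespace: int
    category: str


def pages_to_fix(members, content_categories):
    """Return titles of userspace pages that sit in an approved content category.

    `content_categories` is the manually curated list; a category that is
    not on it (for example a project or user category) is never touched.
    """
    # Treat underscores and spaces as equivalent, as MediaWiki titles do.
    approved = {name.replace("_", " ") for name in content_categories}
    seen = set()
    titles = []
    for member in members:
        if member.namespace not in USER_NAMESPACES:
            continue
        if member.category.replace("_", " ") not in approved:
            continue
        if member.title not in seen:
            seen.add(member.title)
            titles.append(member.title)
    return titles
```

Because the approved list is built by hand, a category such as Category:WikiProject tagging bots would be skipped unless someone deliberately added it.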

Discussion

  • Since you say it is automatic, how would you determine whether a category is meant for the mainspace or userspace? If I add Category:WikiProject tagging bots to Special:RandomPage, will it then be removed from the userspace pages with the category? DatGuy Talk Contribs 01:55, 28 November 2024 (UTC)
    @DatGuy While making the lists based on the database report or SQL query, I will only add content categories to them, such as birth/death cats. Userspace pages will then be removed from those cats. Deciding whether a cat is a content cat, and so whether it goes on the list, will be done manually to avoid such errors. ~/Bunnypranav:<ping> 10:20, 28 November 2024 (UTC)
    I have gone through some random content categories in this database report and haven't found any user pages in them. Did you mean that 300 pages would be fixable per week when the database report is updated, or is it up to you when you are running the bot? – DreamRimmer (talk) 15:55, 29 November 2024 (UTC)
    The database report is updated infrequently compared to how often there is something in it to fix. I shall run the SQL query in quarry:query/87967 before every run. 300 is a high-end estimate; I am ready to fix however many pages are available when I do a run. ~/Bunnypranav:<ping> 16:03, 29 November 2024 (UTC)
    An example of how a list is made: this PetScan query shows user and user talk pages in some content cats from the Quarry query. I shall go through these and disable the cats, i.e. [[:Category:XYZ]]. ~/Bunnypranav:<ping> 03:30, 30 November 2024 (UTC)
    Almost all of the pages on this list are subpages, and the DannyS712 bot also disables categories on userspace pages. While it mainly fixes pages that have draft or AfC templates, I am sure it helps with a fair number of pages each month that are part of this report. So, I think a weekly run would work well, as there should be about 40-60 pages for your bot to fix each week. I could be wrong, though. – DreamRimmer (talk) 11:05, 30 November 2024 (UTC)
    I am not fully convinced this is necessary; the PetScan provided shows ten sandboxes, which should have the cats commented out (or placed in {{draft categories}}) but not removed outright. Are there consistently categories that are used on main user pages or user talk pages? Primefac (talk) 16:58, 9 December 2024 (UTC)
  • See this PetScan; it shows 107 results. Unless I missed removing a non-content cat, I think this qualifies for a bot task. I generally see many year-of-birth cats on userpages, and other cats in sandboxes. Clarification: I shall disable all occurrences of such content cats using [[:Category:XXXXX]]. ~/Bunnypranav:<ping> 12:27, 11 December 2024 (UTC)
    The PetScan query you provided is empty. – DreamRimmer (talk) 12:39, 11 December 2024 (UTC)
    Oops, does this work? https://petscan.wmcloud.org/?psid=30328826 ~/Bunnypranav:<ping> 12:45, 11 December 2024 (UTC)
    It does, yes. Still seeing a lot of sandboxes (which can be filtered out once the list is made, I suppose), but I think that filtering out talk pages would also be necessary; for example, Special:Diff/1263439382 would have been an inappropriate removal, as it needed to be piped. The more I see of this, the more issues I'm seeing with context whittling down the values to something that probably should just be a manual AWB task run occasionally (that is not me declining it, and I'm still open to the idea, but I'm seeing fewer and fewer reasons to make this an automated bot process). Primefac (talk) 17:43, 16 December 2024 (UTC)
    In the first PetScan query there were 36 results in one week, and in the next week there were 107 pages, so there are enough pages to address with a weekly run. There is no consistency in the categories, as new users use different categories depending on the topic of their creations, and the majority of these categories are used in user sandboxes or user subpages. Some of these pages are fixed by the DannyS712 bot, but since it only handles pages with AfC templates, most sandbox pages without AfC templates remain unfixed. However, most of the work is manual (querying the database to get content categories, finding userspace pages using those categories, and then fixing them via AWB) and the number of pages is relatively low, so I agree with you that it can be done from the main account using AWB. – DreamRimmer (talk) 18:02, 16 December 2024 (UTC)
    @Primefac I plan to change the cats just like the diff you linked above. I didn't get why sandboxes need to be filtered out. For the sake of simplicity and fewer errors, I plan to disable the categories with a leading colon ([[:Category:XYZ]], as above) on all types of pages, talk or not.
    Also, if talk pages are edited as well, doing it with a bot prevents the New Message notification. At this point I think that a bot task is warranted for the userpages as well. ~/Bunnypranav:<ping> 15:14, 17 December 2024 (UTC)
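
For reference, the [[:Category:XYZ]] conversion discussed in this thread can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the AWB find-and-replace rule the operator would use: the `disable_categories` and `normalize` names are hypothetical, the category is rewritten only if it is on the manually approved list, and any sort key after a pipe is dropped, since in a colon link it would become display text instead. The sketch does not skip categories inside nowiki tags, HTML comments or templates such as {{draft categories}}.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]], but not the
# already-disabled form [[:Category:Name]].
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def normalize(name):
    """Normalize a category name the way MediaWiki titles are compared:
    underscores equal spaces, and the first letter is case-insensitive."""
    name = " ".join(name.replace("_", " ").split())
    return name[:1].upper() + name[1:]


def disable_categories(wikitext, content_categories):
    """Turn approved content categories into plain links with a leading colon."""
    approved = {normalize(name) for name in content_categories}

    def replace(match):
        name = match.group(1)
        if normalize(name) in approved:
            # Assumption: the sort key, if any, is intentionally dropped.
            return f"[[:Category:{name.strip()}]]"
        return match.group(0)  # not a listed content category, leave as-is

    return CATEGORY_LINK.sub(replace, wikitext)
```

Run on a page containing `[[Category:1990 births|Doe]]` with "1990 births" on the approved list, this would leave `[[:Category:1990 births]]` in its place, so the page still shows a link to the category but is no longer a member of it.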