Jump to content

Wikipedia:Request a query/Archive 5

From Wikipedia, the free encyclopedia
Archive 1Archive 3Archive 4Archive 5

I realize PetScan can be used to show links that appear on two pages, but I have a need for it to show such a result sorted by the order in which they appear on the first page. Based on my limited understanding of the database, I don't think the order of the links on a page are tracked. But just in case I'm wrong, can someone show me a query to do this? Or suggest an alternative approach? Thanks. Stefen Towers among the rest! GabGruntwerk 23:49, 22 June 2024 (UTC)

No, that's not possible; Quarry doesn't contain information about the content of the page.
Could you give some additional information about what you want this for? There are several alternative approaches that would work, but I need to know a little more about what you want it for. BilledMammal (talk) 23:53, 22 June 2024 (UTC)
I thought Quarry knows the links on a page, so that's where I was coming from. Anyway, I have a page that shows links sorted by popularity (views, descending) and another page that shows links to articles having old issues. I'd like an intersection of the links between them, in order of popularity (order of appearance on first page). Stefen Towers among the rest! GabGruntwerk 00:03, 23 June 2024 (UTC)
It can see which pages a given page links to (or is linked from), but doesn't have any information on what order it's done in - that would need the page content.
Is this a one-time query, or will you need it repeated? —Cryptic 00:08, 23 June 2024 (UTC)
I'd like for it to be repeatable as the underlying pages will change. I'm fine with having to run it manually. Stefen Towers among the rest! GabGruntwerk 00:10, 23 June 2024 (UTC)
(edit conflict) It does, but it doesn't have information beyond that, such as about the text of the page.
I'm not aware of any tools that can help you with that, but I threw together the information you wanted using a quick and dirty script:
Extended content
  1. Deion Sanders
  2. Ford Explorer
  3. Maurice Lucas
  4. Josh Hamilton
  5. Fort Knox
  6. Diane Sawyer
  7. Aroldis Chapman
  8. Secretariat (film)
  9. UPS Airlines
  10. Joe Torre
  11. Presbyterian Church (USA)
  12. Damaris Phillips
  13. Jim Beam
  14. Louis Brandeis
  15. Jack McCall
  16. My Morning Jacket
  17. Carlton Fisk
  18. David Pajo
  19. Adam Dunn
  20. Kentucky Colonels
  21. Humana
  22. Earl Weaver
  23. Pope Lick Monster
  24. Meriwether Lewis Clark Jr.
  25. John Marshall Harlan
  26. B. Brian Blair
  27. Frank Ramsey (basketball)
  28. Interstate 71
  29. A. J. Foyt IV
  30. Oldham County, Kentucky
  31. Susanne Zenor
  32. Andy Van Slyke
  33. Harvey Fuqua
  34. Dan Uggla
  35. Homer Bailey
  36. Louisville Cardinals
  37. Aristides (horse)
  38. Playa (band)
  39. Greg Page (boxer)
  40. Terry Pendleton
  41. Kentucky Derby Festival
  42. Louisville Metro Police Department
  43. 5th Cavalry Regiment
  44. Taylor Nichols
  45. David Grissom
  46. Valley of the Drums
  47. Ward Hill Lamon
  48. Jefferson C. Davis
  49. Robert Nardelli
  50. Jim Caldwell (American football)
  51. John Cowan
  52. Mildred J. Hill
  53. Johnny Edwards (musician)
  54. Lance Burton
  55. IWA Mid-South
  56. Mickie Knuckles
  57. Run for the Roses (song)
  58. Taeler Hendrix
  59. Sovereign Grace Churches
  60. Fabian Ver
  61. Tori Hall
  62. Larry Collmus
  63. New Grass Revival
  64. Rebel (bourbon)
  65. 2011 Kentucky Derby
  66. Bullitt County, Kentucky
  67. Catherine McCord
  68. Shelley Duncan
  69. Dan Boyle (ice hockey)
  70. History of Louisville, Kentucky
  71. Rudy Rucker
  72. Big Four Bridge
  73. Optimist International
  74. Larnelle Harris
  75. C. J. Mahaney
  76. Thunder Over Louisville
  77. Belle of Louisville
  78. Bertha Palmer
  79. Sports in Louisville, Kentucky
  80. Gary Matthews Jr.
  81. George Devol
  82. John Yarmuth
  83. Travis Stone
  84. June of 44
  85. Ted Washington
  86. Larry Elmore
  87. Parents Involved in Community Schools v. Seattle School District No. 1
  88. Interstate 64 in Kentucky
  89. Corey Patterson
  90. Stith Thompson
  91. Roman Catholic Archdiocese of Louisville
  92. Louisville Zoo
  93. Boyce Watkins
  94. James Speed
  95. Jefferson County Public Schools (Kentucky)
BilledMammal (talk) 00:23, 23 June 2024 (UTC)
I see you want something that can run repeatedly. I don't have time right now to put something together for you, but if Cryptic doesn't come up with something I'll do it sometime in the next couple of weeks - if I don't, feel free to remind me on my talk page. BilledMammal (talk) 00:24, 23 June 2024 (UTC)
Thanks for the list - that's good for a start. Would you mind giving me a few clues on your approach for the script you did? It might snap me into figuring it out. Also, I have thought of using a spreadsheet or text compare software, but was hoping for an on-wiki or otherwise online approach. Stefen Towers among the rest! GabGruntwerk 00:40, 23 June 2024 (UTC)
A few Regex operations to get just the links in order, and then a basic python script that works down the first list and if the item exists on the second outputs it. Unfortunately, nothing online ATM. BilledMammal (talk) 00:49, 23 June 2024 (UTC)
Thanks for the tips! Stefen Towers among the rest! GabGruntwerk 00:50, 23 June 2024 (UTC)
Any pure-sql-against-the-wmf-databases approach would have to start with something similar to either "Make a page in your userspace with redlinks to 1!Tom Cruise, 2!Muhammad Ali, ... 1000!Frank Torre" or "manually edit this stupidly long query that includes all that data" (like how quarry:query/81948 includes the namespace names, but with a thousand items instead of 30). —Cryptic 01:04, 23 June 2024 (UTC)
Indeed, those don't seem like tenable approaches. But this discussion has helped me figure out a solution, not optimal but workable:
  1. Copy popular articles wikitext into a text editor, and break down into a flat list using a macro I constructed with regex and other tricks.
  2. Use PetScan to create a flat list of articles with old issues (this didn't have to be in any particular order).
  3. Insert both flat lists into their own column in a spreadsheet, then find matches of first column entries in the second column, then apply a filter of matches, and voila.
Stefen Towers among the rest! GabGruntwerk 04:18, 23 June 2024 (UTC)

Redirects with Possibilities

Any way for someone to provide me a list from Category:Redirect-Class Green Bay Packers articles of redirects with possibilities ({{R with possibilities}}). Thank you! « Gonzo fan2007 (talk) @ 19:19, 27 June 2024 (UTC)

My first thought was WP:PETSCAN, but it looks like Category:Redirect-Class Green Bay Packers articles is on talk pages, whereas Category:Redirects with possibilities is on article pages. So yeah, will need an SQL query. Will see if I can whip something up. Testing note: 1994 NFC Wild Card playoff game (Detroit–Green Bay) meets the criteria and should appear in the result set. –Novem Linguae (talk) 19:46, 27 June 2024 (UTC)
Got it. Here you go. quarry:query/84446. –Novem Linguae (talk) 19:58, 27 June 2024 (UTC)
Thank you Novem Linguae! « Gonzo fan2007 (talk) @ 21:07, 28 June 2024 (UTC)

First edit

In 2013, I heard that about a quarter of newcomers' first edits were to create a page, and three-quarters were to edit an existing page. (My own first edit was to a Talk: page, which was a distinctly unpopular choice.)

For editors who made their first edit during 2023, is it possible to find out how many of those first edits were page creations vs editing existing pages, and which namespaces they happened in? I imagine a set of numbers like this:

Namespace New page Existing page
Article 5339 24209
Talk 249 3590
User 1934 781
User talk 429 9045
Draft 1930 89

all adding up to 100% of first edits. WhatamIdoing (talk) 06:06, 29 June 2024 (UTC)

Lots of ways for this to go wrong. Consider deleted edits, and that there's no way to reliably tell where a page was when an edit was made to it (unless it's also in the page creation log, but that still means at least half the data is bad). Also no way to tell which of two deleted edits with the same timestamp was actually earlier, but a user making a second edit in the same second as their first is going to be rare enough that we can just pick one. Getting a list of users whose first edit was in 2023 is impractical; a list of users created in 2023 and have made at least one edit is at least close, so that's what I'm doing. quarry:query/84486 should finish in about 15 minutes. It may or may not be right - I haven't seen the results yet - but I've got to get to bed. Will take another look tomorrow. —Cryptic 07:47, 29 June 2024 (UTC)
Hm, corrected query finished a lot faster than the obviously-broken ones I ran before (which is usually to be expected, but I didn't think the data would cache that well). Something's not quite right - there shouldn't have that row with the completely blank namespace and 52458 existing-page edits, for starters, and that one creating a page in the MediaWiki namespace looks really suspicious - but otherwise the numbers seem at least plausible. —Cryptic 07:59, 29 June 2024 (UTC)
Pretty sure I know why the blank namespace happened, should be fixed when it completes again. The first-edit-was-to-namespace-8 was at MediaWiki:Campaigns-event-discovery-survey-question. —Cryptic 08:15, 29 June 2024 (UTC)
If we ran that for, say, last week, would that give us an estimate of how many first edits are being 'lost' or 'misplaced' due to deletion and moving pages? WhatamIdoing (talk) 03:34, 30 June 2024 (UTC)
Well, I can run the numbers (quarry:query/84512), but I don't know how much to infer from the comparison with the longer time period. The only edits that are going to be 'lost' are ones that have been revdeleted or oversighted, and in most cases - other than, perhaps for copyvio revdels - that will happen pretty quickly after they're made. I wouldn't care to guess how quickly a typical page gets moved between namespaces after creation, either. —Cryptic 08:17, 30 June 2024 (UTC)
Does it only lose revdel edits, and not ordinary/whole page deletion?
A lot of deletions for WP:UGLY articles happen after draftification, and borderline articles often don't get draftified for a couple of weeks. WhatamIdoing (talk) 01:00, 1 July 2024 (UTC)
Just revdelled and suppressed edits. Everything about deleted pages and edits except for edit summaries and page text is in the public database replicas; accounting for those is most of why the query is as complex as it is. —Cryptic 10:52, 2 July 2024 (UTC)
So here are a few things I notice:
  • Last week, 4,754 newbies made their first edit on an existing page and 1,842 newbies made their first edit to a new page.
  • Last week, if your first edit was to an existing page, then about 90% of the time, it was to the [i.e., a page most recently in] mainspace.
  • Last week, if your first edit was to create a new page, then about 70% of the time, it was to the User: namespace and about 20% of the time, it was to the Draft: space.
  • Within the first week, about a third of User: space pages and a third of Draft: space pages get deleted.
  • In 2023, 342,087 newbies made their first edit to an existing page and 144,181 newbies made their first edit to a new page.
  • In 2023, if your first edit was to an existing page, then about 90% of the time, it was to the mainspace.
  • In 2023, if your first edit was to create a new page, then about 70% of the time, it was to the User: namespace and about 20% of the time, it was to the Draft: space.
  • Within the last 18 months, looking at pages created on the first edit during 2023, about a third of User: space pages and almost 95% of Draft: space pages were deleted.
    • Conclusion from last week vs 2023: If your User: space page is going to get deleted, it'll happen in the first week. Draft pages are much more likely to get deleted, but it takes longer.
  • Within the last 18 months, only 1.6% of pages created as the account's first-ever edit were moved to the mainspace. 20% of those were also subsequently deleted. One in 75 first-edit page creations are still visible in the mainspace.
    • Actual User: pages probably aren't meant to get moved to the mainspace, and the existence of these pages depresses the overall 'success' numbers. An unknown proportion User sandbox pages probably are meant to move to the mainspace, while others are probably meant to be used for test edits. We don't have a good way to differentiate between these two types of user space contribution.
WhatamIdoing (talk) 17:17, 2 July 2024 (UTC)
Within the last 18 months, looking at pages created on the first edit during 2023, about a third of User: space pages and almost 95% of Draft: space pages were deleted. - this is a shocking statistic but if I understand correctly it can't be taken at face value because the namespace is where the page is now, right? That is, 95% of drafts that were never moved to mainspace are deleted, which is explicable in terms of G13 (I assume the 5% is drafts less than six months old?) – Joe (talk) 20:35, 2 July 2024 (UTC)
@Joe Roe (mobile), all of the drafts created in 2023 are more than six months old at this point. There were probably one or two drafts that were a few hours short of being six months old when the query was run, so it's like 0.05%. The 5% that have been retained were probably created later in the year and had an edit made (by anyone) since then. For example, the 5% probably includes articles that were submitted to AFC in December, declined in January, and will be deleted – just not quite yet.
The namespace listed is the namespace in which either the page exists now (more specifically, at 30 Jun 2024 08:21:10 UTC) or where the page existed at the time that it was deleted (e.g., ordinary CSD or AFD, not revdel or OS/suppression – those latter ones are invisible to the query). WhatamIdoing (talk) 21:14, 2 July 2024 (UTC)

Articles that are missing in another language

I'm hoping to get a list of articles that are tagged as being in Wikipedia:WikiProject Climate change and that do not exist in the Japanese Wikipedia. Any help from the lovely volunteers here would be very much appreciated. Cheers, Clayoquot (talk | contribs) 15:03, 16 July 2024 (UTC)

This can't be done with a sql query - interwiki links aren't in the main database, and jawiki is on a different database server than enwiki. Someone more familiar with WP:PetScan than I might be able to get a result with it. Also worth asking at d:WD:RAQ; they do have access to the interwikis over there, and I seem to recall there's a way to get at wikiproject ratings as well. —Cryptic 17:13, 16 July 2024 (UTC)
interwiki links aren't in the main database - aren't they in langlinks? * Pppery * it has begun... 17:20, 16 July 2024 (UTC)
Erm. Wow. That's egg on my face. I'm sure I've looked for them before, and had thought that table went away when language links were migrated to Wikidata. Hang on. —Cryptic 17:25, 16 July 2024 (UTC)
quarry:query/84886. —Cryptic 17:34, 16 July 2024 (UTC)
Amazing! Thank you so much Cryptic and Pppery! Clayoquot (talk | contribs) 19:43, 16 July 2024 (UTC)

Find pages removed from a category

Is it possible to find a list of changes to a category's members, that are older than RecentChanges? Asking for Wikipedia:Categories for discussion/Log/2024 July 5#Category:Athletes by location in Greece. — Qwerfjkltalk 15:42, 17 July 2024 (UTC)

I'm pretty sure not, unless it was depopulated by a bot, or was old enough that you can find the cat page with the Wayback Machine. —Cryptic 19:27, 17 July 2024 (UTC)

Composition categories without templates

Hi all, if possible, I'd like to request a list of any subcategories in Category:Compositions by composer (which have more than 2 entries) in which there is no corresponding navigational box created for said composer's works.

So essentially, I'd be looking for categories like Category:Compositions by Example Person which have more than 2 pages, but there is no corresponding {{Example Person}} that exists. I'd then go through the list and create the missing templates myself.

Thanks – Aza24 (talk) 22:35, 28 July 2024 (UTC)

quarry:query/85143. —Cryptic 23:24, 28 July 2024 (UTC)
At least one false negative that I noticed by accident: Category:Compositions by Dmitry Bortniansky isn't on the list despite Template:Dmitry Bortniansky not existing, since it only has one member, Category:Operas by Dmitry Bortniansky, even though that category has two members. Trivial to rerun the query without the minimum-number-of-members constraint, while still showing a count of members (and a list of the members' namespaces) if you want. —Cryptic 23:31, 28 July 2024 (UTC)
Thank you so much @Cryptic, that should work well. Question: will this link stick around or should I copy the results somewhere for my own use? Aza24 (talk) 00:24, 29 July 2024 (UTC)
It'll stay in place (and the results won't change even when the underlying data here does). It would probably be convenient to copy it into a sandbox page, though, so you can click on the links directly. The cyan "Download data" dropdown has an option for wikitable format. —Cryptic 00:29, 29 July 2024 (UTC)
Yes, good point! Thanks again. Aza24 (talk) 04:28, 29 July 2024 (UTC)

Find duplicated file licensing templates

Hello, I'd like a query for every file that use any of these templates twice (the same template twice, not one usage of one template and one usage of another). For example File:Madonna Frozen Sickick.png should be one of the results. Thanks! Jonteemil (talk) 13:47, 27 July 2024 (UTC)

Not possible with a query: the table showing transclusions doesn't have duplicates (it's the same one used to generate Special:Whatlinkshere), and the replicas don't have the page text. Best I can think of is using search one template at a time, like so. —Cryptic 18:58, 27 July 2024 (UTC)
Okay, thank you. Jonteemil (talk) 15:04, 7 August 2024 (UTC)

Heavy uses of Template:IETF RFC

I'm interesting in finding the articles that make heavy use of Template:IETF RFC. Would it be possible to rank the top-10 articles with the most calls ("transclusions"?) of that template, please? Thanks! fgnievinski (talk) 01:20, 15 August 2024 (UTC)

Not here, for the same reason as #Find duplicated file licensing templates above. —Cryptic 01:30, 15 August 2024 (UTC)
It might be feasible to do it manually, though - there's only 817 articles that transclude it. You could scrape the wikitext for all of them, parse them for transclusions, and count those. There's only a couple redirects to confuse things, though if there's any indirect transclusions (i.e., through a different template), that'd make things harder. —Cryptic 01:35, 15 August 2024 (UTC)
Understood, thank you. I've asked around:
fgnievinski (talk) 02:35, 15 August 2024 (UTC)
Both of those projects are almost certainly pulling their data from the same place, hence with the same limitations. —Cryptic 03:01, 15 August 2024 (UTC)
Thanks for the heads-up. I asked because I'm not fluent in Wiki scrapping. fgnievinski (talk) 03:22, 15 August 2024 (UTC)
As above, search can sort of do this. this looks for "{{", zero or more spaces, "ietf", any single character, then "rfc", case insensitive, with all of that occurring at least forty times in the same mainspace page (both redirects to {{IETF RFC}} with any transclusions at all match that pattern). It finds 13 articles. It could very well be missing some, though; for example, {{IETF RFC|1234}} with extra spaces in the middle would render normally but not be searchable by this. —Cryptic 03:34, 15 August 2024 (UTC)
Wow wow, that's awesome, many thanks!!! fgnievinski (talk) 04:17, 15 August 2024 (UTC)

AfD and article deletion statistics

Hello lovely volunteers. I'm trying to calculate 1) The number of articles deleted in 2022, and 2) The percentage of article deletions in 2022 that were the result of an AfD discussion. I think existing queries do this but would appreciate it if someone could check my interpretation of the queries.

Here are my assumptions:

  1. My definition of an "article" is a mainspace page that is not a redirect. I don't mind including dab pages, pages with no links, etc.
  2. I assume that all articles deleted as a result of an AfD discussion have "Articles for deletion" in the log reason
  3. From quarry:query/78694 I see that in 2022, 109,583 mainspace pages were deleted in 2022
  4. From quarry:query/78460 I see that in 2022, 37,297 pages were deleted with "redirect overwrite" in the log reason
  5. From quarry:query/78460 I see that in 2022, 5380 pages were deleted with "Redirects for discussion" in the log reason
  6. From quarry:query/78460 I see that in 2022, 13,635 pages were deleted with "Articles for Deletion" in the log reason
  7. I assume that the number of mainspace redirects not accounted for in the above queries is negligible or too difficult to get (is this true?)

Therefore it appears that:

  • In 2022, 66,906 articles were deleted (109,583 - 37,297 - 5380), an average of 183 per day
  • In 2022, 13,635 articles were deleted via AfD, an average of 37 per day
  • In 2022, 20% of article deletions were the result of an AfD discussion.

Does this look right? Clayoquot (talk | contribs) 19:38, 26 August 2024 (UTC)

#3 and #4 are accurate; #5-7 aren't. And #3 isn't either if you'd count a move out of mainspace without leaving a redirect as deletion.
Query 78460 doesn't do quite what you think it does. It's not showing the total number of deletions whose logs mention A7, the total number that mention AFD, etc; it tries to assign each deletion to a single reason in a given order. So if, say, a page was deleted with comment "Wikipedia:Articles for deletion/Spacely Sprockets; also a WP:CSD#G11" it would be counted as G11 and not AFD.
And there's no way to find out if a page was a redirect or not when it was deleted, other than if the comment mentions an R-series criterion (which that query doesn't look for) or RFD, or if it's the automatic deletion during a page-move. You could conceivably look at the length of the most recent deleted revision, but there's lots of redirects with more bytes in them than lots of short articles. I'd have no confidence in any query of the public replicas that purported to accurately count the number of redirects deleted. But if you were trying for a SWAG, you can still do a lot better than this query - sum the automatic overwrites, plus the R-series speedies, plus the RFDs, plus some proportion of appropriate-looking G6s and G8s (G14s using its redirect clause aren't ever distinguishable from other G14s in my experience). —Cryptic 20:09, 26 August 2024 (UTC)
Excellent points, thanks. Re #6, I also just realized that when an article is deleted at AfD, its Talk page(s) and redirects to it are also deleted with "Articles for Deletion" in the log reason. The best way to count articles deleted via AfD is User:JPxG/Oracle as far as I can tell. According to that page, in 2022, the daily average number of AfD deletion discussions that resulted in “Delete” or “Speedy Delete” was 33. I'll work on refining this. Cheers, Clayoquot (talk | contribs) 23:50, 27 August 2024 (UTC)

Active admins

I'm trying to figure out how many of the admins listed at Wikipedia:Active admins are also making more than a thousand edits a month. See WT:RFA for why I'm interested in doing that. Clovermoss🍀 (talk) 22:28, 26 August 2024 (UTC)

56, give or take. That's not quite what you're asking; it's the number of current admins (including adminbots) with more than 1000 edits in the last 30 days. I didn't crossref the "active admins" page since they should be mostly the same, give or take recent desysops (Pppery would've just missed the list with 956 edits) or re-activations. Going back more than 30 days is much slower, roughly five or six minutes per month (including the first). —Cryptic 23:07, 26 August 2024 (UTC)
Okay, thanks. Clovermoss🍀 (talk) 23:08, 26 August 2024 (UTC)
By the way, 1000 edits a month is a lot. I used to hit this back when I was backlog crushing in 2021. Xtools. Nowadays I only watchlist, then spend the rest of my time on coding which doesn't really increase my edit count much, and I no longer hit 1000 edits a month, even though I edit every day. So be careful, such a high threshold may exclude some active admins such as myself who spend hours on wiki every day. –Novem Linguae (talk) 10:17, 27 August 2024 (UTC)
I agree. I'm rarely in the 1,000+ edits a month category myself. [3] If we have good admins that don't have crazy high edit counts once they get the bit, it stands to reason that we're missing out on some good potential admins due to editcountitis. The whole point of this query was inspired by my essay and my comments about it in the above thread. Clovermoss🍀 (talk) 10:25, 27 August 2024 (UTC)
Groovy. If it helps, I think the de facto edit counts to pass RFA nowadays are 8000 total (due to 0xDEADBEEF passing with this recently, and no one passing with anything lower since GoldenRing in 2017) and a couple hundred edits a month for the last 6 months or so (enough to show that you aren't inactive). –Novem Linguae (talk) 17:05, 27 August 2024 (UTC)
My argument is that a couple hundred edits a month... to show that you aren't inactive may be part of the problem. Clovermoss🍀 (talk) 17:08, 27 August 2024 (UTC)
@Clovermoss: If you have a candidate for adminship in mind I'd be happy to show them how to fix typos on a large scale. There are many tasks (that may be a bit behind the scenes and boring) that require making many edits. Polygnotus (talk) 17:24, 27 August 2024 (UTC)
Whereas I've averaged 1000+ edits per month almost every year since getting the bit, and I don't appear on Cryptic's list of 56. Been doing a lot of gardening this summer, I guess. :D Valereee (talk) 11:41, 28 August 2024 (UTC)
I'm on the list but there's only been 4 months since 2018 where I've hit the 1000+ threshold. Clovermoss🍀 (talk) 11:43, 28 August 2024 (UTC)
Valereee, your count from a similar query without the 1000-edit cutoff is 584. That's roughly correct looking at your contributions; it's a simple count of edits in the past 30 days. It won't have counted the two deleted edits you have from that period. —Cryptic 12:30, 28 August 2024 (UTC)
Oh, yes, I typically do have fewer edits this time of year. I was more thinking of averages -- that is, if we're looking for potential candidates who average (say) 1000 edits per month, we wouldn't capture even someone who averages more than 1000 but not in the past month, if you see what I mean. I tend to do more editing when it's dark 13 hours a day, so in January I'd almost definitely be captured. In a typical August, possibly not. Valereee (talk) 12:47, 28 August 2024 (UTC)
Well, by happy coincidence, I just ran this query. It shows admins (again, only users who currently have the bit) who have at least 1000 live edits from at least one of the past four calendar months, i.e. April-July. 100 qualified in at least one month; 60 of them, including at least three bots, averaged over 1000 for the entire period. —Cryptic 12:50, 28 August 2024 (UTC)
Oh, wow, that is fascinating, thank you! 3 our of 5 who made 1000 edits at least once in the past four months also averaged more than 1000 over that period! People who are in are in, I guess?
Could we run a similar for non-admins with, say, at least a year's experience, at least 10K edits, and 1000 edits at least once over the past four months, or would that be a headache/turn out a way too huge number? Valereee (talk) 13:04, 28 August 2024 (UTC)
It'll come up at quarry:query/85881 eventually. That's going to be a lot slower, since it has to look at all edits in that time period instead of just those made by the 854 current admins. —Cryptic 13:13, 28 August 2024 (UTC)
Oh, very cool, thank you so much! Valereee (talk) 13:20, 28 August 2024 (UTC)

section headers starting with a lowercase "w" followed by a uppercase letter or a number

I discovered something weird, see Wikipedia:Help_desk#Many_W's. Is it possible to run a query or write a regex search or whatever that meets these criteria:

Finds section headers (no matter how many levels deep) that start with a lowercase "w" immediately followed by a uppercase letter or a number. Polygnotus (talk) 15:05, 28 August 2024 (UTC)

The database replicas don't have article text, so the only place these would show up is if there are redirects to those sections.
Regex searches aren't indexed, so don't work well on their own; you have to pair them with something that is, like a category or template or outgoing link or normal search. But if you had that, I'd think insource:/== *w[A-Z0-9]/ would find just about all of these; it'd miss weird space characters, which should be uncommon anyway, and level-1 headers, where you'd get too many false positives with template parameters. Example search, paired with "meters". —Cryptic 15:23, 28 August 2024 (UTC)
Thank you! I think I have them all. I am not worried about level 1 headers because I think that every header that follows this pattern is a subheader of something like "Women's events" or "Women's results" or similar. Polygnotus (talk) 15:34, 28 August 2024 (UTC)

All categories with "LGBT" in their title

Would it be possible to get a list of all categories with LGBT (without a Q) in their title? They all need to be WP:C2D-renamed to change from LGBT to LGBTQ following Talk:LGBTQ#Requested move 14 August 2024, and a query which can be redone at will to get the currently-outstanding list would be helpful. Thanks, HouseBlaster (talk • he/they) 03:08, 31 August 2024 (UTC)

Try this. Let me know if it needs adjusting :) –Novem Linguae (talk) 03:22, 31 August 2024 (UTC)
As always, you are amazing, Novem! Exactly what I had in mind. Thank you so much :) HouseBlaster (talk • he/they) 03:30, 31 August 2024 (UTC)
@Novem Linguae: would it be possible to add a check to make sure the category is not a member of Category:All categories for discussion (and thus is not already nominated)? Thanks, HouseBlaster (talk • he/they) 03:14, 1 September 2024 (UTC)
Here you go. I've also made the check against 'LGBT' (but not 'LGBTQ') case-insensitive, which finds a handful more around Category:British lGBT entertainers and Category:Irish lGBT entertainers. —Cryptic 03:58, 1 September 2024 (UTC)
Thank you, Cryptic!! HouseBlaster (talk • he/they) 04:14, 1 September 2024 (UTC)

Typical page views

I would like a sentence in Wikipedia:Statistics#Page views that says something like "As of 2024, the median page views for an article is 6 per day, with n% of articles falling into the range of 3 to 50 page views per day" (where 50% ≤ n ≤ 90%, to show the middle range). I think a sample from 10,000 articles (e.g., User:BilledMammal/Average articles) or even just 1,000 would be sufficient.

I have found Wikipedia:Does Wikipedia traffic obey Zipf's law? but it's based on very old data. WhatamIdoing (talk) 18:41, 10 September 2024 (UTC)

Wikipedia:Request a query/Archive 3#Shortdesc query looks similar. WhatamIdoing (talk) 19:04, 10 September 2024 (UTC)
View counts aren't exposed in the public database replicas AFAIK, so this isn't possible other than by hitting the API once for each page. —Cryptic 19:24, 10 September 2024 (UTC)
You can try plugging quarry query 80241 or pagepile 60602 - they're the same sample of 10000 pages - into https://pageviews.wmcloud.org/massviews, but it only gave me data on about 3/4 of them and a ton of errors on the rest. —Cryptic 19:45, 10 September 2024 (UTC)
Thanks. I was able to get the pageviews by running it in two batches. I'm looking at the numbers now. WhatamIdoing (talk) 20:23, 10 September 2024 (UTC)
After excluding ~200 pages with 0 reported page views from 2023 (because they might have been created during 2024), I find: mean of 5,725 page views per year, median of 55 page views per year, mode of 1. 25% of articles get 3 page views per year(!) or less.  75% of articles get 570 page views per year — or less. Almost exactly 10% of articles average 10+ page views per day. WhatamIdoing (talk) 21:05, 10 September 2024 (UTC)

Cross Reference Request

Would someone be able to cross-reference the links in this list Green Bay Packers All-Time Roster with to verify that every target page in that list is also categorized in that category. The goal for me here is to make sure the links in the list are targeted to the right article. As an example, the list had Rex Smith, when it should have had Rex Smith (American football) as he correct target. Thus, this cross reference will identify any disambiguation that needs to occur on that page for links that aren't disambiguation pages (which I have already fixed). Thank you! « Gonzo fan2007 (talk) @ 22:14, 18 September 2024 (UTC)

quarry:query/86383. —Cryptic 01:11, 19 September 2024 (UTC)
Cryptic, thank you! The query appears to be pulling the links from {{Green Bay Packers}}, which is on the page. Any way to exclude the template in the query? « Gonzo fan2007 (talk) @ 17:53, 19 September 2024 (UTC)
Only if you don't mind if links that appear on both the template and the page itself are excluded, too. —Cryptic 00:26, 20 September 2024 (UTC)
I dont mind! Thank you! « Gonzo fan2007 (talk) @ 00:59, 20 September 2024 (UTC)
Same query, updated in-place. —Cryptic 01:12, 20 September 2024 (UTC)
Thank you Cryptic! « Gonzo fan2007 (talk) @ 17:43, 20 September 2024 (UTC)

submitted drafts

How do I check my submitted drafts — Preceding unsigned comment added by 79lives (talkcontribs) 12:07, 4 October 2024 (UTC)

Just currently-submitted ones, or ones that have been accepted, too?
If the first, you can see these by going to Special:Contributions/79lives, expanding "Search for contributions", picking "Draft" from the "Namespace" dropdown, and clicking the blue "Search" button. That gives you this.
If the second, Special:Log/create/79lives comes close. —Cryptic 13:09, 4 October 2024 (UTC)

how many transclusions of a list of templates

Yeah, I can repeatedly fill in the blank but I'd really rather not have to do that. But, I don't know anything about database queries; hence my request.

Given a list of templates, is it possible to fetch each template's transclusion count and return both the template name and the count for each one? The templates that I have in mind are a filtered subset of the templates listed at Category:Lang-x templates.

Trappist the monk (talk) 16:14, 6 October 2024 (UTC)

Yes. Easiest way to filter the list is to make a user subpage with links to each one you want counted. Or quarry:query/86855 for every template in that category. —Cryptic 18:56, 6 October 2024 (UTC)
The every template in that category query is just fine; I should have said that the category is lightly filtered so those that I don't care about can be removed by hand.
Thank you very much.
Trappist the monk (talk) 19:23, 6 October 2024 (UTC)

Office Actions across all wikis

Is there a way to run Q87329 on all the wiki-DBs without manually setting the database every time? Thanks, TrangaBellam (talk) 19:25, 21 October 2024 (UTC)

Ah, not possible, it seems. TrangaBellam (talk) 19:29, 21 October 2024 (UTC)
Not via Quarry, but I could do it via wikitech:PAWS. Excluding Meta, MediaWiki.org, and testwiki which aren't content projects so have lots of false positives, this gives the following results:
Extended content
commonswiki
1
b'20170618133428' b'18' b'06' b'2017' b'Freedom_of_Panorama_ZA' b'protect' b'Seddon (WMF)' b'a:4:{s:14:"4::description";s:112:"\xe2\x80\x8e[edit=sysop] (expires 13:34, 18 September 2017 (UTC))\xe2\x80\x8e[move=sysop] (expires 13:34, 18 September 2017 (UTC))";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:14:"20170918133428";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:14:"20170918133428";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:248247113;}'
enwiki
5
b'20241021032041' b'21' b'10' b'2024' b'Asian_News_International_vs._Wikimedia_Foundation' b'protect' b'WMFOffice' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:1252387069;}'
b'20241021030808' b'21' b'10' b'2024' b'Asian_News_International_vs._Wikimedia_Foundation' b'protect' b'WMFOffice' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:1252385692;}'
b'20221020124933' b'20' b'10' b'2022' b'List_of_prime_ministers_of_the_United_Kingdom_by_length_of_tenure' b'modify' b'Seddon (WMF)' b'a:4:{s:14:"4::description";s:124:"\xe2\x80\x8e[edit=autoconfirmed] (expires 12:49, 3 November 2022 (UTC))\xe2\x80\x8e[move=autoconfirmed] (expires 12:49, 3 November 2022 (UTC))";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:13:"autoconfirmed";s:6:"expiry";s:14:"20221103124933";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:13:"autoconfirmed";s:6:"expiry";s:14:"20221103124933";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:1117200886;}'
b'20170925233803' b'25' b'09' b'2017' b'Senford_High_School' b'unprotect' b'Jalexander-WMF' b'a:0:{}'
b'20150727021026' b'27' b'07' b'2015' b'Lois_Lee' b'protect' b'Philippe (WMF)' b'\xe2\x80\x8e[edit=sysop] (expires 02:10, 27 August 2015 (UTC))\xe2\x80\x8e[move=sysop] (expires 02:10, 27 August 2015 (UTC))\n'
foundationwiki
9
b'20190315192235' b'15' b'03' b'2019' b'Tax_Deductibility' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122763;}'
b'20190315192220' b'15' b'03' b'2019' b'Tax_Deductibility/ru' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122762;}'
b'20190315192208' b'15' b'03' b'2019' b'Tax_Deductibility/nl' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122761;}'
b'20190315192159' b'15' b'03' b'2019' b'Tax_Deductibility/ja' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122760;}'
b'20190315192150' b'15' b'03' b'2019' b'Tax_Deductibility/it' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122759;}'
b'20190315192142' b'15' b'03' b'2019' b'Tax_Deductibility/fr' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122758;}'
b'20190315192133' b'15' b'03' b'2019' b'Tax_Deductibility/es' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122757;}'
b'20190315192122' b'15' b'03' b'2019' b'Tax_Deductibility/de' b'unprotect' b'Pcoombe (WMF)' b'a:1:{s:17:"associated_rev_id";i:122756;}'
b'20171214175032' b'14' b'12' b'2017' b'Values' b'protect' b'Awjrichards (WMF)' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:112467;}'
frwiki
3
b'20240726050944' b'26' b'07' b'2024' b'Fran\xc3\xa7ois_Billot_de_Lochner' b'unprotect' b'WMFOffice' b'a:0:{}'
b'20240323002802' b'23' b'03' b'2024' b'Laurent_de_Gourcuff' b'protect' b'WMFOffice' b'a:3:{s:14:"4::description";s:57:"\xe2\x80\x8e[create=sysop] (expire le 21 mars 2026 \xc3\xa0 23:00 (UTC))";s:14:"5:bool:cascade";b:0;s:7:"details";a:1:{i:0;a:3:{s:4:"type";s:6:"create";s:5:"level";s:5:"sysop";s:6:"expiry";s:14:"20260321230000";}}}'
b'20240105181624' b'05' b'01' b'2024' b'Fran\xc3\xa7ois_Billot_de_Lochner' b'protect' b'WMFOffice' b'a:3:{s:14:"4::description";s:26:"\xe2\x80\x8e[create=sysop] (infini)";s:14:"5:bool:cascade";b:0;s:7:"details";a:1:{i:0;a:3:{s:4:"type";s:6:"create";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";}}}'
labswiki
2
b'20231209022707' b'09' b'12' b'2023' b'Country_protection_list' b'protect' b'Neil Shah-Quinn (WMF)' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:2134283;}'
b'20210826223419' b'26' b'08' b'2021' b'SRE/SRE_Clinic_Duty/Access_requests' b'protect' b'Neil P. Quinn-WMF' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:1923387;}'
outreachwiki
3
b'20200403154633' b'03' b'04' b'2020' b'Education' b'modify' b'SPatnaik (WMF)' b'a:4:{s:14:"4::description";s:72:"\xe2\x80\x8e[edit=autoconfirmed] (indefinite)\xe2\x80\x8e[move=autoconfirmed] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:13:"autoconfirmed";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:13:"autoconfirmed";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:215689;}'
b'20191010155211' b'10' b'10' b'2019' b'Education/Greenhouse/Online_Course_Badges' b'protect' b'SPatnaik (WMF)' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:201954;}'
b'20190122174727' b'22' b'01' b'2019' b'Education' b'protect' b'SPatnaik (WMF)' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:189042;}'
testcommonswiki
1
b'20190107183928' b'07' b'01' b'2019' b'Main_Page' b'protect' b'Jdforrester (WMF)' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:27;}'
wikimania2017wiki
1
b'20170613212618' b'13' b'06' b'2017' b'Registration' b'protect' b'EYoung (WMF)' b'a:4:{s:14:"4::description";s:56:"\xe2\x80\x8e[edit=sysop] (indefinite)\xe2\x80\x8e[move=sysop] (indefinite)";s:14:"5:bool:cascade";b:0;s:7:"details";a:2:{i:0;a:4:{s:4:"type";s:4:"edit";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}i:1;a:4:{s:4:"type";s:4:"move";s:5:"level";s:5:"sysop";s:6:"expiry";s:8:"infinity";s:7:"cascade";b:0;}}s:17:"associated_rev_id";i:31128;}'
zhwiki
1
b'20180227193027' b'27' b'02' b'2018' b'AddisWang' b'protect' b'WMFOffice' b'a:3:{s:14:"4::description";s:37:"\xe2\x80\x8e[create=autoconfirmed] (\xe6\x97\xa0\xe9\x99\x90\xe6\x9c\x9f)";s:14:"5:bool:cascade";b:0;s:7:"details";a:1:{i:0;a:3:{s:4:"type";s:6:"create";s:5:"level";s:13:"autoconfirmed";s:6:"expiry";s:8:"infinity";}}}'

The only new office action we didn't already know about found this way is the autoconfirmed protection of zh:AddisWang. * Pppery * it has begun... 19:36, 21 October 2024 (UTC)

Thanks a lot! FYI, @Bri:: The ANI action indeed seems to be a very rare case. TrangaBellam (talk) 19:41, 21 October 2024 (UTC)

Uncategorized redirects

Looking for a quarry query that can evaluate how many main space redirects do not current have redirect categories added to them. Hey man im josh (talk) 15:12, 15 October 2024 (UTC)

It's likely to take a long time to run. Do you want a list or just a number? —Cryptic 17:15, 15 October 2024 (UTC)
I do understand it'd likely take a while, so for now just a number. If it's not much more work, I think the number by year could also be helpful. It's for use in a discussion regarding edit filter 1,298. Hey man im josh (talk) 17:18, 15 October 2024 (UTC)
There's 6265917 total (counts by year of first edit). Sample of 10000. —Cryptic 17:55, 15 October 2024 (UTC)
Awesome, thank you so much! Hey man im josh (talk) 17:56, 15 October 2024 (UTC)

More redirect queries

As discussed with Cryptic on his talk page, I'm listing some words/cats for the queries to filter with.
Query 2: Redirects with these words in title: Journal, journal, (Journal), (journal)
Query 3: Redirects to articles in these Categories: Academic publishing companies, Non-profit academic publishers, Open access publishers Nobody (talk) 06:46, 16 October 2024 (UTC)

Journal/journal, in those categories. (A version of the latter letting you sort on the category, but containing multiple rows for each redirect, one for each matched category the target's in.) —Cryptic 20:11, 16 October 2024 (UTC)
@Cryptic Would it be possible to search for redirects to articles in specific WikiProjects? Like WP:CH or WP:AJ for example. Nobody (talk) 12:42, 25 October 2024 (UTC)
If their talk pages are categorized or transclude a wikiproject template. —Cryptic 12:55, 25 October 2024 (UTC)
Could you make me a query for redirects like Chateau Chillion, that have no rcats and redirect to an article with {{WikiProject Switzerland}} on the talk page? Thanks Nobody (talk) 13:01, 25 October 2024 (UTC)
quarry:query/87427 has the first ten thousand. There's 23765 total (of 52629 mainspace redirects targeting such pages). —Cryptic 13:44, 25 October 2024 (UTC)