
User talk:BrownHairedGirl/Archive/Archive 072

From Wikipedia, the free encyclopedia
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on my current talk page

A citation barnstar for you!

The Citation Barnstar
Much thanx for your tireless work feeding Citation bot articles to fix! :) Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 15:49, 20 August 2022 (UTC)[reply]
Thanks, Whoop! BrownHairedGirl (talk) • (contribs) 15:57, 20 August 2022 (UTC)[reply]
PS @Whoop: I should have made it clear that 99.9% of the credit for the current flood of Citation bot edits is due to Citation bot's wonderful maintainer @AManWithNoPlan.
AManWithNoPlan made three changes to allow all this to happen:
  • About 2 months ago, AManWithNoPlan massively expanded the capacity of Citation bot. Previously it had been limited to only 4 channels; now it has over 40. That allows the bot to handle many many more big jobs, and it still always has capacity to promptly handle individual page requests. The technical barriers to doing this were high, but AManWithNoPlan persevered and made it happen. I hate to think how much of his time it must have taken.
  • About a week ago, he created a special "missing title" only mode for Citation bot, which massively reduces the workload of trying to fill bare URLs and missing titles, 'cos instead of its normal mode of checking every ref, it skips those which do have a title. On some articles, such as big lists or copiously referenced articles (e.g. Boris Johnson, which currently has 948 refs), this reduces the bot's workload by over 99%.
    And to add to the goodness, this missing-titles-only mode has a hugely increased timeout, to help with sluggish websites where the bot previously got no reply. That has allowed the bot to fill thousands of titles where it previously failed: in the last week, it has filled about 10% of the remaining bare URLs.
  • A few days ago, AManWithNoPlan enabled the retrieval of correct titles to replace the |title=Archived copy added by @InternetArchiveBot.
    Without the previous changes, this one would have swamped the bot as it worked to fill the backlog of over 200K pages with |title=Archived copy; everything else would have been squeezed out. But now the bot can chomp through that backlog while still letting normal work continue. About 50,000 pages have already been removed from Category:CS1 maint: archived copy as title, as the correct titles have been added. Genius.
All I have done is to make a few lists of articles with |title=Archived copy, from Category:CS1 maint: archived copy as title, and feed them to the bot twice a day. Easy-peasy.
So ... if you have any more barnstars in your stash, AManWithNoPlan would be a very worthy recipient. BrownHairedGirl (talk) • (contribs) 08:28, 21 August 2022 (UTC)[reply]
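The "missing title" only mode described above can be pictured as a filter over a page's citation templates. The following is a rough sketch in Python, assuming a simplified regex view of non-nested `{{cite ...}}` templates; the function name `refs_needing_title` and the placeholder list are hypothetical, and this is not Citation bot's actual code.

```python
import re

# Simplified assumption: matches only cite templates with no nested templates.
CITE_RE = re.compile(r"\{\{\s*cite\s+\w+[^{}]*\}\}", re.IGNORECASE)
# Matches |title= but not |trans-title=, |script-title= or |title-link=.
TITLE_RE = re.compile(r"\|\s*title\s*=\s*([^|}]*)")

# Hypothetical list of values treated as "no real title".
PLACEHOLDER_TITLES = {"", "archived copy"}

def refs_needing_title(wikitext):
    """Return cite templates whose |title= is missing, empty, or a placeholder."""
    needing = []
    for match in CITE_RE.finditer(wikitext):
        template = match.group(0)
        title = TITLE_RE.search(template)
        value = title.group(1).strip().lower() if title else ""
        if value in PLACEHOLDER_TITLES:
            needing.append(template)
    return needing
```

On a heavily referenced page, only the templates returned by such a filter would need a network lookup, which is where the large workload reduction would come from.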

On page Category:1800 crimes in Great Britain, template {{YYYY crimes in countryname category header}} produces an error at the time of writing:

Template:YYYY crimes in countryname category header cannot find the adjectival form of the noun "Great Britain".
For more info and suggested remedies, see Template:GetAdjectiveFromCountryName#Country name.

The section link Template:GetAdjectiveFromCountryName#Country name suggests that section "Country name" is supposed to exist in template's documentation page Template:GetAdjectiveFromCountryName/doc. Should the documentation exist or is the error message incorrect? —⁠andrybak (talk) 22:57, 21 August 2022 (UTC)[reply]

@Andrybak: this template system has never been used that far back, so this issue has only just arisen.
AFAICR, this system relies on a bidirectionally unique relationship between country and demonym. But that poses a problem before 1801, 'cos "Great Britain" shares a demonym with "United Kingdom".
It is after midnight here in Ireland, and I am too tired now to devise a workaround ... but I will sort it tomorrow. BrownHairedGirl (talk) • (contribs) 23:13, 21 August 2022 (UTC)[reply]

Citation bot on Steele dossier


I saw this type of improvement and would like you to do that for the Steele dossier article. Thanks. -- Valjean (talk) (PING me) 05:15, 21 August 2022 (UTC)[reply]

Hi @Valjean
I tried, but Citation bot has been banned from Steele dossier.
I checked why that is, and found that the ban was added in Nov 2020, in this edit[1] by ... @Valjean.
That followed this revert[2] by Valjean of an edit by the bot which correctly changed some parameters in the cite templates. The bot was correct to use the |work= for BBC and for Associated Press. In particular, Valjean was wrong to restore |agency=Associated Press for https://apnews.com/d4d838725aa34441b5ae018c31e609a4 , because in that case Associated Press is publishing the article itself. The |agency= parameter is for when an article appears in another work (e.g. a newspaper), and is credited to a news agency rather than to a named author.
Since the ban was mistaken, I will remove the {{bots|deny=Citation bot}}. BrownHairedGirl (talk) • (contribs) 06:52, 21 August 2022 (UTC)[reply]
Removed[3]. BrownHairedGirl (talk) • (contribs) 06:56, 21 August 2022 (UTC)[reply]
OK, @Valjean: Citation bot has made this edit[4] to Steele dossier.
I have reviewed the edit, which consists mostly of changes to template type and to parameter names, e.g.
  • changing |website=[[The Washington Post]] to |newspaper=[[The Washington Post]]
  • changing {{cite web}} to {{cite news}} for those citations.
Note that this type of change by Citation bot was endorsed at a recent RFC: Help talk:Citation Style 1#RfC:_Should_Citation_bot_use_cite_web,_or_cite_magazine,_or_cite_news?, which was closed on 31 July.
If you disagree with bot's edit, please do not revert it again, and do not ban the bot from the article. Instead, let's discuss any disagreement at User talk:Citation bot. BrownHairedGirl (talk) • (contribs) 07:37, 21 August 2022 (UTC)[reply]
I agree, and also about the current changes. Please run the bot again for these sources:
Newspapers: The New York Times, The Guardian, The Daily Telegraph, The Baltimore Sun, The Independent, The Wall Street Journal, The Hill (newspaper), Australian Financial Review, Tampa Bay Times, USA Today, The Arizona Republic, Daily Mirror, The Sunday Times, The Sydney Morning Herald, The New Zealand Herald, Los Angeles Times, Financial Times, Haaretz, The Fresno Bee, Miami Herald, The Scotsman,
Newsmagazines: Der Spiegel, The Week, Newsweek, Cayman Compass,
Magazines: New York (magazine), The Atlantic, Variety (magazine), Mother Jones (magazine), Vanity Fair (magazine), Washington Monthly, Forbes, The Spectator, National Review, Paste (magazine), The Nation, Slate (magazine), Columbia Journalism Review, Rolling Stone, Vice (magazine), Time (magazine), Foreign Policy, Fortune (magazine)
News agencies: Bloomberg News, Reason (magazine),
Thanks so much. -- Valjean (talk) (PING me) 17:07, 21 August 2022 (UTC)[reply]
@Valjean: it seems that you misunderstand how @Citation bot works.
All that I (or any other editor) can control is to ask the bot to process an article. Or alternatively, not ask it.
Once the bot has been invoked, I have no control over what it does to the page: it just does its standard, well-polished stuff.
So I won't be running it again on that article. There would be no point, 'cos it has just done its stuff. BrownHairedGirl (talk) • (contribs) 17:19, 21 August 2022 (UTC)[reply]
Okay, I'll have to figure this out. Thanks. I was under the impression one could ask it to do a certain job. -- Valjean (talk) (PING me) 03:11, 22 August 2022 (UTC)[reply]

Citation bot using control characters in titles.


You may not know about this situation.

I have noticed several examples of the "Citation bot" replacing "Archived copy" titles with ones that contain control characters, for example Hornady (I could not copy-n-paste the "Difference" URL). The Category:CS1 errors: invisible characters was recently empty; it is currently over 60 pages and increasing as I type.

I assume replacing "Archived copy" titles is going well. And these errors are unavoidable. User-duck (talk) 23:40, 23 August 2022 (UTC)[reply]

@User-duck: is this the diff[5]?
I am vaguely aware of this, 'cos @AManWithNoPlan mentioned it to me above.
However, I am not concerned about it, 'cos if I can't see a problem then I can't see a problem.
If you want to raise this, the place to do so is User talk:Citation bot. BrownHairedGirl (talk) • (contribs) 23:46, 23 August 2022 (UTC)[reply]
Less than one-tenth of a percent of the archive titles lead to this, and almost always the new title is better than "archived copy". AManWithNoPlan (talk) 01:13, 24 August 2022 (UTC)[reply]

Zotero run done and Archives are much improved


I am now watching the Unicode junk category, since occasionally the archives have them. AManWithNoPlan (talk) 18:12, 22 August 2022 (UTC)[reply]

@AManWithNoPlan: sorry, but I am not following you.
I dunno what the Unicode junk category is.
And my random sample tests above have about a 63% rate of removal-from-Category:CS1 maint: archived copy as title on first pass, which suggests that the residue after a complete first pass would be about 74K ... but the count is still 105,628.
I have been working my way through Category:CS1 maint: archived copy as title in batches working backwards from the end of alpha-sorted category lists. And the lists I am feeding to the bot are still getting a lot of hits.
It's wonderful progress so far -- the category size is halved, so at least 100K refs have been filled in only a few days, which is fantastic -- but we seem to be some way from even a complete first pass. BrownHairedGirl (talk) • (contribs) 17:56, 23 August 2022 (UTC)[reply]
https://wiki.riteme.site/wiki/Category:CS1_errors:_invisible_characters AManWithNoPlan (talk) 18:11, 23 August 2022 (UTC)[reply]
Thanks, @AManWithNoPlan. To be honest, I don't worry about them. If they are invisible, wossdaproblem? BrownHairedGirl (talk) • (contribs) 18:15, 23 August 2022 (UTC)[reply]
Sometimes they are total rubbish. On a positive note, they grab more attention than "archived copy" AManWithNoPlan (talk) 18:24, 23 August 2022 (UTC)[reply]
More attention is good, @AManWithNoPlan. Sometimes I wonder if me and thee may be the only ppl on the planet worrying about |title=Archived copy. BrownHairedGirl (talk) • (contribs) 18:29, 23 August 2022 (UTC)[reply]
Sometimes it is https://wiki.riteme.site/w/index.php?title=Isabel_Lestapier_Winqvist&diff=prev&oldid=1106188703 where the bot made things 98% better. And sometimes it is https://wiki.riteme.site/w/index.php?title=Faisal_Mekdad&diff=next&oldid=1106219015 where the archive is not good (in this case the archive had the wrong character set specified). By watching that category, I have been able to improve the bot a bunch. AManWithNoPlan (talk) 20:10, 23 August 2022 (UTC)[reply]
  • Update: @AManWithNoPlan, I have now finished feeding the bot with all the articles in Category:CS1 maint: archived copy as title. My last batches finished this morning, leaving 74,891 pages in the category. That tallies almost exactly with my estimate yesterday of about 74K.
    I have taken a random sample of ~8% of the remainder and fed them to the bot in two batches, of 3333 and 3334 articles (for ease of tracking: just search the contribs page for /333).
    I am expecting only trivial progress from this run, possibly no progress. But we'll see. My initial check of 4 edits found [6] which followed a recent edit by IAbot, and three which appeared to have been missed on the first pass: [7], [8], [9]. That's with only ~300 pages processed, and I stopped after finding those 4 edits, so maybe the final hit rate will be non-trivial. It may be useful to do another full pass.
    I manually checked a few articles from the first page of Category:CS1 maint: archived copy as title, and as expected they were nearly all PDFs; some with a .PDF extension, some not. The only non-PDF example in my tiny sample was a large set of refs in 31st Quebec Legislature, where the bot failed to find a title.
    I will try to get around to running a Perl script to count the PDF refs. --BrownHairedGirl (talk) • (contribs) 13:25, 24 August 2022 (UTC)[reply]
A lot of the remaining non-PDF ones have dubious titles, and they get rejected. I am now just running over all 75K interactively, and it asks me 'is this good?' AManWithNoPlan (talk) 15:25, 24 August 2022 (UTC)[reply]
Interactively, that's great, @AManWithNoPlan.
A huge job, but great work. BrownHairedGirl (talk) • (contribs) 15:27, 24 August 2022 (UTC)[reply]
@AManWithNoPlan: I just checked the bot's progress on my sample of remaining Archived copies. These CB contribs show 286 edits while processing 4,540 pages. That's a 6.3% edit rate, which is definitely useful.
So when this pair of batches has finished, I think I should process the other ~67K. Whaddaya reckon? BrownHairedGirl (talk) • (contribs) 17:30, 24 August 2022 (UTC)[reply]
Let me run it with the super stripped down version that only does archives. AManWithNoPlan (talk) 18:15, 24 August 2022 (UTC)[reply]
OK, @AManWithNoPlan. That minimalist version sounds like the best tool for the job. BrownHairedGirl (talk) • (contribs) 18:20, 24 August 2022 (UTC)[reply]
Turns out that several thousand of them have script-title set, so the bot now just deletes the 'archived copy' title for those. AManWithNoPlan (talk) 19:01, 24 August 2022 (UTC)[reply]
Excellent, @AManWithNoPlan! I love when a big chunk of the set can simply be eliminated like that. BrownHairedGirl (talk) • (contribs) 19:06, 24 August 2022 (UTC)[reply]

Bare Ref PDF/DOC/JPEG filling


I am building a tool to fill in links for PDF references. The traditional workflow is to manually access the page, find the URL in the source wikitext, paste it into the browser to get the title, then type or copy the title and paste it into the source wikitext.

My new tool handles everything but the title automatically; it integrates with AWB, so when I edit an article, a window appears that displays the PDF file and has a field to fill in the title (or skip it if it is a dead link or in a different language). This means that filling out the PDF ref is much faster than before, and will ease the clearing of the backlog. Last time I tried, the rate of filling was 5-10 bare PDFs a minute with slow internet. This compares with manual filling, which was about 1 bare PDF every 1-1.5 minutes.

Example diff of the script: https://wiki.riteme.site/w/index.php?title=April_6&diff=prev&oldid=1106024502

This may be applicable to other formats such as DOC or JPEG, but I have only tested it on PDF files.

Since I don't think I can handle all PDF references myself, I was wondering if it would be possible to temporarily set up a site where people could volunteer to fill in the bare PDF titles and then use resulting data in AWB runs. Thoughts? Rlink2 (talk) 21:57, 22 August 2022 (UTC)[reply]

@Rlink2: thanks for msg and news of your good work.
But before I reply to that, an overdue apology. I have messed you around for over a month by not replying to emails regarding BareRefBot, and your gentle reminders have been commendably polite when some exasperation would have been justifiable. The reason is simply that my regular wikitasks have been intensive, and in the last 6 weeks I have been under a lot of stress off-wiki due to business troubles which have no obvious remedy and are creating a lot of drama as we all tear our hair out. One of the ways I deal with stress is by narrowing my focus, which isn't a great response, but it's a habit with me. Sorry; I will try to get back to BareRefBot.
Now, back to this idea.
First off, I love that you are tackling this {{Bare URL PDF}} backlog. Category:Articles with PDF format bare URLs for citations currently has 36,379 articles, which is roughly 1/3 of all articles with bare URLs, and more than half of those tagged. New ones appear at a scary rate of 50–100 per day, and the tally of bare URL PDFs is not falling, in stark contrast to the steady fall in HTML-format bare URLs. Other non-HTML formats seem similarly short of progress.
So this idea of tool-assisted manual filling has to be the way to go. Yes, we need humans to identify the title; but finding and adding them entirely manually involves a horribly clunky workflow, as you have rightly identified.
Ideally we would have a wee button beside each bare URL ref, which popped up a window with a form to fill in and below that a scrolling zoomable panel of the linked page. But creating that is way beyond my programming skill, and maybe beyond what is possible with the Wikimedia interface.
So basing this tool on WP:AWB sounds like a good idea: much more feasible, and I am sure that you have once again done a great job with the programming.
However, it seems to me that the downside of using AWB is that it significantly narrows the user base. WP:AutoWikiBrowser/CheckPageJSON currently lists only 1,836 users (including bots) ... out of 37.0986 squazilion registered editors, and an unlimited number of IP editors. For those who are not technically-minded, AWB can be very daunting: I know several experienced editors who tried AWB, but abandoned it because they couldn't get their head around how to drive it with the precision needed to avoid errors.
Maybe I am being too pessimistic (an old vice of mine!), and maybe some of those 1,836 AWB users will volunteer; but I am pessimistic, because so few editors have climbed the much lower barriers to using WP:ReFill and WP:Reflinks.
That may have been partly what you had in mind with the idea of an external site where ppl could fill in titles. I am not sure about the technical aspects of that; Wikia may be a possibility, and MediaWiki is free to use if you have a suitable server. So I guess the technical issues are surmountable, but it's over 15 years since I ran a website (a major, groundbreaking one), and the environment has changed a lot since then. But my best guess is that for someone of your high talents and great perseverance, it shouldn't be too hard.
But but but ... isn't there a problem with verifiability? AIUI, you would be adding to en.wp articles a set of ref titles which you had not verified yourself, and which were supplied to you by people who may not be entirely anonymous, but whose ID is not linked to any Wikipedia account. Isn't that a huge vulnerability?
Sorry this is not more encouraging, but I hope it helps somehow. BrownHairedGirl (talk) • (contribs) 20:04, 23 August 2022 (UTC)[reply]
PS An afterthought.
How about analysing the bare URL PDFs, and listing them by declining number of uses for each URL?
My guess is that there is at least some clustering, and that with some cunning programming, one manual lookup could be used by a tool to fill refs on multiple pages.
I have used this approach a few times myself, when I stumble across the annoying case of an editor adding the same bare URL to multiple articles. BrownHairedGirl (talk) • (contribs) 20:20, 23 August 2022 (UTC)[reply]
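The clustering idea above (rank each bare PDF URL by how many pages use it, so that one manual lookup can serve several pages) could be sketched as follows. This is an illustration only: the input format of `(page_title, url)` pairs and the function name `urls_by_usage` are assumptions, not part of any existing tool.

```python
from collections import defaultdict

def urls_by_usage(bare_refs):
    """bare_refs: iterable of (page_title, url) pairs.

    Returns a list of (url, sorted page titles), ordered by declining
    number of pages that use the URL, with ties broken alphabetically by URL.
    """
    pages_by_url = defaultdict(set)
    for page, url in bare_refs:
        pages_by_url[url].add(page)
    return sorted(
        ((url, sorted(pages)) for url, pages in pages_by_url.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )
```

A human would then supply a title once for the URL at the top of the list, and a tool could apply it to every page listed against that URL.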
The reason is simply that my regular wikitasks have been intensive, and in the last 6 weeks I have been under a lot of stress off-wiki due to business troubles which have no obvious remedy and are creating a lot of drama as we all tear our hair out. No worries, it happens. I hope you are able to resolve it.
Also believe it or not, some (like 10-25%) of the PDFs do actually come with a title in the metadata. Example URL is here. If we can create a tool to extract this title and fill those ones quickly, it would be a great start.
Ideally we would have a wee button beside each bare URL ref, which popped up a window with a form to fill in and below that a scrolling zoomable panel of the linked page. But creating that is way beyond my programming skill, and maybe beyond what is possible with the Wikimedia interface. I
But but but ... isn't there a problem with verifiability? I actually thought of this. Something with Toolforge could work, where people login like they do with IABot or something. And the edits go under their account. Or it's a request system where everyone has a special password for the special site so people will know which title was filled by who.
How about analysing the bare URL PDFs, and listing them by declining number of uses for each URL? Yes, but I think the refs that are shared are about 3-4 pages. Rlink2 (talk) 11:05, 26 August 2022 (UTC)[reply]
@Rlink2: be very wary of those titles in the metadata. I did some trials with them, and found a low correlation rate with the actual displayed title. WP:Reflinks does grab them, which I was conscious of when you created {{Bare URL PDF}}, and it was one of my reasons for hesitancy about that tag ... but when I did my sample-checking, I decided that it is such potluck as to whether the titles are any use that it is better to tag and thereby block Reflinks from doing that.
I like your idea of a tool hosted on Toolforge. Way beyond my skillset, but if you could create it, that would be brilliant. All you would then need to do is to persuade enough ppl to use it, which might be the hardest part.
I am not so pessimistic about the clustering of PDF refs: I think that there are some cases with much wider use. I have a similar analysis toolset which can be easily modified for this task, and will do some analysis over the weekend.
That similar tool was actually sparked by our discussion here. I got it to grab all the untagged bare URLs, and count the hits for each domain in descending order (glory be to Perl hashes!). I have been working down from the top of that list, and it's very productive: the tally for my search for all untagged bare URLs fell below 50K earlier today. That is way beyond the stretch goals I had in mind last autumn. BrownHairedGirl (talk) • (contribs) 11:23, 26 August 2022 (UTC)[reply]
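The per-domain tally described above was done with Perl hashes; the same idea can be sketched in Python with `collections.Counter`. The function name `domain_hits` and the choice to strip a leading `www.` are assumptions made for this illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

def domain_hits(urls):
    """Count bare URLs per host and return (host, count) pairs, most hits first."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
        if not host:
            continue
        if host.startswith("www."):
            host = host[4:]  # assumption: treat www.example.org as example.org
        counts[host] += 1
    return counts.most_common()
```

Working down such a list from the top means each domain-specific fix clears the largest possible number of refs first.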

Alerts


Hi there BHG, I see that you have been keeping busy this summer. Your great work is causing my alerts to light up like a Christmas tree. I know that I did a silly thing by adding <!--ACTUAL ARTICLE TITLE BELONGS HERE! --> to many articles, but the message you intend has been received loud and clear. I would really appreciate it if you stopped flagging me each time you put it right, because I would risk not seeing alerts for other things that need my attention. Regards, -- Ohc revolution of our times 18:19, 13 August 2022 (UTC)[reply]

Hi @Ohconfucius, AFAICR I have not done any removals of those comments in the last week or two, so I dunno why you are still getting alerts. (I just checked: the last such edit was on 17 July).
I included the pings solely so that you could verify my reverts.
But really, wouldn't it have been much better for you to clean up your own error? You were very nice about it when I raised the issue, and you promptly stopped, but it really should not be left to me to fix your errors.
And from my side it doesn't feel great for you to return from your holidays and object to my courtesy pings as I cleaned up your mistake. Please could you fix the remaining 53 articles? 31 of them just have the comment <!--ACTUAL ARTICLE TITLE BELONGS HERE! -->; the others contain a title which should be restored. BrownHairedGirl (talk) • (contribs) 18:41, 13 August 2022 (UTC)[reply]
Hi BHG, Haha, I probably got the same surprise as you, but on a much larger scale – to come back from my holidays to find a huge number of alerts in my in-box, which may explain my reaction. Your 187 corrections, complete with pings, were all done with amazing efficiency on the same day! You said that I was very nice when you raised the issue of my silly mistake, which to me implies you find my message (above) the opposite of very nice. Of course I didn't mean to leave it to you alone to fix my mistakes. I adjusted my script to reverse the mistake and I have yet to run it on an article with the error. Let's not forget Wikipedia is a collaborative project, and Rome wasn't built in a day. Anyway, thanks again for correcting my errors. And also, no offense taken or intended on my part. Keep up the good work, and best regards, -- Ohc revolution of our times 23:36, 15 August 2022 (UTC)[reply]
@Ohconfucius: thanks again for your kind words.
However, the fact remains that your script mangled over 200 articles before you fixed the script. I have undone most of them, but there are still 53 pages to be fixed. Please fix them! BrownHairedGirl (talk) • (contribs) 23:41, 15 August 2022 (UTC)[reply]
@Ohconfucius: I came back to check on progress, but found one still unfixed, so I fixed it.[10] BrownHairedGirl (talk) • (contribs) 20:49, 26 August 2022 (UTC)[reply]

An automated process has detected that when you recently edited Ed Sheeran, you added a link pointing to the disambiguation page Classic FM.

(Opt-out instructions.) --DPL bot (talk) 09:19, 27 August 2022 (UTC)[reply]

Fixed[11] BrownHairedGirl (talk) • (contribs) 11:17, 27 August 2022 (UTC)[reply]

Re: Your tag on Wallace's tarsier


You tagged this because User:Citation_bot was unable to do anything with the URL. However, another citation in the article was properly formed and used this same URL. Would it be possible to modify the bot so that, within a run, it remembers all the URLs, attempts to replace bare URLs if it finds another copy in a citation (perhaps with a note to the user to double check, or a flag for the user of some sort), and if there are any bare URLs left unmatched, put the Bare URL tag at the top of the article? Just pondering possibilities... Thanks! - UtherSRG (talk) 18:37, 29 August 2022 (UTC)[reply]
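The suggestion above (remember the URLs seen in filled citations, reuse them for matching bare URLs, and report what is left) might look roughly like this. It is a sketch under simplifying assumptions: only non-nested `{{cite ...}}` templates and plain `<ref>URL</ref>` refs are recognised, the name `reuse_filled_citations` is hypothetical, and it does not reflect how Citation bot is implemented.

```python
import re

CITE_RE = re.compile(r"\{\{\s*cite\s+\w+[^{}]*\}\}", re.IGNORECASE)
# Captures the value of |url= (not |archive-url=) up to the next parameter.
URL_PARAM_RE = re.compile(r"\|\s*url\s*=\s*(\S+?)\s*(?=\||\}\})")
HAS_TITLE_RE = re.compile(r"\|\s*title\s*=\s*[^|}\s]")
BARE_REF_RE = re.compile(
    r"<ref>\s*\[?(https?://[^\s\]<]+)\]?\s*</ref>", re.IGNORECASE
)

def reuse_filled_citations(wikitext):
    """Replace bare-URL refs with an existing titled citation of the same URL.

    Returns (new_wikitext, urls_still_bare).
    """
    filled = {}
    for match in CITE_RE.finditer(wikitext):
        template = match.group(0)
        url = URL_PARAM_RE.search(template)
        if url and HAS_TITLE_RE.search(template):
            filled.setdefault(url.group(1), template)  # keep the first copy

    still_bare = []

    def replace(match):
        url = match.group(1)
        if url in filled:
            return "<ref>" + filled[url] + "</ref>"
        still_bare.append(url)
        return match.group(0)

    return BARE_REF_RE.sub(replace, wikitext), still_bare
```

The returned `urls_still_bare` list corresponds to the "left unmatched" case in the proposal; whether to tag the article for those, and how to flag reused citations for human checking, is left open here.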

@UtherSRG: thanks for your work[12] cleaning up that article. It's good to have the ref sorted.
As to @Citation bot's capabilities, proposals should be made at User talk:Citation bot. It is not my bot, and not my decision.
However, I personally think it is unlikely that your suggestions would be implemented:
  • Merging duplicate citations sounds straightforward, but in practice there are a lot of edge cases. After a year of working full-time on bare URLs, I have seen enough permutations that I wouldn't trust a bot to handle them all gracefully. And with the very high volume of edits which Citation bot makes, its level of controversial edits needs to be a tiny fraction of a percent.
    Citation bot has made 225,000 edits so far this month (see http://en.wikiscan.org/user/Citation_bot), so if even 0.01% of its edits are controversial, that's a drama fest 20 times a month.
  • Adding or removing the {{Cleanup bare URLs}} banner should be straightforward and simple, but sadly it isn't. An aggressive bully on BAG has some offbeat ideas about that, and has created some ugly drama about it, abusing his position on BAG to block even the simplest actions. So I add and remove the banners as an AWB job, without a bot flag, which was the cause of some recent drama at WP:VPM.
    Citation bot's maintainer would be well-advised to stay clear of the ugly dramas created about those banners.
Hope that helps. BrownHairedGirl (talk) • (contribs) 20:27, 29 August 2022 (UTC)[reply]
Ah! Thanks, I should have known both economies of scales and politics would be involved. :D Cheers! - UtherSRG (talk) 20:54, 29 August 2022 (UTC)[reply]
You're welcome, @UtherSRG. And yeah, that's the way it goes.
Two tools may help if you tackle more bare URLs:
BrownHairedGirl (talk) • (contribs) 21:43, 29 August 2022 (UTC)[reply]

Eugene Chigot citation bot


Hi Brown haired girl, There was virtually no information on Eugene Chigot written in English so I spent many hours translating from French relevant sources. I managed to get an article accepted onto Wikipedia, and the article has been praised, which is gratifying. The depressing banner that appears on the article refers to a use of a citation bot. I have now read the information and tried to apply said bot, but I'm not certain that I have understood the process and I may need help. I have reviewed some of the links and replaced where appropriate. Please could you let me know if I have followed the correct procedures. I don't want to be a Wikipedia recidivist. Dorkinglad (talk) 11:05, 30 August 2022 (UTC)[reply]

(watching:) Dorkinglad, the tag is about "bare url", meaning: just a url without a title. I see that most references in the article look good, and it concerns only #1, #23, #24. Please fix this little issue and remove the tag. BHG, I don't know how you detect the bare urls, - if you have a chance at seeing what it is, and see that it's from IMSLP, it might be easier to just write the proper template than a tag. I fixed some in articles just on my watchlist, finding a large tag for such a minor thing annoying. Normally, users use proper referencing, but it happens that a newcomer adds a line with a bare url, and nobody notices. --Gerda Arendt (talk) 11:34, 30 August 2022 (UTC)[reply]
@Gerda Arendt, it was tagged semi-automatically with AWB (summary:"Add banner {{Cleanup bare URLs}}. After at least 7 passes by @Citation bot since 20220821, this article still has 1 untagged bare URL ref"). See WP:BAREURLS for why these are bad. — Qwerfjkltalk 11:49, 30 August 2022 (UTC)[reply]
@Gerda Arendt: I detect these bare URLs using regex scans of a database dump. I then make huge lists which I feed to Citation bot, and I check progress using a similar regex in AWB. The residue now gets the {{Cleanup bare URLs}} banner.
Filling in the cite template takes me at least 100 times longer than adding the {{Cleanup bare URLs}} banner BrownHairedGirl (talk) • (contribs) 11:53, 30 August 2022 (UTC)[reply]
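To illustrate what a regex scan for bare URL refs can look like, here is a minimal sketch in Python. The pattern is an assumption written for this example (it is not the regex used in the scans described above) and it deliberately covers only the simplest forms: `<ref>URL</ref>` and `<ref>[URL]</ref>`, with an optional `name=` attribute.

```python
import re

BARE_URL_REF_RE = re.compile(
    r"<ref(?:\s+name\s*=\s*[^>/]+)?\s*>"   # <ref> or <ref name="...">
    r"\s*\[?\s*(https?://[^\s\]<|]+)"      # the URL, optionally bracketed
    r"\s*\]?\s*</ref>",                    # nothing else before </ref>
    re.IGNORECASE,
)

def find_bare_url_refs(wikitext):
    """Return the URLs of refs that consist of nothing but a URL."""
    return BARE_URL_REF_RE.findall(wikitext)
```

A ref such as `<ref>[http://example.net/ok A title]</ref>` is not matched, because link text follows the URL; neither is a ref that already uses a `{{cite web}}` template.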
Understand. So I'll keep following manually. Fixing an IMSLP link takes me less than a minute, and an article looks so much more trustworthy without a banner on top, caused often by a single overlooked instance of negligence. --Gerda Arendt (talk) 12:30, 30 August 2022 (UTC)[reply]
@Gerda Arendt: I use inline tags where possible. However, in most cases they prevent WP:Reflinks from filling a ref, so I use them only where I know that Reflinks cannot fill the ref.
And yes, negligent edits do make an article untrustworthy, so it's good to fix them.
Great too that you fill them in under a minute. I tag 20 articles per minute, over 1200 per hour, so filling them needs more people.
BTW, I just did a search for bare URL refs to imslp.org. Only one, as of now. BrownHairedGirl (talk) • (contribs) 12:39, 30 August 2022 (UTC)[reply]

Thank you to those that have replied so swiftly to my request for help. I will endeavour to understand the procedure for Cleanup. It would certainly have helped with the article for Chigot and others that I have completed. I am in the process of reviewing the links in the article: #1 has been changed, as have two others. I will look at 23 and 24 forthwith. Thank you for your help--Dorkinglad (talk) 12:32, 30 August 2022 (UTC)[reply]

@Dorkinglad: for advice on how to fill refs, see WP:HOWTOCITE and {{cite web}}. BrownHairedGirl (talk) • (contribs) 12:41, 30 August 2022 (UTC)[reply]
Dorkinglad, I didn't see this and just added a title to the 3 refs in question. You can do further work for improvements: add a date to citations, or at least a date when retrieved for web sources, and if in a foreign language, then that as well. When an author is known, that's valuable information, especially when the author has an article. Consider using {{cite web}} instead of writing refs manually. --Gerda Arendt (talk) 12:45, 30 August 2022 (UTC)[reply]
@Dorkinglad, I would put this a lot more strongly than @Gerda did.
My request is: please please please please please please please please please please please please please please please please please please please ALWAYS use a citation template, and fill it out as completely as possible.
Proper referencing is not a niggle or a tweak or an embellishment or an adornment. On the contrary, references are the most important part of any edit, because well-formed citations are absolutely crucial to Wikipedia's core content policy: WP:Verifiability. BrownHairedGirl (talk) • (contribs) 12:54, 30 August 2022 (UTC)[reply]
About times: Each edit took about 3 minutes, having to find the title, editing, preview, adjust, publish. For IMSLP however, the title is in the url, much easier and no preview necessary. I fixed two today, but if only one is left right now, no reason to think about a bot converting. --Gerda Arendt (talk) 12:51, 30 August 2022 (UTC)[reply]
I looked at three more on my watchlist. For two of them, referencing was fine except one bare URL. In one case, Kaspar (play), I was helpless, because the bare URL is in Farsi and I have no way of telling what the title is and if it makes sense. --Gerda Arendt (talk) 13:16, 30 August 2022 (UTC)[reply]
@Gerda Arendt: that illustrates one of the many reasons why a ref is best filled by the editor who adds it.
I AGF that the editor who added that Farsi-language ref is competent in that language, and knew what they were adding. If so, they would have been able to fill it and add the |trans-title=, whereas most other editors will struggle.
I just filled it[13] using Google translate, as I have done with many hundreds of other non-English refs ... but I have no ability to verify the translation. BrownHairedGirl (talk) • (contribs) 13:49, 30 August 2022 (UTC)[reply]

Fixing dead URLs


Hi, I know you spend a lot of effort fixing URLs. You might be interested in User:FABLEBot/New URLs for permanently dead external links. — Qwerfjkltalk 11:51, 30 August 2022 (UTC)[reply]

Thanks @Qwerfjkl. I saw a notice of it on a bots page, where I was responding to an outburst of Headbomb's campaign to disrupt my work. :(
It's an interesting idea, which I think will have some use in fixing links to sites which have changed their structure in a systematic way, and which have retained the old content without deploying proper redirects.
However, I think that even if it was extended to currently untagged bare URLs, it would make only a small dent in the tally of remaining bare URLs, cos most of the dead ones are a different form of dead. BrownHairedGirl (talk) • (contribs) 12:00, 30 August 2022 (UTC)[reply]
"I saw a notice of it on a bots page": that was me, posting at BOTN. — Qwerfjkltalk 12:11, 30 August 2022 (UTC)[reply]
Ah, sorry @Qwerfjkl. I had forgotten who posted it.
I do think that this idea has potential, so I hope you pursue it. BrownHairedGirl (talk) • (contribs) 12:20, 30 August 2022 (UTC)[reply]
Given that there are only 105 cases to review, and I've already done 20, I could probably do the rest by myself, but it'd be nice to have someone else look at these. — Qwerfjkltalk 13:52, 30 August 2022 (UTC)[reply]
In spirit, I'd like to help you, @Qwerfjkl. But I am already working 15 hour days on the bare URLs, and after over 13 months of work on it I am trying to keep focused on that task 'cos I want to wrap up that backlog before the end of the year at the latest.
Right now, I have a few productive seams to mine before the next database dump on Thursday. BrownHairedGirl (talk) • (contribs) 13:56, 30 August 2022 (UTC)[reply]

bare URL banners


Is it really necessary to put the bare URL banner at the top of articles? They would be just as effective down below, and less disruptive.

Article-topping banners are distracting visually, especially on mobile devices, and more importantly are off-putting to new users of wikipedia: "What does this mean? I don't know but I guess I won't trust this article"

Since this banner does not point to possible errors in the text that readers should be aware of, merely to a markup issue that might affect sources, they don't need to be placed so high up, IMHO. - DavidWBrooks (talk) 14:29, 30 August 2022 (UTC)[reply]

@DavidWBrooks: the location of the {{Cleanup bare URLs}} banner was recently discussed at VPM, and opinion was divided on where to place it: some editors agreed with your view, but more did not. But since many crucial tools (e.g. WP:TWINKLE, WP:AWB) are set up to handle banners at the top of the page, I will continue to do that unless and until there is consensus to place it somewhere else.
I strongly disagree with your assertion that this is merely a markup issue. WP:Verifiability is a core policy, and poor citations impede verification in the short term, and may wreck it in the medium-to-long term through WP:LINKROT. See WP:Bare URLs.
And of course, as with any cleanup banner, the best remedy is to fix the problem and remove the banner. BrownHairedGirl (talk) • (contribs) 14:44, 30 August 2022 (UTC)[reply]
Thanks for the response. I still think this placement is an example of how those of us who've been on wikipedia for a very long time underestimate the confusion caused by article-topping banners to most users. But so be it! - DavidWBrooks (talk) 18:11, 30 August 2022 (UTC)[reply]
Maybe, @DavidWBrooks. But I think we will have to disagree on that.
In my view, it is very unfortunate that some parts of Wikipedia culture think that bare minimum citations are acceptable contributions to an encyclopedia. They are very uninformative, and prone to rot.
The community as a whole should be much more proactive about teaching editors how to construct proper citations, so that any puzzlement about this issue is temporary.
And more broadly, all cleanup tags and banners are part of Wikipedia being a work in progress, and part of being transparent about unresolved issues. That openness is sadly missing from most media, and while I can see how it may be challenging, it is something that new editors should be encouraged to embrace and welcome. BrownHairedGirl (talk) • (contribs) 18:20, 30 August 2022 (UTC)[reply]

Women in Red in September 2022

Women in Red September 2022, Vol 8, Issue 9, Nos 214, 217, 240, 241



--Lajmmoore (talk) 15:34, 31 August 2022 (UTC) via MassMessaging[reply]

The Signpost: 31 August 2022


Administrators' newsletter – September 2022


News and updates for administrators from the past month (August 2022).

Guideline and policy news

  • A discussion is open to define a process by which Vector 2022 can be made the default for all users.
  • An RfC is open to gain consensus on whether Fox News is reliable for science and politics.

Technical news

Arbitration

  • An arbitration case regarding Conduct in deletion-related editing has been closed. The Arbitration Committee passed a remedy as part of the final decision to create a request for comment (RfC) on how to handle mass nominations at Articles for Deletion (AfD).
  • The arbitration case request Jonathunder has been automatically closed after a 6 month suspension of the case.

Miscellaneous

  • The new pages patrol (NPP) team has prepared an appeal to the Wikimedia Foundation (WMF) for assistance with addressing Page Curation bugs and requested features. You are encouraged to read the open letter before it is sent, and if you support it, consider signing it. It is not a discussion, just a signature will suffice.
  • Voting for candidates for the Wikimedia Board of Trustees is open until 6 September.

Leatherface Edit


Hello @BrownHairedGirl: Saw your edit on the Leatherface article, and want to thank you for your contribution. Just so you know those issues, among the vast amount of others in the article, are currently being fixed and completed in a separate userspace of mine. Been working on it on and off for several years now and slowly but surely reaching completion so it can all be transferred into the main article. Just wanted to let you know this so there is no confusion or anything. Anyways best of luck with editing!! Paleface Jack (talk) 19:22, 1 September 2022 (UTC)[reply]

Thanks, @Paleface Jack. BrownHairedGirl (talk) • (contribs) 19:25, 1 September 2022 (UTC)[reply]
Np. Feel free to look that draft over if you like or offer suggestions. Paleface Jack (talk) 19:28, 1 September 2022 (UTC)[reply]

Zotero only


Point me to a page of pages to run and I will beta test my new code. AManWithNoPlan (talk) 13:21, 14 August 2022 (UTC)[reply]

@AManWithNoPlan: I have a list of ~40,000 pages which is almost ready to roll. How many pages do you want? BrownHairedGirl (talk) • (contribs) 13:26, 14 August 2022 (UTC)[reply]
Let's start with 10K. AManWithNoPlan (talk) 13:29, 14 August 2022 (UTC)[reply]
OK. Gimme 5 minutes to get it ready. BrownHairedGirl (talk) • (contribs) 13:30, 14 August 2022 (UTC)[reply]
should it skip urls that have titles but no dates? by default we look for dates? AManWithNoPlan (talk) 13:37, 14 August 2022 (UTC)[reply]
@AManWithNoPlan: the list is ready, at User:BrownHairedGirl/sandbox941.
The whole list was 39,525 pages, and this 10K is a random selection of that set, alpha-sorted.
I am not sure what you mean by "skip urls that have titles but no dates".
Is that about
  1. bare URLs where the bot can get a title but not a date?
  2. Partially-filled URLs where there is a title but no date? BrownHairedGirl (talk) • (contribs) 13:45, 14 August 2022 (UTC)[reply]
Should it skip already partially filled in URLs. AManWithNoPlan (talk) 13:49, 14 August 2022 (UTC)[reply]
@AManWithNoPlan: yes please, skip partially-filled -- at least for now.
I would like this job to concentrate on fixing the completely bare, rather than on improving existing templates. BrownHairedGirl (talk) • (contribs) 13:52, 14 August 2022 (UTC)[reply]

@AManWithNoPlan: it's looking good so far, with 200 pages processed in 20 minutes. That's about 600/hour, which is probably 4 or 5 times as fast as usual.

This is a very useful speed-up, but is less than I had expected. Is it definitely doing only completely bare URLs? --BrownHairedGirl (talk) • (contribs) 14:33, 14 August 2022 (UTC)[reply]

The still-bare ones are the ones that often don't resolve etc., which are the slow ones. AManWithNoPlan (talk) 19:13, 14 August 2022 (UTC)[reply]
I have learned a bunch. Done some fixes and I am restarting from the top. AManWithNoPlan (talk) 21:04, 14 August 2022 (UTC)[reply]
Many many thanks for all your work on this, @AManWithNoPlan. Even this first pass has been very productive, increasing throughput by several hundred percent. If something like this eventually goes live, it will make for a big speedup in my work and a big load reduction on Citation bot's resources.
Now that I think about it, I know what you mean about the non-resolves being slow. Back in February & March, I did a series of huge trawls through what was then a set of about 250K articles with bare URLs. In the first round I tagged the URLs that gave a clear HTTP 404 or 410 error, and then I went through those that had timed out. I used a VPN to spoof my location in 10 different cities dotted around the globe, each time keeping only the URLs that timed out. Even with ten lookup jobs running simultaneously, waiting for each timeout took forever: the whole job took about ten days of 24/7, but it did allow me to tag about 80K bare URLs as dead.
One of my current manual tasks is working through a set of ~350 bare gov.in URLs, which are a horrible mix of slow responses and timeouts. Progress is glacially slow. BrownHairedGirl (talk) • (contribs) 21:31, 14 August 2022 (UTC)[reply]
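The two-stage check described above (clear 404/410 first, then timeouts) can be sketched roughly as follows. This is only an illustration, not the tooling actually used: the function names and the "live"/"dead"/"timeout"/"unclear" labels are hypothetical, and the 45-second default is taken from the timeout mentioned elsewhere in this thread.

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 45  # assumption: the 45-second timeout discussed in this thread


def classify_status(status):
    """Map an HTTP status code to a coarse label (labels are hypothetical)."""
    if status in (404, 410):
        return "dead"      # clear "not found" / "gone" responses
    if 200 <= status < 300:
        return "live"
    return "unclear"       # anything else needs a human or a retry


def check_url(url, timeout=TIMEOUT_SECONDS):
    """Fetch a URL once and return 'live', 'dead', 'timeout' or 'unclear'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as HTTPError, which carries the status code
        return classify_status(err.code)
    except (socket.timeout, TimeoutError):
        return "timeout"
    except (urllib.error.URLError, OSError) as err:
        # urllib sometimes wraps a timeout inside URLError.reason
        reason = getattr(err, "reason", None)
        if isinstance(reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "unclear"
```

A URL would only be treated as effectively dead after repeated "timeout" results from several locations, as described above; a single timeout proves little.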

How is this going, @AManWithNoPlan? The bot was making great progress through the 10,000 articles, with a significant fill rate, which is great.

But I see that the last edit was over 3 hours ago: this one[14] at 12:32 UTC, #6342/10000. Is it paused for more code development? --BrownHairedGirl (talk) • (contribs) 18:09, 15 August 2022 (UTC)[reply]

I did have to stop it. It is now going again. AManWithNoPlan (talk) 19:31, 15 August 2022 (UTC)[reply]
That's great. Thanks again, @AManWithNoPlan.
Do you reckon it's getting near a stable version? BrownHairedGirl (talk) • (contribs) 19:34, 15 August 2022 (UTC)[reply]
I think this is stable. This is a "last resort" mode, in that it waits for 45 seconds on Zotero before giving up. On the other hand, it gives up on DOIs really fast. AManWithNoPlan (talk) 21:43, 15 August 2022 (UTC)[reply]
Hey, @AManWithNoPlan, that's great. A 45-sec timeout sounds ideal for this task: it's exactly what I settled on when I was doing my mass wait-for-timeout runs when I was looking for URLs that were effectively dead.
Have you had any thoughts about making some way in which I could access this mode without bugging you? BrownHairedGirl (talk) • (contribs) 21:56, 15 August 2022 (UTC)[reply]
Most errors are like this:
  !Operation timed out after 45001 milliseconds with 0 bytes received   For URL: http://www.defence.gov.au/defencenews/stories/2013/aug/0806.htm
  >Could not resolve URL http://www.burnslakeband.ca/#!/business/elections/2016/by-election/
  >Retrieved info from http://eletmod50.com/a-misinatol-a-tubesig-a-teljes-pecsi-panorama/   (note that this then did nothing, meaning that the title was empty, but "found")
  >Received invalid title data for URL https://www.tabroom.com/index/tourn/results/round_results.mhtml?tourn_id=8965&round_id=330192: Tabroom.com. (note that title is just website).
  !Did not get a title for URL http://www.tarnetar.com/history.htm: {"type":"https://mediawiki.org/wiki/HyperSwitch/errors/unknown_error","method":"get","uri":"/wiki.riteme.site/v1/data/citation/zotero/http%3A%2F%2Fwww.tarnetar.com%2Fhistory.htm"}
  >Received invalid title data for URL http://www.hikespeak.com/trails/tri-peaks/: Tri Peaks Trail | Malibu | Hikespeak.com  (this is a bit picky, since it sees the website and ignores the title.  This is because these titles are usually junk).

AManWithNoPlan (talk) 22:08, 15 August 2022 (UTC)[reply]
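For anyone wanting to tally these outcomes across a long run, a minimal sketch follows. It assumes the log lines look exactly like the samples quoted above; the category labels are made up for illustration and are not part of Citation bot.

```python
import re
from collections import Counter

# Patterns are derived only from the sample lines quoted above.
PATTERNS = [
    ("timeout", re.compile(r"Operation timed out after \d+ milliseconds")),
    ("unresolved", re.compile(r"Could not resolve URL")),
    ("invalid_title", re.compile(r"Received invalid title data for URL")),
    ("no_title", re.compile(r"Did not get a title for URL")),
    ("retrieved", re.compile(r"Retrieved info from")),
]


def classify_log_line(line):
    """Return a hypothetical category label for one bot log line."""
    text = line.strip().lstrip("!>").strip()  # drop the leading ! or > marker
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "other"


def tally(lines):
    """Count how many lines fall into each category."""
    return Counter(classify_log_line(line) for line in lines)
```

Calling `tally(log_text.splitlines())` on a saved log would then show which failure mode dominates.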

If you do a linked page run and do not select slow mode, and the page name/path includes "ZOTERO" in it, then it should do this mode. https://github.com/ms609/citation-bot/commit/ce67de308382d61d2702e66cab10d8a92cc7b9b1 AManWithNoPlan (talk) 22:14, 15 August 2022 (UTC)[reply]
Thanks, @AManWithNoPlan. Is there a size limit?
I will try that, but linked page runs have not worked for me since the big expansion in capacity a month or two back. I will try this and report back. BrownHairedGirl (talk) • (contribs) 22:25, 15 August 2022 (UTC)[reply]
@AManWithNoPlan: I tried it with User:BrownHairedGirl/no ZOTERO - Remainder from 20220801 - Group B not otherwise processed - part 1
As I feared, the bot interface responded with !No links to expand found
Same response from multiple browsers.
When we discussed this before, you said that the same page worked for you. I dunno what the problem is with my requests, but that is why I gave up using Linked pages.
So if I am going to be able to use the wonderful new mode which you have created, I need to be able to access it using pipe-separated list format. BrownHairedGirl (talk) • (contribs) 01:16, 16 August 2022 (UTC)[reply]
Did you accidentally enter "https://wiki.riteme.site/wiki/User:BrownHairedGirl/no_ZOTERO_-_Remainder_from_20220801_-_Group_B_not_otherwise_processed_-_part_1" instead of "User:BrownHairedGirl/no_ZOTERO_-_Remainder_from_20220801_-_Group_B_not_otherwise_processed_-_part_1" on the webform? AManWithNoPlan (talk) 01:26, 16 August 2022 (UTC)[reply]
@AManWithNoPlan: no, definitely not the URL. Just the name as copied from the heading: User:BrownHairedGirl/no ZOTERO - Remainder from 20220801 - Group B not otherwise processed - part 1
This is exactly what I did every one of the many hundred times I used the bot for nearly a year, until it stopped working when the bot was upgraded. BrownHairedGirl (talk) • (contribs) 01:41, 16 August 2022 (UTC)[reply]
PS When linked pages started failing for me, I tried everything. Multiple browsers, different list lengths, removing from the list page all text and links other than the list. I logged in and out multiple times, and I tried 3 browsers on my other PC, but still no go. All still gave me the same response: !No links to expand found
It's a pity, because I liked the transparency of using linked lists, where I can publicly document how the list was made and any interested editor can see what I am doing. But when a fix was not forthcoming (for the very understandable reason that you could not replicate the problem!), I just used piped lists as a workaround. BrownHairedGirl (talk) • (contribs) 02:01, 16 August 2022 (UTC)[reply]

I hate saying this but "It works for me". So, please try again. I have added some debug logging. AManWithNoPlan (talk) 11:34, 16 August 2022 (UTC)[reply]

@AManWithNoPlan: whatever you have done, that seemed to work on first attempt, just after 11:42 UTC.
I will see if the job shows up in the contribs. It has 3701 pages, which should be a unique number.
And yes, it is working: see e.g. [15]
Supercalifragilisticexpialidocious! Thank you thank you. BrownHairedGirl (talk) • (contribs) 11:46, 16 August 2022 (UTC)[reply]
@AManWithNoPlan: it seems that I spoke too soon.
The bot crashed a few hours ago, and all jobs were dropped. Now linked list doesn't work again. BrownHairedGirl (talk) • (contribs) 19:25, 16 August 2022 (UTC)[reply]
Interesting that the Bot got rebooted. Nothing in the logs other than the bot starting up. Looks like it was done by the toolforge masters. Can you enter the page name again on the webform, unselect the "Thorough mode" button and then click on the "Process Pages Linked From" button and let me know what the error is. I will watch the logs. AManWithNoPlan (talk) 20:03, 16 August 2022 (UTC)[reply]
I have removed the need to uncheck the box with better code order. Also, the page API now accepts ZOTERO_ONLY as your first page in a pipe separated list as the magic flag. AManWithNoPlan (talk) 20:11, 16 August 2022 (UTC)[reply]
@AManWithNoPlan: Thanks again for your work on this.
I have just submitted a piped list with 2574 entries, including the "ZOTERO_ONLY" first item.
And with that I am right out of batches, until the morning. BrownHairedGirl (talk) • (contribs) 20:19, 16 August 2022 (UTC)[reply]
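The piped-list format described above is simple to generate. The helper below is a hypothetical sketch: only the `ZOTERO_ONLY` first item and the pipe separator come from this discussion; the function name and the validation are assumptions.

```python
def build_piped_list(page_titles, zotero_only=True):
    """Join page titles with '|' and optionally prepend the ZOTERO_ONLY flag."""
    titles = [title.strip() for title in page_titles if title.strip()]
    for title in titles:
        if "|" in title:
            # A pipe inside a title would be read as a separator
            raise ValueError(f"title contains a pipe character: {title!r}")
    if zotero_only:
        titles = ["ZOTERO_ONLY"] + titles
    return "|".join(titles)
```

For example, `build_piped_list(["Boris Johnson", "Kaspar (play)"])` gives `ZOTERO_ONLY|Boris Johnson|Kaspar (play)`, ready to paste into the web form.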
OK, three morning batches of 3,2xx pages each (i.e. 3222, 3425, 3429), all submitted as piped lists with ZOTERO_ONLY.
And the bot is chomping through them as if it had afterburners. This is really great work, @AManWithNoPlan! I get my jobs done faster, and the bot's resources get less hammered.
To get a glimpse, see the bot's edits just now, and do a Ctl-F in-page search for /32 BrownHairedGirl (talk) • (contribs) 08:19, 17 August 2022 (UTC)[reply]

Zotero-only goodness


@AManWithNoPlan: I have now had about 12 hours of the Zotero-only mode running on multiple lists. And I have another piece of learning: that longer timeout makes it way more thorough.

I have a list of "list of foo" articles which I have been running through the bot. I started out with 1,022 articles, and in the first 4 passes the bot filled only 2 refs. Out of frustration, I manually filled a few. So for the fifth pass, the list was down to 1,010 pages.

Right now the contribs list shows that the bot has processed 181/1010 pages, and filled 8 of them. If it sustains that average, that will be 45 out of the full set, on a batch where most of the articles had already been processed several times recently.
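The projection above is a straight-line extrapolation from the pages processed so far. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def projected_fills(filled, processed, total):
    """Extrapolate the number of filled pages if the current rate holds."""
    return round(filled / processed * total)


# 8 filled out of 181 processed, scaled up to the 1,010-page list
print(projected_fills(8, 181, 1010))  # prints 45
```

The estimate assumes the remaining pages behave like the first 181, which an alpha-sorted list does not guarantee.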

Within ten days, I will have run the entire set of remaining bare URLs through Zotero-only mode. It will make a big dent in the backlog.

But one question: will this mode work on refs where there is a Citation template with no title? e.g. {{cite web |url=http://example.com/fubar}}.

There are 29,000 of them in Category:CS1 errors: bare URL, and my previous attempts to feed them to the bot have made only trivial progress. But with the longer timeout, zotero-only mode should do a more thorough job, if it includes them. --BrownHairedGirl (talk) • (contribs) 16:26, 17 August 2022 (UTC)[reply]

great news. yes, a template with no title will run. The special code just says "why yes, this citation is perfect as it is" when a title is present. an empty title param does not count. AManWithNoPlan (talk) 18:14, 17 August 2022 (UTC)[reply]
Thanks, @AManWithNoPlan. That's good to know. It's a usefully more sophisticated approach than my crude notions that this would be a strict bare-URLs-only mode, catching only something like <ref[^>]*?>\s*https?:[^>< \|\[\]]+\s*(\{\{bare *url *inline[^\}]*\}\} *)?<\s*/\s*ref
Would you be able to feed Category:CS1 errors: bare URL to the bot in zotero-only mode via your command line tool? Or should I chop it into 8 chunks and feed it through the web interface? BrownHairedGirl (talk) • (contribs) 18:24, 17 August 2022 (UTC)[reply]
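To show what the strict bare-URLs-only pattern quoted above does and does not catch, here is the same regex tried out in Python. The case-insensitive flag is an assumption (the tag is written `{{Bare URL inline}}` on-wiki but lower-case in the pattern), and the sample refs are invented.

```python
import re

# The pattern is copied from the comment above; IGNORECASE is an assumption.
BARE_URL_REF = re.compile(
    r"<ref[^>]*?>\s*https?:[^>< \|\[\]]+\s*"
    r"(\{\{bare *url *inline[^\}]*\}\} *)?<\s*/\s*ref",
    re.IGNORECASE,
)


def is_strictly_bare(wikitext):
    """True if the wikitext contains a ref holding nothing but a URL."""
    return BARE_URL_REF.search(wikitext) is not None


print(is_strictly_bare("<ref>http://example.com/fubar</ref>"))
# prints True: a completely bare URL
print(is_strictly_bare("<ref>{{cite web |url=http://example.com/fubar}}</ref>"))
# prints False: a title-less template is not matched by this pattern
```

The second case is exactly the kind of ref that the bot's title-based test picks up and this pattern misses.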
I can run that. AManWithNoPlan (talk) 18:58, 17 August 2022 (UTC)[reply]
Many thanks, @AManWithNoPlan. BrownHairedGirl (talk) • (contribs) 19:02, 17 August 2022 (UTC)[reply]
Check this out: https://wiki.riteme.site/w/index.php?title=%3AHello_World%21_%28composition%29&diff=prev&oldid=1104962414 I noticed by watching these jobs that Web Archive code was case-sensitive. Fixed that. AManWithNoPlan (talk) 19:51, 17 August 2022 (UTC)[reply]
@AManWithNoPlan: I read the diff and followed the link, but didn't spot the casing issue. Maybe I am daft, or just tired. Can you clarify? BrownHairedGirl (talk) • (contribs) 20:02, 17 August 2022 (UTC)[reply]
The bot source code was case specific and was not adding these links. It was incorrectly ignoring a lot of them. AManWithNoPlan (talk) 20:07, 17 August 2022 (UTC)[reply]
Ah, good catch, @AManWithNoPlan.
BTW, are you aware of the wee glitch that not all Webarchive URLs are of the form https://web.archive.org/web/YYYYMMDDhhmmss/http (e.g. in that case https://web.archive.org/web/20111012020852/http)
Some of them have extra characters after the timestamp.
I have AWB modules to extract original URLs from archived URLs, and I wondered for ages why hundreds were being skipped. So I checked the residue of a batch, and found those extra characters in a variety of forms, but mostly AFAICR _ws after the timestamp.
So now my matcher in C# is string findCiteUrlWebArchiveOrgCiteRegex = @"(?<allref>(?<preurl><ref\b[^>]*>\s*\{\{\s*(citation|citeweb|cite ([ a-z]+))(?=\s*\|)[^\}\{]*\|\s*url\s*= *)(?<archiveurl>https?://web\.archive\.org/web/(?<dateyy>[12]\d\d\d)(?<datemm>\d\d)(?<datedd>\d\d)\d{4,}[^ /<>\}\{]*/(?<origurl>https?://(?<website>[^/]+)[^ \]\[\{\}\|<>]+))(?<afterturl>[^\}\<\>\}\{]*)(?<closeref>\}\}\s*<\s*/\s*ref\s*>))";
Yeah, that's a fugly regex which could and should be simplified, but it works. My point is to note the bit that I have bolded here, to catch the random crap: https?://web\.archive\.org/web/(?<dateyy>[12]\d\d\d)(?<datemm>\d\d)(?<datedd>\d\d)\d{4,}[^ /<>\}\{]*/(?<origurl>https?:// BrownHairedGirl (talk) • (contribs) 20:37, 17 August 2022 (UTC)[reply]
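A cut-down Python version of just the archive-URL part of that matcher may make the point easier to see. It keeps the group names from the C# pattern but drops the surrounding ref and template matching; the example URLs and the `id_` suffix are illustrative assumptions.

```python
import re

WAYBACK_URL = re.compile(
    r"https?://web\.archive\.org/web/"
    r"(?P<dateyy>[12]\d\d\d)(?P<datemm>\d\d)(?P<datedd>\d\d)\d{4,}"
    r"[^ /<>\}\{]*/"  # tolerates extra characters after the timestamp
    r"(?P<origurl>https?://[^ \]\[\{\}\|<>]+)"
)


def original_url(archive_url):
    """Return the original URL embedded in a Wayback URL, or None."""
    match = WAYBACK_URL.search(archive_url)
    return match.group("origurl") if match else None


plain = "https://web.archive.org/web/20111012020852/http://example.com/page"
suffixed = "https://web.archive.org/web/20111012020852id_/http://example.com/page"
print(original_url(plain))     # prints http://example.com/page
print(original_url(suffixed))  # prints http://example.com/page
```

Without the `[^ /<>\}\{]*` segment, the second URL would be skipped, which matches the unexplained residue described above.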
The Bot code handles that case already. AManWithNoPlan (talk) 20:46, 17 August 2022 (UTC)[reply]
That's great, @AManWithNoPlan. I guessed you would probably be way ahead of me, but thought it best to mention it just in case. BrownHairedGirl (talk) • (contribs) 20:51, 17 August 2022 (UTC)[reply]
@AManWithNoPlan: the zotero-only bot is doing great work on Category:CS1 errors: bare URL. So far it has processed 114 pages, and added titles to 32 of those. Still only a small and non-random sample, but if the overall fill rate is anywhere near that 28% rate on the first pass, it will be feckin' brilliant.
Previously I was getting less than 1%. BrownHairedGirl (talk) • (contribs) 20:49, 17 August 2022 (UTC)[reply]
@AManWithNoPlan: Now 1357 pages done, which is a decent sample.
284 edits so far, which is a 21% edit rate. Still good.
But the page count of Category:CS1 errors: bare URL has only fallen from 29,264 to 29,237, which is a drop of only 27.
It seems that many of the pages have multiple missing titles, and the bot isn't often catching them all. It will probably be worth doing a few passes, and then examining a sample of the remainder in more detail. BrownHairedGirl (talk) • (contribs) 23:29, 17 August 2022 (UTC)[reply]

@AManWithNoPlan: the bot is now chomping its zotero-only path through the full set of almost 11K articles which transcluded {{Bare URL inline}}, and it is getting a great fill rate. Your work on making this happen has been really valuable, and I hugely appreciate it. Thank you! --BrownHairedGirl (talk) • (contribs) 21:51, 18 August 2022 (UTC)[reply]

@AManWithNoPlan: I see that the bot is now filling refs where a cite template has an InternetArchiveBot-style |title=Archived copy ... and that it is doing it in zotero-only mode, so it is chomping through those that it finds in my current big batch of ~20,000 articles with untagged bare URLs. Which is lots of them.
This is brilliant! Thank you!
I look fwd to a big dent in Category:CS1 maint: archived copy as title. As of now it has 201,049 pages; let's see what happens to this transcluded live counter: 53,323 BrownHairedGirl (talk) • (contribs) 16:39, 19 August 2022 (UTC)[reply]
The Archives have a better hit rate than Zotero, since the URLs there usually actually work. If you find one that does not work, and is not an archived 404 and is not a PDF, let me know. AManWithNoPlan (talk) 23:20, 19 August 2022 (UTC)[reply]
@AManWithNoPlan: the progress is great, but the flood of edits I see does not seem to be reflected in the slower fall in the page count of Category:CS1 maint: archived copy as title.
That seems to me to suggest that there is a non-trivial residue of unfillable |title=Archived copy refs, and that the bot is filling only some of the refs on each page.
However, as you say, the success rate should be high. Maybe I am overestimating the number of edits so far.
But also, even a near-100% rate of downloading the archived copy doesn't guarantee success; some of the archived copies will be non-HTML formats, and some will just be crap webpages with no evident title. And of course others will be archived 404s.
When the bot has completed a set, it will be interesting to analyse what remains and see what has actually happened.
But whatever about the remainder, it is great to see so many titles being filled. This is a long-standing problem, and your remedy is brilliant. BrownHairedGirl (talk) • (contribs) 23:35, 19 August 2022 (UTC)[reply]
Once you are done with all the "archived copy" pages, let me know. I will run a version that asks "are you sure this bad title is bad" and that will allow me to accept some bad titles. The bot is pickier than a human. AManWithNoPlan (talk) 01:14, 24 August 2022 (UTC)[reply]

Randomised trial of "title=Archived copy" filling


@AManWithNoPlan, my curiosity got the better of me again. So I have started a randomised trial of the success rate of @Citation bot's brilliant new functionality of adding a real title to replace |title=Archived copy.

I used shuf to make a randomised list of all the pages in Category:CS1 maint: archived copy as title, and from that I took 6 sets of just over 2,000 pages each. (They show up in Citation bot's contribs list as batch sizes /2002, /2005, and in increments of 3 to /2016) That's a total of a little over 12,000 pages, which is a wee bit more than a 6% sample, so I think it should be statistically valid.
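The shuffle-and-slice step above was done with shuf; an equivalent can be sketched in Python as below. The function name, the optional seed, and the example sizes are assumptions for illustration only.

```python
import random


def random_batches(pages, batch_sizes, seed=None):
    """Shuffle the page list, then cut consecutive batches of the given sizes."""
    rng = random.Random(seed)   # a fixed seed makes the sample reproducible
    shuffled = list(pages)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    batches = []
    start = 0
    for size in batch_sizes:
        batches.append(shuffled[start:start + size])
        start += size
    return batches
```

Giving each batch a distinct size, as above, makes the jobs easy to tell apart in the bot's contribs list.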

The six batches should be complete in about 4 hours (if the current rate of progress is sustained). I will then grab an up-to-date listing of Category:CS1 maint: archived copy as title and compare my work list against that, to see what remains. If you want copies of these lists, just say so, and I can email them.

If the size of the residue list is non-trivial, I will shuf it to allow randomised checking of the residue. I will also run it through a quick-and-dirty AWB Custom Modules to test what percentage of the residue has a clearly non-HTML file extension (PDF, JPG, PNG, XLS, DOC etc).
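The file-extension test mentioned above was planned as an AWB custom module; the Python sketch below only illustrates the idea. The extension set contains just the formats named above, and the function name is hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# Only the formats named in the comment above; a real list would be longer.
NON_HTML_EXTENSIONS = {".pdf", ".jpg", ".png", ".xls", ".doc"}


def has_non_html_extension(url):
    """True if the URL path ends in a clearly non-HTML file extension."""
    path = urlparse(url).path   # ignores the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension in NON_HTML_EXTENSIONS
```

Because it looks only at the end of the path, this also works on Wayback URLs, where the original URL is embedded in the archive URL's path.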

If the manual sampling finds cases of archived error pages such as HTTP 404, 410, 3XX, 4XX, 5XX etc, I will try to rustle up a wee Perl script to identify them. (Do you have any code fragments that might help me to do that?). I am pretty sure that there will be at least some of those, which raises the question of what to do with them. So calling @GreenC, who is the expert on archive.org: is there any existing procedure for flagging such dead archives? In all these cases, they will be within an existing cite template, so I am thinking that there probably should be a parameter along the lines of |archive-dead=yes, but I have not found anything close to that in {{cite web}}. I guess that you may already be tackling this issue, so do you have any thoughts on how to handle these cases? --BrownHairedGirl (talk) • (contribs) 10:50, 20 August 2022 (UTC)[reply]

  • Initial results. Citation bot has finished processing all 12,051 articles in my randomised set.
Of those 12,051:
So that's 63.5% of the articles no longer have |title=Archived copy.
Excellent work, @AManWithNoPlan. I will try to start analysing the remainder later today. --BrownHairedGirl (talk) • (contribs) 14:49, 20 August 2022 (UTC)[reply]
If you find archive.org URLs that are soft-404s (determined via the title string) make a list and send to me. They'll need to be removed from the IABot database which will affect 300+ wikis. Include the article title if possible; that way I can also process via WaybackMedic which will hunt for alternative providers and remove/replace the existing archive url. If it can't find an alternative it will remove the archive URL and add a {{dead link}}. -- GreenC 15:38, 20 August 2022 (UTC)[reply]
Thanks, @GreenC. Will do! BrownHairedGirl (talk) • (contribs) 15:39, 20 August 2022 (UTC)[reply]
Now doing manual run. This will take days (months???) since I have to approve all the edits. About one-fourth of the titles are bad, so I cannot lower the bar. AManWithNoPlan (talk) 23:41, 28 August 2022 (UTC)[reply]
@AManWithNoPlan: manual run on ~70K pages is a massive task.
Wouldn't it be better to ask for volunteers to share the load? BrownHairedGirl (talk) • (contribs) 23:44, 28 August 2022 (UTC)[reply]
It requires that the person have the bot access keys. AManWithNoPlan (talk) 23:06, 2 September 2022 (UTC)[reply]
Ah, I see.
But still, @AManWithNoPlan, I wish there was some way of sharing the load. You work v hard on other aspects of the bot, and this is a herculean addition to your workload. BrownHairedGirl (talk) • (contribs) 23:39, 2 September 2022 (UTC)[reply]

Precious anniversary

Precious
Nine years!

--Gerda Arendt (talk) 07:38, 3 September 2022 (UTC)[reply]

Popcorn moment, #728590304516


I am enjoying the absurdity of WP:Articles for deletion/Death of Mikhail Gorbachev.

The massively-reported, widely analysed death of one of the most notable people of the 20th century is not notable, according to some editors.

Obviously, the fact that Gorbachev was not from an English speaking capitalist country is in no way influencing the flurry of wikilawyering by inexperienced editors. BrownHairedGirl (talk) • (contribs) 02:42, 4 September 2022 (UTC)[reply]

Thanks for your insightful comment [16]. I haven't considered the usefulness of WP:ROUTINE before and the real intentions behind it. You're right that so much stuff on Wikipedia is routine. Maybe because people don't stop and ask if this event/ person/ thing is really worthy of note? Fans want their team/ company / person / thing article. As an aside, now I can't stop yawning! ---Steve Quinn (talk) 04:05, 4 September 2022 (UTC)[reply]
Thanks, @Steve. And sorry about the yawn plague!
My examples of yawns were of course all sarcastic. Just about anything can be dismissed as "routine" if you want to label it that way.
Not another American mass shooting? Routine! Yawn.
Fuxake, yet another moon landing? Routine! Yawn.
My favourite example of the misuse of WP:ROUTINE was in WP:Articles for deletion/Saoirse McHugh, where a bunch of editors insisted that despite copious evidence of mountains of coverage, and evidence that it was way beyond what other candidates got, it was all WP:ROUTINE because it was linked to elections.
This was all so utterly bonkers and counter-factual that it stretched assumptions of good faith to breaking point, but the closer bought it all. It was two years before the article Saoirse McHugh was re-created.
If only McHugh had taken a few minutes out of her political career to play a ball game, she'd never have been deleted. BrownHairedGirl (talk) • (contribs) 04:33, 4 September 2022 (UTC)[reply]
Yes, if she had played a ball game she probably would not have been deleted. Good one! ---Steve Quinn (talk) 18:22, 4 September 2022 (UTC)[reply]

Board of Trustees election

Thank you for supporting the NPP initiative to improve WMF support of the Page Curation tools. Another way you can help is by voting in the Board of Trustees election. The next Board composition might be giving attention to software development. The election closes on 6 September at 23:59 UTC. View candidate statement videos and Vote Here. MB 03:18, 5 September 2022 (UTC)[reply]

Bare URL tagging

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Such edits [17] block automatic tools from fixing bare urls and thus are not helpful. Cheers. Materialscientist (talk) 02:48, 6 September 2022 (UTC)[reply]

@Materialscientist: you are wrong, as I will explain below. And also uncivil.
It is a great pity that you show no sign whatsoever of having read the edit summary of the diff[18] which you posted: {{Bare URL inline}} refs to sites where WP:REFLINKS won't get title. See User:BrownHairedGirl/No-reflinks websites
Both @Citation bot and WP:ReFill support {{Bare URL inline}}. The only tool I am aware of which does not support the {{Bare URL inline}} tag is WP:Reflinks.
That's why I have put a massive amount of work over the last 10 months into building a huge (and growing) list of regexes to allow inline tagging of URLs where that does not impede automatic tools.
I also put time into crafting an edit summary which explains as much as possible of this within the short limits which AWB allows for edit summaries. And I put a lot of time into writing the documentation at User:BrownHairedGirl/No-reflinks websites.
So ... do you know what is really not helpful?
The truly not helpful thing is people like you who think it is a good idea to reproach someone else without first reading the info which has been carefully and prominently supplied by the person who they are reproaching. I find that behaviour to be damnably rude. BrownHairedGirl (talk) • (contribs) 07:56, 6 September 2022 (UTC)[reply]
You are clearly overreacting; there is no incivility in my message. Wikistress building up?
Yes, I didn't check your preparatory work, and yes, I use reflinks; with all its imperfections, it is still the best tool for completing bare urls. Roughly, citation bot is built for journal refs, and refill is inferior to reflinks. Cheers. Materialscientist (talk) 11:11, 6 September 2022 (UTC)[reply]
No, I am not over-reacting, @Materialscientist. And your reply is even more rude than your initial post.
So you did indeed fail to read the edit summary before posting, and you clearly still have not read and comprehended either the edit summary or User:BrownHairedGirl/No-reflinks websites, or even the explanation I posted here.
That is extremely rude. Rudeness is not simply a matter of using certain words: broadly, it consists of displaying contempt for the other person, which is what you have done here. You didn't read my edit summary, you didn't read my response here, and you mock me. That is very nasty and boorish behaviour.
The issue here is very simple:
  1. The only tool which is impeded by {{Bare URL inline}} is WP:Reflinks
  2. WP:Reflinks cannot fill the refs tagged in the diff[19] you posted
  3. The URLs tagged in that diff were tagged solely because they had been identified as not fillable by WP:Reflinks, and therefore included by me in my User:BrownHairedGirl/No-reflinks websites AWB job.
  4. So your assertion that my edits "block automatic tools from fixing bare urls and thus are not helpful" remains totally false.
As to Citation bot: I am by far its heaviest user, and journal refs are a very small proportion of what it fills. It does a very good job on most of them. The flaws of WP:ReFill are well-documented at User:BrownHairedGirl/No-reflinks websites and elsewhere.
But the merits of the tools are not the point here. The issue here is that you falsely accused me of making a not helpful edit, because you chose not to read my edit summary. And now you have compounded that error, twice:
  1. By still not acknowledging your error
  2. By goading and sneering at my objections
I expect that, per WP:ADMINACCT, an admin will be careful and diligent in their actions, and that they will promptly and graciously correct and apologise for their errors.
You have done none of that. Instead:
  • you did not read before accusing me
  • you did not retract when your error was demonstrated
  • you chose to act as a troll and/or a bully, by sneering at me with your snarky comment Wikistress building up?
I cannot know whether this is caused by gross incompetence on your part, by arrogance, or by extreme rudeness. But the cause does not really matter, because after two rounds it is absolutely clear that discussion with you is futile ... and that futility remains, regardless of whether your disgraceful conduct is due to incompetence, rudeness, arrogance, malice towards me, or something else.
So: Enough, Materialscientist. WP:Administrators are expected to observe a high standard of conduct, but your conduct here has been the worst I have seen for some weeks. You have repeatedly lacked the necessary combination of competence, comprehension and civility ... so get the hell off my talk page, and stay off. Do not try to interact with me in any venue for any reason whatsoever.
Goodbye. BrownHairedGirl (talk) • (contribs) 14:18, 6 September 2022 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
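
For readers unfamiliar with the tagging described in the closed discussion above, here is a minimal Python sketch of the general idea: a bare URL ref is tagged with {{Bare URL inline}} only when its host is on a list of sites where WP:Reflinks cannot retrieve a title. This is an illustration under stated assumptions, not the AWB job or the regexes documented at User:BrownHairedGirl/No-reflinks websites; the function name, the domain list, and the simple `<ref>` pattern are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: a <ref> whose entire content is one bare URL.
BARE_REF = re.compile(r"<ref>\s*(https?://[^\s<>]+)\s*</ref>", re.IGNORECASE)


def tag_bare_url_refs(wikitext: str, domains: list[str], date: str) -> str:
    """Append {{Bare URL inline}} to bare URL refs whose host is in `domains`.

    `domains` stands in for a list of sites where Reflinks gets no title
    (an assumption for illustration; the real list is regex-based).
    """

    def tag(match: re.Match) -> str:
        url = match.group(1)
        host = urlsplit(url).hostname or ""
        listed = any(host == d or host.endswith("." + d) for d in domains)
        if not listed:
            return match.group(0)  # leave refs to other sites untouched
        return f"<ref>{url} {{{{Bare URL inline|date={date}}}}}</ref>"

    return BARE_REF.sub(tag, wikitext)
```

Refs that already carry the tag no longer match the bare-ref pattern, so running the sketch twice does not tag anything twice.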

Proposal for the Monarchism article

Hi. I opened a section on the talk page of this article. I am wondering if you could have a look and give me your opinion(s) regarding my proposal. Here is a link to that section on talk [20]. It is entitled "American Caesar" and I have also provided links to a few sources. I never heard of this before until I read about it today. ---Steve Quinn (talk) 02:42, 8 September 2022 (UTC)[reply]

Hi @Steve Quinn!
Sorry, but neither part of it is my territory. I try to avoid the politics of ancient Rome, and also to avoid the fruity nutjobs who would like to dispense with even the very limited layer of democracy which sits on the American Republic.
I suggest that the first question is whether this stance is anything other than a WP:FRINGE issue. That's hard for me to judge, because a lot of stuff which would be beyond the fringe in Europe is mainstream in the USA. BrownHairedGirl (talk) • (contribs) 10:03, 8 September 2022 (UTC)[reply]
I understand that these are topics you try to avoid. No problem. Thanks for taking a look. And it didn't occur to me that this might be a WP:FRINGE topic. I am taking that into consideration. Regards, Steve Quinn (talk) 15:34, 8 September 2022 (UTC)[reply]

Category:Books by publisher has an RFC for possible consensus. A discussion is taking place. If you would like to participate in the discussion, you are invited to add your comments on the discussion page. Thank you. —Wingedserif (talk) 15:27, 9 September 2022 (UTC)[reply]

Question on Citations

Hi! I just wanted to ask a quick question regarding citation, specifically the url-access-level parameter. If a news site only starts requiring registration after a certain number of articles have already been read, would you say that they should be tagged as "requiring registration", or should that be reserved for sites that require registration for all articles? Krisgabwoosh (talk) 21:27, 9 September 2022 (UTC)[reply]

Hi @Krisgabwoosh
To be honest, I had never thought about that. But how about using |url-access=limited? See Template:Cite_web#Subscription_or_registration_required. BrownHairedGirl (talk) • (contribs) 21:34, 9 September 2022 (UTC)[reply]
That's probably the best option, thanks! I'll also consult the template's talk page to see if some consensus can be reached on clarifying the wording. Krisgabwoosh (talk) 21:38, 9 September 2022 (UTC)[reply]

Hello, @BrownHairedGirl! I've read some of the manual, but I don't really get the idea of the bare URLs thing. Could you please point to an example of a bad link and tell me what's wrong with it? I would find a way to fix it then. Thank you. KhinMoTi (talk) 11:34, 29 August 2022 (UTC)[reply]

Hi @KhinMoTi
See WP:Bare URLs and WP:HOWTOCITE. BrownHairedGirl (talk) • (contribs) 11:37, 29 August 2022 (UTC)[reply]
@BrownHairedGirl
Do I get it right?
Is that what I am supposed to do?
https://wiki.riteme.site/w/index.php?title=Lev_Karsavin&type=revision&diff=1107346046&oldid=1106848895 KhinMoTi (talk) 14:25, 29 August 2022 (UTC)[reply]
@KhinMoTi: that looks roughly right.
I am sorry that I can't give you a clear answer, but the text is in a language and script I don't understand, so I cannot assess it properly. BrownHairedGirl (talk) • (contribs) 14:40, 29 August 2022 (UTC)[reply]

Hello again, @BrownHairedGirl!

I did my best to fix all the refs on the Lev Karsavin page. Some of them were indeed dead, so I've found other sources for verification and left no bare URL without additional info (titles, dates, etc.). Do you think it's ok for me to remove the "Cleanup bare URLs" tag now? --KhinMoTi (talk) 18:33, 10 September 2022 (UTC)[reply]

@KhinMoTi, that looks good! You must have done a lot of work to get that all sorted out.
There are no remaining bare URLs, so feel free to remove the {{Cleanup bare URLs}} banner.
Alternatively, it will be semi-automatically removed tomorrow, when I run my periodic AWB job to remove redundant banners. BrownHairedGirl (talk) • (contribs) 20:12, 10 September 2022 (UTC)[reply]

A tag has been placed on Category:Saint Lucian male boxers indicating that it is currently empty, and is not a disambiguation category, a category redirect, a featured topics category, under discussion at Categories for discussion, or a project category that by its nature may become empty on occasion. If it remains empty for seven days or more, it may be deleted under section C1 of the criteria for speedy deletion.

If you think this page should not be deleted for this reason you may contest the nomination by visiting the page and clicking the button labelled "Contest this speedy deletion". This will give you the opportunity to explain why you believe the page should not be deleted. Please do not remove the speedy deletion tag from the page yourself. Liz Read! Talk! 01:59, 11 September 2022 (UTC)[reply]

AFD - Study Techniques

Wow - an unrhymed couplet in an AFD! Wonderful.  Velella  Velella Talk   23:03, 13 September 2022 (UTC)[reply]

Thanks, @Velella BrownHairedGirl (talk) • (contribs) 23:08, 13 September 2022 (UTC)[reply]

Hello! I saw the banner you placed on Barbara Handschu regarding link rot, which I have cleaned up. I wanted to leave a note in case there were any issues, as this is my first time removing one. I am leaving the other banner on the page for now, as I am still looking for more citations to improve the page. Please let me know if there are any issues with that removal, and happy editing! :) StakeStack (talk) 23:47, 22 August 2022 (UTC)[reply]

Hi @StakeStack
Thank you for taking the time to improve the refs. This is unglamorous but valuable work which supports the core Wikipedia policy of WP:Verifiability. I see that you have made only 24 edits, but you seem to be a fast learner.
I took a look at your edit[21], and I saw two things:
  1. You moved the NYTimes ref from the "See also" section to an inline ref, and elegantly used it in three places, correctly naming the cite so that it doesn't get multiple entries in the reflist. Nice work.
  2. You deleted the bare URL ref to http://www.dobrishlaw.com/attorneys_Barbara-E-Handschu.php ... which is not so good.
    Yes the link is dead, but per WP:DEADLINK, it should not be removed unless repair has been tried. The simplest step is to tag it with {{Dead link}}, like this: <ref>http://www.dobrishlaw.com/attorneys_Barbara-E-Handschu.php {{Dead link|date=August 2022}}</ref> ... which alerts other editors and bots to try to rescue it.
    Alternatively, you can go to a few archive sites such as https://web.archive.org/ or http://archive.today, and look for an archived copy. That's what I did: I found https://web.archive.org/web/20150601005148/http://www.dobrishlaw.com/attorneys_Barbara-E-Handschu.php and I added it in this edit.[22]
Hope this helps. And thanks again for your good work. BrownHairedGirl (talk) • (contribs) 00:10, 23 August 2022 (UTC)[reply]
I appreciate your help. I have been traveling and unable to edit recently, but am starting up again and will continue to study Wikipedia policies so I am formatting and tagging things correctly. It seems the "source editor" may be more efficient in completing some of these referencing/tagging tasks - I have a lot to learn! Thanks again :) StakeStack (talk) 11:59, 14 September 2022 (UTC)[reply]
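
The two repair options described in this thread (tagging a dead link, or pointing at an archived copy) can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the snapshot URL layout is simply inferred from the web.archive.org link quoted above. The sketch does not check whether a snapshot actually exists.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def dead_link_ref(url: str, month_year: str) -> str:
    """Wrap a bare URL in a ref tagged with {{Dead link}}, as in the example above."""
    return f"<ref>{url} {{{{Dead link|date={month_year}}}}}</ref>"


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build a Wayback Machine snapshot URL from a 14-digit timestamp.

    The timestamp format YYYYMMDDhhmmss is assumed from the quoted link.
    """
    if not (timestamp.isdigit() and len(timestamp) == 14):
        raise ValueError("timestamp must be 14 digits: YYYYMMDDhhmmss")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"
```

Called with the dobrishlaw.com URL and the timestamp 20150601005148, `wayback_url` reproduces the archived link quoted in the reply above.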

Category:Categories by century and country has been nominated for merging

Category:Categories by century and country has been nominated for merging. A discussion is taking place to decide whether this proposal complies with the categorization guidelines. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the categories for discussion page. Thank you. Privybst (talk) 14:56, 14 September 2022 (UTC)[reply]

Interesting Engineering page

Hello,

I'm the editor-in-chief of Interesting Engineering. I notice that on the page about our website, there is a box alerting users that there may be a conflict of interest.

https://wiki.riteme.site/wiki/Interesting_Engineering_(website)

However, I note here that the box can be removed if "the problem is not explained on the article's talk page, and/or if no current attempts to resolve the problem can be found."

https://wiki.riteme.site/wiki/Template:COI#When_to_remove

I note that there is no discussion on any issues and nobody has raised any issues. With that in mind, would it be possible to remove the box now? Otherwise, is there anything I can do to help with its removal?

Best,

Mike MikeBrownIntEng (talk) 17:20, 13 September 2022 (UTC)[reply]

Hi @MikeBrownIntEng
The page Talk:Interesting Engineering (website) does include a note that @Nick.lucchesi has a COI. The issues are explained more at User talk:Nick.lucchesi.
To be bluntly honest, the article as it stands is crap. It is wholly promotional in tone and substance, and the refs provided do not add up to evidence that the website meets WP:GNG. The last two of the article's four sections look like extracts from a pitch to advertisers. The whole thing needs a complete rewrite.
It's all so flimsy that I would have WP:PRODded it if I hadn't seen that the website has a lot of hits. That leads me to think that there may be more WP:SIGCOV available to establish notability. The most useful thing that you could do would be to identify that WP:SIGCOV, and post links on the article's talk page ... along with a clear declaration of your own COI. See WP:PLAINSIMPLECOI. BrownHairedGirl (talk) • (contribs) 17:40, 13 September 2022 (UTC)[reply]
Okay, thanks for your help on this, I will share thoughts on the talk page! MikeBrownIntEng (talk) 08:59, 14 September 2022 (UTC)[reply]
I just shared some links on the talk page, let me know if those are helpful and if there's anything else that can be done to help here. MikeBrownIntEng (talk) 16:13, 14 September 2022 (UTC)[reply]
Thanks, @MikeBrownIntEng. That looks like the sort of thing that might help.
I have not analysed or even viewed the links you posted, so I can't say whether those particular links help (other than that Youtube rarely helps to establish notability). But links are the best that a COI editor can provide.
However, I have no idea why you chose to approach me. My only edit to the article was an automated technical edit[23] to fix formatting of references. In doing that, I viewed only the changes, and never even read the article until you posted a link here.
I have absolutely no interest whatsoever in the topic, and want no further involvement with it. I was happy to show you what you can do, but that's the limit of my involvement ... unless someone chooses to nominate the article for deletion, in which case I think that unless something big had changed, the evaluation I would make would probably lead me to support deletion.
Best wishes, BrownHairedGirl (talk) • (contribs) 16:28, 14 September 2022 (UTC)[reply]
Okay, thanks for your help. I was looking for a Wikipedia editor that may be able to help with the issue, so looked at the version history to see who contributed. I won't bother you any further, and I appreciate your assistance with this. MikeBrownIntEng (talk) 16:32, 14 September 2022 (UTC)[reply]

Category:Novels by country by century has been nominated for renaming

Category:Novels by country by century has been nominated for renaming. A discussion is taking place to decide whether this proposal complies with the categorization guidelines. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the categories for discussion page. Thank you. Laurel Lodged (talk) 07:25, 15 September 2022 (UTC)[reply]

Category:Politics by country by century has been nominated for renaming

Category:Politics by country by century has been nominated for renaming. A discussion is taking place to decide whether this proposal complies with the categorization guidelines. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the categories for discussion page. Thank you. Laurel Lodged (talk) 07:26, 15 September 2022 (UTC)[reply]

Category:Television by country by century has been nominated for renaming

Category:Television by country by century has been nominated for renaming. A discussion is taking place to decide whether this proposal complies with the categorization guidelines. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the categories for discussion page. Thank you. Laurel Lodged (talk) 07:28, 15 September 2022 (UTC)[reply]

Converting bare URL

Hello, in a run of Citation bot you have made a large number of changes such as this which fill in the bare URL of https://www.ordnancesurvey.co.uk/business-government/tools-support/open-data-support The problem with this is that it just goes to a top level page which does not actually verify anything. I assume that when these were added they should have pointed to some page on the site that gives detail on the subject of the article. Maybe we should remove them all and replace with {{cn}} tags or add a tag such as {{failed verification}}. Any thoughts? Keith D (talk) 19:33, 16 September 2022 (UTC)[reply]

Hi @Keith D
At first glance, I thought that maybe you were telling me that @Citation bot has screwed up the filling of some soft 404 URLs, by using the title from the redirect target.
But in the case of the diff you posted,[24], @Citation bot seems to have correctly filled the bare URL ref to https://www.ordnancesurvey.co.uk/business-government/tools-support/open-data-support
So the problem here is not in any way with @Citation bot. The problem is that some editor or editors have added a ref which is not just bare, but uselessly vague.
The correct action in such cases would be to add the {{nonspecific}} tag to the current ref or, as you suggest, to remove these refs and replace them with {{Citation needed}}.
But I got curious as to what had happened here, so I went burrowing. First I searched for insource:/https:\/\/www\.ordnancesurvey\.co\.uk\/business-government\/tools-support\/open-data-support/i, which found 119 hits. But I was curious as to how so many of them had come to be filled just now, so I did a little more burrowing, and found that there was a mass replacement of older URLs by @Citation bot's owner @AManWithNoPlan: see these edits.
I know that AManWithNoPlan is an exceptionally skilled and conscientious editor, so I am sure there was a good reason for this, and possibly some discussion somewhere. Maybe AManWithNoPlan can explain.
Also, this has echoes of a UK geography issue to which @PamD drew my attention a few months back. It was being discussed on some project page, tho I can't recall which. But the same project may have a view on this mess ... and I don't want to go ripping out refs unless there is a clear consensus to do so. So some projectspace discussion is needed. BrownHairedGirl (talk) • (contribs) 21:21, 16 September 2022 (UTC)[reply]
Thanks, I have dropped a note on WikiProject UK geography. Keith D (talk) 22:38, 16 September 2022 (UTC)[reply]
The problem is that this is a link to a database. As such, there is no real good link. And, the old link had no good archive. So, I turned "total rubbish" into "kinda rubbishy". AManWithNoPlan (talk) 21:13, 17 September 2022 (UTC)[reply]
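
The insource: search quoted earlier in this thread escapes the dots and slashes of the URL by hand. A small Python helper can do that escaping mechanically. This is a sketch under assumptions: the function name is hypothetical, and the set of characters escaped is a guess at what needs a backslash inside an insource:/.../ regex, so check it against the search documentation before relying on it.

```python
import re

# Assumed set of characters to backslash-escape inside insource:/.../,
# including the "/" delimiter. Hyphens and colons are left as-is,
# matching the query quoted in the thread.
_SPECIAL = re.compile(r'([\\.?+*|{}\[\]()"#@&<>~/])')


def insource_query(url: str, case_insensitive: bool = True) -> str:
    """Build an insource: regex search string that matches `url` literally."""
    escaped = _SPECIAL.sub(r"\\\1", url)
    flag = "i" if case_insensitive else ""
    return f"insource:/{escaped}/{flag}"
```

For the Ordnance Survey URL discussed here, this yields the same search string that was typed by hand in the reply above.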

Emporis.com has gone, but is preserved

Emporis.com, used as a source and/or an external link in thousands of articles on tall buildings, has recently been discontinued. The target pages generally seem to have been preserved in the Internet Archive, though. Can you come up with a script to substitute in those working IA links? BD2412 T 00:16, 18 September 2022 (UTC)[reply]

Hi @BD2412! Hope you are well.
I am pretty sure that this is not a wildly complex task. But I have not done that sort of thing before, so it would take me quite a bit of work to get up to speed.
However, I know two most excellent editors who may be able to help:
  • @GreenC does a lot of work with the Internet Archive, and is a v skilled programmer
  • @Rlink2 does a lot of v skilled and accurate work with archives.
I hope that one of them may be able to help. If not, I suggest a post at WP:BOTREQ. BrownHairedGirl (talk) • (contribs) 01:17, 18 September 2022 (UTC)[reply]
Many thanks, I'm sure one of your proposed avenues will work out. BD2412 T 01:24, 18 September 2022 (UTC)[reply]
User:BD2412, I can do it. -- GreenC 03:15, 18 September 2022 (UTC)[reply]
Great! BD2412 T 03:28, 18 September 2022 (UTC)[reply]
Thread for this request now picked up at Wikipedia:Link_rot/URL_change_requests#Emporis.com_links. -- GreenC 15:13, 20 September 2022 (UTC)[reply]

Sorry

Sorry for stepping on your toes, however, please note that my edits predate your posting of the inuse, which you posted while I was still editing.--Auric talk 13:30, 20 September 2022 (UTC)[reply]

Not so, @Auric. See https://wiki.riteme.site/w/index.php?title=Ba%C4%9Fc%C4%B1lar&action=history
My edit of 13:02[25] was the first edit in 8 days, the start of a planned series of cleanup edits. You decided to jump in immediately after my first edit, and created an edit conflict. So before my next edit, I added an {{inuse}} tag, which you chose to ignore. Even if you had not seen the tag when you started editing, you will have been made aware of it when saving your edits.
This has all been a big waste of time. You piled in on my work, multiplying the total editor effort involved by a factor of at least three, and that is before the time used up on this talk page.
If I had not been disrupted, fixing the mangled refs on this article would have taken about 5 to 10 minutes of my time. But thanks to your disruptive interventions, nearly 40 minutes have passed before I am ready to do the next page.
Please, please ... when someone else is fixing something, don't jump into the middle of their work and start creating edit conflicts. You are a highly experienced editor, so I am surprised that this should need to be said. BrownHairedGirl (talk) • (contribs) 13:41, 20 September 2022 (UTC)[reply]
Again, sorry. I didn't receive any edit conflicts and had no idea you were doing anything. I checked the history before I started, but didn't see any IPs, so I didn't check back. I was focusing on the bottom sections of the article, so I didn't notice any new tag.--Auric talk 13:52, 20 September 2022 (UTC)[reply]
@Auric: I do believe that you mean that apology, and I don't want to be rude. But I still think that this episode could have been avoided, so for future benefit I'll explain why and take the risk of this appearing rude.
If you checked the history before you started, you'd have seen my edit a minute or two before, manually cleaning up some refs. That should have been a good indication of work-in-progress. I don't see what IPs have to do with any of this.
This page is from a list of articles with bare URLs produced by one of my big searches, in this case from the 20220901 database dump. Most of them are fed to Citation bot and/or my User:BrownHairedGirl/No-reflinks websites tagging process, but some I clean up manually. This is one of a small set where I decided to do manual cleanup.
So I am puzzled that you jumped on this page only about 100 seconds after my edit. That doesn't seem like a coincidence: what's up? BrownHairedGirl (talk) • (contribs) 14:08, 20 September 2022 (UTC)[reply]
Coincidences do happen. The page in question popped up in Category:CS1 errors: unsupported parameter, likely due to the references being in that odd combination of English and Turkish. I usually check the History of pages with odd references like that to make sure it isn't some IP user making an odd edit, as either a new user error or vandalism, like Lili Marleen, which I found and fixed earlier, which can be easily fixed with a simple undo. Since I didn't see any recent IP activity and there was no visible sign on the page of any editor activity, I started fixing references. Then a notification popped up on another website and I stopped editing for a little time, only to come back to your edit. --Auric talk 14:26, 20 September 2022 (UTC)[reply]
@Auric: sorry, Auric but that doesn't tally with the history I see.
My first edit was saved at 13:02.[26] Your first edit was a tiny one,[27] converting a space to a dash. It was saved three minutes later, after only a tiny simple change ... so if you had checked the history, you should have seen my edit. I'm not suggesting any bad faith, just a lack of care.
I could of course reduce the risk of this by being much more prolific with {{inuse}} templates, but in general I try to avoid them on low-edit pages because they take more of my time and clutter up the page history. And in this case, {{inuse}} didn't help. BrownHairedGirl (talk) • (contribs) 14:44, 20 September 2022 (UTC)[reply]
Regardless, that is what happened. I did see your edit, but I didn't attach any importance to it. Lots of users edit an article, making only one change here or there. I suppose editors could wait 5 minutes to see if the last editor is going to make other changes or leave a message on their talk page to see if they are finished editing that page, but that could easily tick someone off, and I try to avoid that.--Auric talk 14:55, 20 September 2022 (UTC)[reply]
@Auric: That's what I thought happened, and why I didn't just drop this.
You saw that I was on the case, fixing refs on a page with a lot of refs to fix, and you decided to pile in and do the same thing at the same time ... and you didn't foresee the inevitable duplication of effort and edit conflicts.
I am sure you acted in good faith, but it was not a great judgement. No need to leave a msg in such cases; just move on to one of the many thousands of articles with stale problems which are not being actively fixed by someone else. That avoids wasting your time and someone else's time.
Best wishes, BrownHairedGirl (talk) • (contribs) 17:57, 20 September 2022 (UTC)[reply]
Please don't assume my actions. I saw no such thing. I saw an article that needed fixing and that wasn't being fixed. I'm sorry to have caused you stress, but I acted in good faith.--Auric talk 19:52, 20 September 2022 (UTC)[reply]
I did not assume. I had a hunch, so I asked nicely, and you confirmed my hunch. BrownHairedGirl (talk) • (contribs) 01:29, 21 September 2022 (UTC)[reply]

Wikipedia entry on Fiona Watt

Dear BrownHairedGirl, I am contacting you as the Wikipedia editor who made the last updates and changes to the entry https://wiki.riteme.site/wiki/Fiona_Watt. I am referring to the addition of a whistle-blowing investigation which was added to the entry earlier this year. The addition is factually correct and referenced correctly. In our view, however, its position in the entry should be rearranged, and the corresponding paragraphs moved to the section 'Career'. I do not want to make that change without having consulted a Wikipedia editor. For disclosure: Fiona Watt is now heading EMBO (see Embocomm). I'd very much appreciate your feedback. Best regards, Tilmann Kiessling. Embocomm (talk) 09:07, 21 September 2022 (UTC)[reply]

Hi @Embocomm (Tilmann Kiessling)
My edits to the article Fiona Watt were purely technical, to fill bare URLs. I have no substantive interest in the topic.
However, I do see that you have a clear conflict of interest (COI) wrt Watt, who is Director of the European Molecular Biology Organization. Wikipedia's policy WP:Conflict of interest is very clear that a person should not edit an article where they have a COI. However, as far as I can see all of the edits to Wikipedia by User:Embocomm have been to articles where you have a clear COI.
That is unacceptable. Please desist. I will now post warnings on your talk pages and on the articles which have been edited by the user Embocomm (talk · contribs). I note that the page User:Embocomm (permalink) correctly declares a COI wrt two articles, but despite this the account has repeatedly been used to edit articles which should not have been touched.
Also, please note that the username "Embocomm" is not acceptable per our policy WP:ISU because it implies that it is a shared account, which is forbidden by our policy WP:Username policy#Shared_accounts. The page User:Embocomm (permalink) confirms that such misuse has indeed taken place: Under this account, EMBO staff members have created and edited Wikipedia chapters related to EMBO and to EMBO-related activities. I will also take action to tackle that problem.
I have to say that I am surprised and disappointed to see that Wikipedia's COI policies have been so persistently breached by staff of a major professional association of scientific researchers. The COI was declared only in 2018,[28] ten years after the account's first edits; yet four years after that declaration, the account has been used today to make edits in pursuit of that COI. This reflects very poorly on your employer. BrownHairedGirl (talk) • (contribs) 12:57, 21 September 2022 (UTC)[reply]
PS I note that on 1 February 2020‎, @Randykitty posted[29] on your talk page a note about your COI, including the clear and prominent request to avoid editing or creating articles about yourself, your family, friends, colleagues, company, organization or competitors.
Yet 2½ years later, you are still editing an article about your boss. BrownHairedGirl (talk) • (contribs) 13:17, 21 September 2022 (UTC)[reply]
Hi BrownHairedGirl, thank you for getting back right away. I will desist from further edits. Please note that the edits were corrections of grammar or updates of outdated information. In this talk, without having made a substantial change yet, I actively asked for consultation and declared the COI. I am happy to undo the changes (corrections of grammar and factual updates) in the hope that someone from the Wikipedia community will make them. Best regards, Tilmann Kiessling Embocomm (talk) 13:38, 21 September 2022 (UTC)[reply]
@Thomas: thanks for the reply, but you seem not to understand the seriousness of the situation.
Over the last 14 years, Embocomm (talk · contribs) has made 158 edits to six articles, some of them very substantial, and in each case there is a clear COI.
Even today, you not only ignored the request 2½ years ago to avoid editing or creating articles about yourself, your family, friends, colleagues, company, organization or competitors; you even defend having done so after the breach is noted.
Continuing to push the line like this after so many years is the sort of conduct which I associate with dodgy startup companies engaged in marketing, not an international scientific association. BrownHairedGirl (talk) • (contribs) 13:57, 21 September 2022 (UTC)[reply]

Embo


I think you have typed "15" for "158" in the ANI. PamD 17:36, 21 September 2022 (UTC)[reply]

So I did. Thanks, @PamD for spotting it. Now fixed. BrownHairedGirl (talk) • (contribs) 18:14, 21 September 2022 (UTC)[reply]

GEOnet Names Server citation


Hi, I hope you are doing well. I have what I hope is a quick question. I was going through some geography stubs for cities in Iran and noticed a few of them have citations like these: Kelarabad can be found at GEOnet Names Server, at this link, by opening the Advanced Search box, entering "-3068788" in the "Unique Feature Id" form, and clicking on "Search Database".

Personally I don't think this is a decent/reliable citation (given that the reader has to go on a treasure hunt to confirm the information) and should be removed but it is providing data of sorts. Do you think this is an okay citation format? If not I will go through and clear it out but as some of these articles are thin on sources I would rather not add to the unreferenced article list :/ I really appreciate any insight you can offer - thanks! Kazamzam (talk) 00:33, 22 September 2022 (UTC)[reply]

Hi @Kazamzam, and thanks for your message.
However, I won't make any blind comments. So please post some links to articles with this style of citation, so that I can examine them. BrownHairedGirl (talk) • (contribs) 00:38, 22 September 2022 (UTC)[reply]
Sure thing! Fereydunkenar, Rineh, Gazanak, and Kelarabad all have this type of citation. Kazamzam (talk) 00:43, 22 September 2022 (UTC)[reply]
Thanks, @Kazamzam. I took a peek, and ... ouch, that's bad. My view is that:
  1. The directions given on how to find the info amount to a research guide, not a citation. The crucial part of WP:Verifiability is that the info access must be repeatable, but this seems to me to be too fuzzy to be repeatable.
  2. http://geonames.nga.mil/namesgaz/ is dead, so I can't verify how clean or messy the pathway is
  3. The GEOnet Names Server is not a neutral source. The geographical requirements of one country's armed forces are a very partisan approach to the world, and in this case the USA's approach to Iran is so hostile that a US military source is wholly inappropriate.
  4. The GEOnet Names Server has a poor reputation for accuracy.
So my view is that those refs are junk. In my view they should be removed, and replaced with {{citation needed}}.
However, please be aware that my views on quality are sometimes opposed by other editors, and some of them get very angry and aggressive about it ... and when they come at me with pitchforks, it turns out that me identifying quality problems is uncivil and bludgeoning and so on.
So while I have given you my assessment, I won't advise you what to do about it. BrownHairedGirl (talk) • (contribs) 01:38, 22 September 2022 (UTC)[reply]
Much obliged! Kazamzam (talk) 01:45, 22 September 2022 (UTC)[reply]

Retraction request


Please retract the reference to WP:NOTHERE you made with Special:Diff/1112299634. I had already apologised for the remark you were objecting to before you made this edit: Special:Diff/1112299036. Even if I hadn't already apologised and explained that I didn't intend the remark in the way that it was read (I accept, regardless of my intention, the error in writing it), the call to NOTHERE so quickly goes against WP:UNCIVIL, which says that "In general, be understanding and non-retaliatory in dealing with incivility. If others are uncivil, do not respond the same way."

I really don't want to carry on having any kind of argument on a personal level, as this was never my intention. I have always tried to play the ball, as it were, and not the person. I am writing this message because I would very much like to try and de-escalate our disagreement before I make my next update to the move discussion at heart here.

In the spirit of reciprocity, if I have made any particular statements that haven't been addressed by my previous apology, or that you would like me to retract, please let me know. Best regards, H. Carver (talk) 01:45, 26 September 2022 (UTC)[reply]

Hi @H. Carver
I welcome your efforts to de-escalate, and I have no wish to continue an argument.
However, I stand by my view that ignoring policy at an RM discussion is timewasting and disruptive, and that persisting in ignoring policy after being warned is so severely disruptive that it is WP:NOTHERE.
That may sound harsh, but it's because in my long experience on Wikipedia I have seen far too many occasions where vast amounts of time are wasted by failure to adopt (or failure to observe) a common policy or guideline on a recurring issue. Without some agreed set of principles on how to decide such issues, the result would be decisions taken at random by whoever shows up, with multiple inconsistencies in article titles which would baffle readers and impede editors. So before joining any discussion, a good faith, competent editor finds and studies the relevant policies, and seeks to apply them. Sadly, you chose not to do that. You say that you want to play the ball ... and if that's truly the case, play by the rulebook, which in this case is the policy WP:AT.
I object strongly to efforts by any editor to claim that it is somehow uncivil to note such misconduct and disparage it ... and I consider such claims to be manipulative and bullying attempts to deter editors from upholding policy.
So if you want to revise your stance, and base your comments on policy, that would be great. And it would render my complaints moot.
However, I will not retract or revise what I have written. BrownHairedGirl (talk) • (contribs) 02:14, 26 September 2022 (UTC)[reply]
I am disappointed by your reply. I am not intentionally ignoring policy, and to again suggest I am is once more going against WP:UNCIVIL. You have explicitly said now that you are assuming I am acting in bad faith - "Sadly, you chose not to do that" - which is uncalled for, as are accusations of 'manipulative and bullying' behaviour. Before you take that latter argument any further, I think you should look at WP:BLUDGEON and consider how it applies to your responses in the move discussion in question.
If any of the above makes you reconsider your viewpoint, I would be happy to carry on the discussion. But if there is no change in your views, I suggest that per WP:CIVILITY we do not respond to each other any further, and I specifically request that you do not respond to any more comments I make in the move discussion in question. If as you believe my arguments are really that weak, trust the eventual closer to read them that way rather than try and refute me yourself. I wish you good day, H. Carver (talk) 02:29, 26 September 2022 (UTC)[reply]
@H. Carver: I hoped that you would take a step back, and reconsider the disruptive effect of ignoring policy.
However, I thought it more likely that instead of resolving the substantive problem, you would persist in offence-taking.
I wish that my pessimism had been misplaced, but sadly not.
It's still not too late for you to actually read WP:AT. Insofar as you fail to base your arguments in policy and in fact, I reserve the right to reply to you on those points. Please note that you have still not identified any article which would be made ambiguous by changing the title "state funeral", so your statement that removing "state" creates ambiguity is simply false.
I have no idea why you are making such a drama out of a blatant falsehood. It is a very very odd thing to make a stand on. Anyway, stay off my talk. I have had enough of your timewasting, and more than enough of your attempts to invoke civility as a shield against challenges to your disruption. BrownHairedGirl (talk) • (contribs) 02:44, 26 September 2022 (UTC)[reply]