User talk:The Earwig/Archive 14
This is an archive of past discussions with User:The Earwig. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Signpost issue 4 – 29 March 2018
- News and notes: Wiki Conference roundup and new appointments.
- Arbitration report: Ironing out issues in infoboxes; not sure yet about New Jersey; and an administrator who probably wasn't uncivil to a sockpuppet.
- Traffic report: Real sports, real women and an imaginary country: what's on top for Wikipedia readers
- Featured content: Animals, Ships, and Songs
- Technology report: Timeless skin review by Force Radical.
- Special report: ACTRIAL wrap-up.
- Humour: WikiWorld Reruns
Presidents and Vice Presidents of Palau
In my opinion, it would be a good idea to combine the articles President of Palau and Vice President of Palau. Both are short articles. We could redirect Vice Presidents of Palau to President of Palau. I would like your thoughts on this. Векочел (talk) 01:58, 4 April 2018 (UTC)
- Векочел, I don't know enough about the politics or history of Palau to give an informed response. However, the subjects of the two articles are quite distinct and each one could be expanded to have unique content that wouldn't fit in the other article. Looking around, Category:Vice presidents by country is well populated and I don't see any other countries that have chosen to merge them. — Earwig talk 02:23, 4 April 2018 (UTC)
Bot request
Erasing copyvio detector bot is one of the best things I have seen. Can this bot be used on mrwp? --✝iѵɛɳ२२४०†ลℓк †๏ мэ 07:32, 10 April 2018 (UTC)
- Tiven2240, thank you. You can use the tool anywhere, ideally, though I don't guarantee it handles other languages as well as English. However, I don't run a bot that detects copyvios and removes them automatically. There are too many incorrect identifications (the results are not reliable enough) for this to be a good idea; humans should always have the final say. — Earwig talk 02:05, 11 April 2018 (UTC)
EarwigBot on Template:AFC_statistics
Template:AFC_statistics hasn't been updated since yesterday. Is there something wrong with the bot? -- » Shadowowl | talk 13:33, 18 April 2018 (UTC)
- Shadowowl, there is an issue with lagging databases on Wikimedia Cloud Services; the data is about a day and a half old so the chart can't update. You can check on that here (that page seems to be misbehaving as well, but you can still see the lag). Anyway, it looks to be going down and should resolve itself soon. — Earwig talk 01:57, 19 April 2018 (UTC)
The Signpost: 26 April 2018
- From the editors: The Signpost's presses roll again
- Signpost: Future directions for The Signpost
- In the media: The rise of Wikipedia as a disinformation mop
- In focus: Admin reports board under criticism
- Special report: ACTRIAL results adopted by landslide
- Community view: It's time we look past Women in Red to counter systemic bias
- Discussion report: The future of portals
- Arbitration report: No new cases, and one motion on administrative misconduct
- WikiProject report: WikiProject Military History
- Traffic report: A quiet place to wrestle with the articles of March
- Technology report: Coming soon: Books-to-PDF, interactive maps, rollback confirmation
- Featured content: Featured content selected by the community
Copyvio detector not working
Hello Earwig, I have a problem with the copyvio detector today: It's returning an error "An error occurred while using the search engine (Google Error: HTTP Error 403: Forbidden)." Any help would be appreciated. Thanks! — Diannaa 🍁 (talk) 12:27, 5 April 2018 (UTC)
- Kaldari, do you have any idea? Unfortunately, I'm not seeing anything in the logs that could help diagnose—just the 403 error. — Earwig talk 03:51, 6 April 2018 (UTC)
- It looks like we hit the daily query limit (10,000 queries per day). Any idea why there was such a big spike today? Usually, we only get to about 5,000 queries a day. Kaldari (talk) 04:41, 6 April 2018 (UTC)
- No idea why that would happen. It's working again today. Thanks for looking into this. — Diannaa 🍁 (talk) 10:26, 6 April 2018 (UTC)
- This is likely related to SQLBot's AFC-Ores reports, which are using the tool. — JJMC89 (T·C) 05:14, 7 April 2018 (UTC)
- @JJMC89: Nope, explicitly didn't use the google search functionality (ever), and in the last 24 hours rewrote to cut the amount of api pulls by 85%. SQLQuery me! 05:25, 7 April 2018 (UTC)
- Query levels seem to be back to normal today. Kaldari (talk) 05:36, 7 April 2018 (UTC)
- @SQL: So then your bot is basically just comparing articles against the external links included in the page? How useful is this? — Earwig talk 18:14, 7 April 2018 (UTC)
- Seems to be pretty helpful so far. It would probably be better with google on - for sure, but I was trying to follow the 'Etiquette' section (I also use a sleep() in between queries), and not consume more than my share of finite resources. And, looking at today's high score, Draft:Asli_Demirguc-Kunt - my query shows 90.2% confidence, while bypassing the cache and using google shows 89.6%. I've spot checked a lot of them, and most seem to have a similarly negligible difference. That mainly leaves articles with no links. I'm not 100% sure how I should proceed on those ones yet. SQLQuery me! 01:29, 8 April 2018 (UTC)
I got the same error again late yesterday (circa 22:00 UTC) and the tool is functioning normally again this morning. Posting as information. @Kaldari: — Diannaa 🍁 (talk) 11:49, 2 May 2018 (UTC)
- Yes, we’ve been discussing this one over at phab:T193559. — Earwig [alt] talk 15:38, 2 May 2018 (UTC)
- @The Earwig and Diannaa: It looks like we're hitting the daily quota every 5 days exactly due to a regularly timed spike. On April 26, May 1, May 6, and May 11, there were huge spikes in Google Search API usage from Tool Forge resulting in hitting the quota and then being denied service for the rest of the day. I'm going to file a Phabricator task to investigate further. Ryan Kaldari (WMF) (talk) 20:11, 11 May 2018 (UTC)
- Thanks Ryan. — Diannaa 🍁 (talk) 20:43, 11 May 2018 (UTC)
- From looking at the proxy logs we were able to confirm that the traffic spike is coming from Earwig's Copyvio Detector. Earwig, could you look at the logs on your end and see if there's anything there that could be helpful in tracking it down. As I mentioned, the last spike was between 1 and 2am PST this morning. Ryan Kaldari (WMF) (talk) 22:34, 11 May 2018 (UTC)
- Thanks for investigating. Sure, I'll see what I can find in the logs tomorrow morning (just got home, a bit tired). — Earwig talk 02:19, 12 May 2018 (UTC)
- Replied at phab:T194541. — Earwig talk 21:13, 12 May 2018 (UTC)
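Note on the "every 5 days exactly" observation above: the spacing of the four reported spike dates can be checked with a few lines of date arithmetic. This is only a minimal sketch; the variable names are illustrative and the dates are the ones quoted in the thread, with the year assumed to be 2018.

```python
from datetime import date

# Spike dates as reported in the thread (year assumed to be 2018).
spike_dates = [
    date(2018, 4, 26),
    date(2018, 5, 1),
    date(2018, 5, 6),
    date(2018, 5, 11),
]

# Gap in days between each consecutive pair of spikes.
gaps = [
    (later - earlier).days
    for earlier, later in zip(spike_dates, spike_dates[1:])
]

# True only if every gap is the same length.
is_regular = len(set(gaps)) == 1

print(gaps, is_regular)
```

A perfectly regular interval like this points at a scheduled job rather than organic traffic, which is why the thread moves on to the proxy logs.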
The Signpost: 24 May 2018
- From the editor: Another issue meets the deadline
- WikiProject report: WikiProject Portals
- Discussion report: User rights, infoboxes, and more discussion on portals
- Featured content: Featured content selected by the community
- Arbitration report: Managing difficult topics
- News and notes: Lots of Wikimedia
- Traffic report: We love our superheroes
- Technology report: A trove of contributor and developer goodies
- Recent research: Why people don't contribute to Wikipedia; using Wikipedia to teach statistics, technical writing, and controversial issues
- Humour: Play with your food
- Gallery: Wine not?
- From the archives: The Signpost scoops The Signpost
Copyright detector
Tool did not detect this case per here Wikipedia_talk:WikiProject_Medicine#Agenesis_of_superior_vena_cava
The page Agenesis of superior vena cava was entirely copied from here https://journals.lww.com/md-journal/Fulltext/2018/06010/The_first_reported_case_of_factor_V_Leiden.1.aspx yet it missed it.
Best Doc James (talk · contribs · email) 20:19, 4 June 2018 (UTC)
- Doc James, I took a look. In this case, the tool searches Google for the right phrases, but Google does not return that page as a result. Sometimes it seems their API is not as accurate as the regular web search us humans have access to. My general advice is that the tool can't detect everything: while a hit is a good sign that a copyvio might be present, the absence of a hit certainly does not mean an article is copyvio-free. — Earwig talk 02:14, 5 June 2018 (UTC)
- Interesting. Thanks for the follow up. Doc James (talk · contribs · email) 08:51, 5 June 2018 (UTC)
Women in Red tools and technical support
We are preparing a list of tools and technical support for Women in Red. I have tentatively added your name as you have provided general technical support, including tool developments. Please let me know whether you agree to be listed. You are of course welcome to make any additions or corrections.--Ipigott (talk) 07:29, 8 June 2018 (UTC)
- Sure Ipigott, I'm happy to help and to continue maintaining things as necessary. (Though I can't promise significant new features.) — Earwig talk 02:18, 9 June 2018 (UTC)
Notifying you of the requested move on this module, because it would affect one of EarwigBot's tasks. {{3x|p}}ery (talk) 21:54, 26 June 2018 (UTC)
- Thanks, I will comment there. — Earwig talk 02:42, 27 June 2018 (UTC)
The Signpost: 29 June 2018
- Special report: NPR and AfC – The Marshall Plan: an engagement and a marriage?
- Op-ed: What do admins do?
- News and notes: Money, milestones, and Wikimania
- In the media: Much wikilove from the Mayor of London, less from Paekākāriki or a certain candidate for U.S. Congress
- Discussion report: Deletion, page moves, and an update to the main page
- Featured content: New promotions
- Arbitration report: WWII, UK politics, and a user deCrat'ed
- Traffic report: Endgame
- Technology report: Improvements piled on more improvements
- Gallery: Wiki Loves Africa
- Recent research: How censorship can backfire and conversations can go awry
- Humour: Television plot lines
- Wikipedia essays: This month's pick by The Signpost editors
- From the archives: Wolves nip at Wikipedia's heels: A perspective on the cost of paid editing
Copyvio Bot on Punjabi Wikipedia
Hi @The Earwig: I am a Punjabi Wikipedia admin and I think the Copyvio Bot would be a great addition to Punjabi Wikipedia. Besides running it on new articles from now on, can we also run the bot on existing articles on Punjabi Wikipedia? Let me know if anything else is required. --Satdeep Gill (talk • contribs) 07:29, 30 June 2018 (UTC)
- Hi Satdeep Gill. While I do have a tool to check for copyvios, I don't have a bot that does it automatically. The main reason is that checking for copyvios is slow and expensive (there is a daily limit of about 1,000 checks due to the data source we use), and there are enough false positives that I think humans should always review the results before they get shown to other people (like the article creator). See my response to a similar question here. — Earwig talk 14:36, 30 June 2018 (UTC)
- I totally agree that humans should check it. What we are looking for is to have it enabled and that the tool adds a template to articles that might have copyvio. --Satdeep Gill (talk • contribs) 07:43, 1 July 2018 (UTC)
Thursday July 12: Wiki Loves Pride Edit-a-thon @ Jefferson Market Library
Thursday July 12, 5-8pm: Wiki Loves Pride Edit-a-thon @ Jefferson Market Library | |
---|---|
Wikimedia NYC invites you to attend a Wiki Loves Pride Edit-a-thon on Thursday, July 12th at Jefferson Market Library! Wiki Loves Pride is a global campaign to expand and improve LGBT-related content across all Wikimedia projects, in all languages. We are holding this year's event in July in order to support folx who want to contribute a photograph they took at one of NYC's many Pride events or edit an article about something they learned this June. Not sure what to contribute? No problem! We will have a list of articles that need your help.
--Megs (talk) 14:57, 10 July 2018 (UTC) P.S. You are also invited to the "picnic anyone can edit", the Great American Wiknic NYC @ Prospect Park, Sunday, July 29! |
(You can subscribe/unsubscribe from future notifications for NYC-area events by adding or removing your name from this list.)
Sunday July 29: Annual Wiki-Picnic @ Prospect Park
Sunday July 29, 2-7pm: Annual Wiki-Picnic | |
---|---|
You are invited to join us at the "picnic anyone can edit" in Brooklyn's green Prospect Park, as part of the Great American Wiknic celebrations being held across the USA. Remember it's a wiki-picnic, which means potluck.
We hope to see you there! --Pharos (talk) 08:24, 23 July 2018 (UTC) |
(You can subscribe/unsubscribe from future notifications for NYC-area events by adding or removing your name from this list.)
The Signpost: 31 July 2018
- From the editor: If only if
- Opinion: Wrestling with Wikipedia reality
- Discussion report: Wikipedias take action against EU copyright proposal, plus new user right proposals
- Featured content: Wikipedia's best content in images and prose
- Arbitration report: Status quo processes retained in two disputes
- Traffic report: Soccer, football, call it what you like – that and summer movies leave room for little else
- Technology report: New bots, new prefs
- Recent research: Different Wikipedias use different images; editing contests more successful than edit-a-thons
- Humour: It's all the same
- Essay: Wikipedia does not need you
Bots Newsletter, August 2018
Bots Newsletter, August 2018 | |
---|---|
Greetings! Here is the 6th issue of the Bots Newsletter. You can subscribe/unsubscribe from future newsletters by adding/removing your name from this list. Highlights for this newsletter include:
As of writing, we have...
Also
These are some of the discussions that happened / are still happening since the last Bots Newsletter. Many are stale, but some are still active.
Thank you! edited by: Headbomb 15:04, 18 August 2018 (UTC) (You can subscribe or unsubscribe from future newsletters by adding or removing your name from this list.) |
August 29: WikiWednesday Salon and Skill-Share NYC
Wednesday August 29, 7pm: WikiWednesday Salon and Skill-Share NYC | |
---|---|
You are invited to join the Wikimedia NYC community for our monthly "WikiWednesday" evening salon (7-9pm) and knowledge-sharing workshop at Babycastles gallery by 14th Street / Union Square in Manhattan. Is there a project you'd like to share? A question you'd like answered? A Wiki* skill you'd like to learn? Let us know by adding it to the agenda. We will also follow up on plans for recent and upcoming edit-a-thons, museum and library projects, education initiatives, and other outreach activities.
We especially encourage folks to add your 5-minute lightning talks to our roster, and otherwise join in the "open space" experience! Newcomers are very welcome! Bring your friends and colleagues! --Pharos (talk) 00:14, 29 August 2018 (UTC) |
(You can subscribe/unsubscribe from future notifications for NYC-area events by adding or removing your name from this list.)
The Signpost: 30 August 2018
- From the editor: Today's young adults don't know a world without Wikipedia
- News and notes: Flying high; low practice from Wikipedia 'cleansing' agency; where do our donations go? RfA sees a new trend
- In the media: Quicksilver AI writes articles
- Discussion report: Drafting an interface administrator policy
- Featured content: Featured content selected by the community
- Special report: Wikimania 2018
- Traffic report: Aretha dies – getting just 2,000 short of 5 million hits
- Technology report: Technical enhancements and a request to prioritize upcoming work
- Recent research: Wehrmacht on Wikipedia, neural networks writing biographies
- Humour: Signpost editor censors herself
- From the archives: Playing with Wikipedia words
Earwig Bot!
Heya, thanks for all the things ya do! I noticed the AfC bot is on strike. Hopefully y'all can settle this labor dispute :D I was gonna tinker with the bot run setting thingy, but didn't wanna bork it. Anywho, thanks in advance! Drewmutt (^ᴥ^) talk 17:27, 7 September 2018 (UTC)
- Thanks for letting me know, Drewmutt. I restarted him and he should be back to working now after a short delay. — Earwig talk 00:03, 8 September 2018 (UTC)
- Seems it is doing something unusual at Template:AFC statistics. Curb Safe Charmer (talk) 17:01, 10 September 2018 (UTC)
- @Curb Safe Charmer: what do you mean? — Earwig [alt] talk 18:09, 10 September 2018 (UTC)
- Yes, quite odd indeed.. here's how it looks to me.. Drewmutt (^ᴥ^) talk 19:10, 10 September 2018 (UTC)
- That is, unfortunately, expected behavior. The backlog is large enough that the status page is too long for MediaWiki to render all of it. We need more reviewers! — Earwig [alt] talk 23:15, 10 September 2018 (UTC)
- Dang. Well, until backlog drive season, can we make it simply link to the draft as opposed to having a somewhat useless invoke tag? Not sure if this helps the issue, or if that's even feasible. Drewmutt (^ᴥ^) talk 00:01, 13 September 2018 (UTC)
- I don't recommend that. It's not easy to tell in advance where the cutoff point is. For what it's worth, we're only losing about 15% of the page, and probably a fair bit of that are drafts that have already been declined/accepted. If you really want a list of every draft, there's always CAT:PEND. By the way, I've wanted to move the status page to Labs for a while so we don't need to deal with rendering it on-wiki, but I haven't had the time/desire to make that change yet. — Earwig talk 00:29, 13 September 2018 (UTC)
September 26: WikiWednesday Salon / Wikimedia NYC Annual Meeting
Wednesday September 26, 7pm: WikiWednesday Salon / Wikimedia NYC Annual Meeting | |
---|---|
You are invited to join the Wikimedia NYC community for our monthly "WikiWednesday" evening salon (7-9pm) and knowledge-sharing workshop at Babycastles gallery by 14th Street / Union Square in Manhattan. Is there a project you'd like to share? A question you'd like answered? A Wiki* skill you'd like to learn? Let us know by adding it to the agenda. This month will also feature on our agenda, upcoming editathons, the organization's Annual Meeting, and Chapter board elections - you can add yourself as a candidate. We will include a look at the organization and planning for our chapter, and expanding volunteer roles for both regular Wikipedia editors and new participants. We will also follow up on plans for recent and upcoming edit-a-thons, museum and library projects, education initiatives, and other outreach activities.
We especially encourage folks to add your 5-minute lightning talks to our roster, and otherwise join in the "open space" experience! Newcomers are very welcome! Bring your friends and colleagues! --Pharos (talk) 20:44, 20 September 2018 (UTC) |
(You can subscribe/unsubscribe from future notifications for NYC-area events by adding or removing your name from this list.)
Copyvio Detector
Hi Ben; it seems that people using your Copyvio Detector are occasionally too quickly jumping to the conclusion that a Wikipedia article must have been taken from some external site when it's in fact the other way round. The text of Wikipedia articles that have been around for some time might appear on many websites, sometimes lacking appropriate attribution. So I wonder whether you might consider adding a caveat to the page of your tool - something like: "If the Wikipedia article was created some time ago, please check whether similar content on other websites might be based on the Wikipedia article before assuming a copyright violation on Wikipedia's side"? Gestumblindi (talk) 11:48, 29 September 2018 (UTC)
- That's reasonable, Gestumblindi, I'll add something similar. — Earwig talk 17:37, 29 September 2018 (UTC)
The Signpost: 1 October 2018
- From the editor: Is this the new normal?
- News and notes: European copyright law moves forward
- In the media: Knowledge under fire
- Discussion report: Interface Admin policy proposal, part 2
- Arbitration report: A quiet month for Arbcom
- Technology report: Paying attention to your mobile
- Gallery: A pat on the back
- Recent research: How talk page use has changed since 2005; censorship shocks lead to centralization; is vandalism caused by workplace boredom?
- Humour: Signpost Crossword Puzzle
- Essay: Expressing thanks
The Signpost: 28 October 2018
- From the editors: The Signpost is still afloat, just barely
- News and notes: WMF gets a million bucks
- In the media: Bans, celebs, and bias
- Discussion report: Mediation Committee and proposed deletion reform
- Traffic report: Unsurprisingly, sport leads the field – or the ring
- Technology report: Bots galore!
- Special report: NPP needs you
- Special report 2: Now Wikidata is six
- In focus: Alexa
- Gallery: Out of this world!
- Recent research: Wikimedia Commons worth $28.9 billion
- Humour: Talk page humour
- Opinion: Strickland incident
- From the archives: The Gardner Interview
Copyvio tool downtime
Hey Earwig, just wanted to let you know that Earwig's Copyvio Detector wasn't working for about half a day due to an issue with Google. It has been resolved and is working again. Sorry for the inconvenience. Kaldari (talk) 19:19, 31 October 2018 (UTC)
- Got it, thanks for letting me know. — Earwig [alt] talk 21:16, 31 October 2018 (UTC)
ArbCom 2018 election voter message
Hello, The Earwig. Voting in the 2018 Arbitration Committee elections is now open until 23.59 on Sunday, 3 December. All users who registered an account before Sunday, 28 October 2018, made at least 150 mainspace edits before Thursday, 1 November 2018 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.
The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.
If you wish to participate in the 2018 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 19 November 2018 (UTC)
ZackBot 12
Regarding ZackBot 12, and ZackBot, how do I go about getting the bot flag on that account? --Zackmann (Talk to me/What I been doing) 18:58, 19 November 2018 (UTC)
- @Zackmann08: You should already have a bot flag on that account? It's been flagged since 2016. — Earwig talk 01:54, 20 November 2018 (UTC)
- Hmm... How do I get my edits tagged with the bot flag then? --Zackmann (Talk to me/What I been doing) 01:57, 20 November 2018 (UTC)
- @Zackmann08: Oh. You need to send a special parameter with each edit for the flag to be used. Your bot framework should have an option for it (if you’re using one). The raw API parameter is just “&bot=true” I think. — Earwig [alt] talk 17:06, 20 November 2018 (UTC)
- I tried that a while ago and got an error message that I needed to have the param assigned to my account. I'll re-investigate. :-) Thanks! --Zackmann (Talk to me/What I been doing) 17:40, 20 November 2018 (UTC)
- Also, when you get a chance, would love input on Wikipedia:Bots/Requests for approval/ZackBot 13. :-) --Zackmann (Talk to me/What I been doing) 20:09, 20 November 2018 (UTC)
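Note on the bot flag discussion above: a minimal sketch of what "sending a special parameter with each edit" can look like when the request body is built by hand. It assumes the MediaWiki Action API's `action=edit` module and its boolean `bot` parameter; the helper name `build_edit_params`, the page title, and the token value are hypothetical placeholders, and nothing is sent over the network. As the thread notes, the account also needs the bot right for the flag to be accepted.

```python
from urllib.parse import urlencode


def build_edit_params(title, text, summary, csrf_token, mark_as_bot=True):
    """Build the POST fields for an action=edit request (hypothetical helper)."""
    params = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "summary": summary,
    }
    if mark_as_bot:
        # Boolean API parameters count as set whenever they are present,
        # so the key is omitted entirely when the flag is not wanted.
        params["bot"] = "1"
    # The token is added last, which is the order the API documentation suggests.
    params["token"] = csrf_token
    return params


# Placeholder values only; a real token comes from the API after logging in.
flagged = build_edit_params("Sandbox", "Hello", "test edit", "PLACEHOLDER")
unflagged = build_edit_params("Sandbox", "Hello", "test edit", "PLACEHOLDER",
                              mark_as_bot=False)
body = urlencode(flagged)
print(body)
```

Most bot frameworks expose the same switch as an option on their save or edit call, so building the parameters by hand is only needed for raw API clients.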
Template:Lc and Template:Lc1 merge
I'm wondering if you can provide some background on Template:Cfd2/sandbox? CfD is now bizarrely using a monospaced version at 110% size with a hyphen instead of the normal Template:Lc. The change proposed at the sandbox seems a great idea. --Bsherr (talk) 19:25, 26 September 2018 (UTC)
- Hi Bsherr, unfortunately, I have no recollection of that edit! It seems the change to make the text larger was done here, so you would probably want to ask Redrose64 before undoing that, but the hardcoding of monospace instead of the normal font has been in place for a long time. I'm not sure why, nor do I have a strong preference either way. — Earwig talk 02:15, 27 September 2018 (UTC)
- Thanks for the advice. I'm going to propose a change to just use Template:Lc or, in the alternative, to eliminate the monospaced font in favor of increasing the kerning. I'll let you know when I post should you like to comment. --Bsherr (talk) 21:54, 28 September 2018 (UTC)
- Done. The discussion is at Wikipedia:Templates for discussion/Log/2018 November 23#Template:Lc1. --Bsherr (talk) 21:50, 23 November 2018 (UTC)
The Signpost: 1 December 2018
- From the editor: Time for a truce
- Special report: The Christmas wishlist
- Discussion report: Farewell, Mediation Committee
- Arbitration report: A long break ends
- Traffic report: Queen reigns for four weeks straight
- Gallery: Intersections
- From the archives: Ars longa, vita brevis
December 19: WikiWednesday Salon and Skill-Share NYC
December 19, 7pm: WikiWednesday Salon and Skill-Share NYC | |
---|---|
You are invited to join the Wikimedia NYC community for our monthly "WikiWednesday" evening salon (7-9pm) and knowledge-sharing workshop at Fordham University's Lincoln Center campus in Manhattan, near Columbus Circle. Is there a project you'd like to share? A question you'd like answered? A Wiki* skill you'd like to learn? Let us know by adding it to the agenda. We will also follow up on plans for recent and upcoming edit-a-thons, museum and library projects, education initiatives, and other outreach activities.
We especially encourage folks to add your 5-minute lightning talks to our roster, and otherwise join in the "open space" experience! Newcomers are very welcome! Bring your friends and colleagues! --Wikimedia New York City Team 03:23, 13 December 2018 (UTC) |
(You can subscribe/unsubscribe from future notifications for NYC-area events by adding or removing your name from this list.)
The Signpost: 24 December 2018
- From the editors: Where to draw the line in reporting?
- News and notes: Some wishes do come true
- In the media: Political hijinks
- Discussion report: A new record low for RfA
- WikiProject report: Articlegenesis
- Arbitration report: Year ends with one active case
- Traffic report: Queen dethroned by U.S. presidents
- Gallery: Sun and Moon, water and stone
- Blog: News from the WMF
- Humour: I believe in Bigfoot
- Essay: Requests for medication
- From the archives: Compromised admin accounts – again
An email I sent.....
Message added 18:02, 2 January 2019 (UTC). It may take a few minutes from the time the email is sent for it to show up in your inbox. You can remove this notice at any time by removing the {{You've got mail}} or {{ygm}} template.
— fr ❄ 18:02, 2 January 2019 (UTC)
- Replied. — Earwig talk 06:58, 3 January 2019 (UTC)
Copyvios
Copyvios is currently down, the connection times out. Is this related to the new workers? Best regards, Luke081515 01:19, 19 January 2019 (UTC)
Copyvio tool
Hi Earwig, your API documentation for the tool mentions that there is a global limit of 1000 requests using the search engine. I want to continue the task merlbot did until 2016, checking all new articles in dewiki for copyvios. From the statistics I calculated that this is around 300 articles per day, which is quite a lot. That's why I currently implemented the function without using the search engine (I don't want to consume so much of the limit, which would be bad for other users); however, the tool is much more effective with the search engine. Is there a way to extend the global limit? And is there a way to include Turnitin in the API request as well? I have not found anything in the API documentation about it. P.S.: Please ping me when you reply, I mostly do not look at enwiki. Best regards, Luke081515 02:03, 13 January 2019 (UTC)
- @Luke081515: Unfortunately I do not control the global limit, that's set by Google. However, I think it's fine if you enable the search engine for a while as a test. We can see whether it ends up making too many requests and disable it later if so. I planned to add Turnitin to the API, but haven't gotten around to it. You can access it separately, though; the URL should look like https://tools.wmflabs.org/eranbot/plagiabot/api.py?action=suspected_diffs&page_title=PAGE_TITLE&lang=de&report=1 I think. — Earwig talk 06:06, 13 January 2019 (UTC)
- Ok, thank you. I've now set &use_engine= to true. The bot will check any new articles that are not disambig pages or redirects, and runs every 30 minutes. If it's too much, please ping me and I will disable it again. Best regards, Luke081515 14:49, 13 January 2019 (UTC)
- Is there a way to extend the limit? I'm also planning to check big insertions into dewiki, not only page creations. I know that the limit is on Google's side, and I guess raising it would cost a bit of money. I can imagine that WMF or WMDE would support this; can you tell me who your current contact at WMF is concerning the Google API? Best regards, Luke081515 00:04, 20 January 2019 (UTC)
- That would be User:Kaldari. I’m fairly certain that there is no way to raise the limit, based on previous attempts to do so. You should try to tune down the request rate if we’re hitting it too frequently. Maybe there are some simple heuristics you can apply to ignore certain pages? — Earwig [alt] talk 00:17, 20 January 2019 (UTC)
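As a rough sketch of the two technical points in this thread: the snippet below builds the Turnitin (plagiabot) URL in the form quoted above and estimates what share of the 1000-request limit a given workload would consume. The function names and the `queries_per_check` parameter are hypothetical. The thread does not say how many search queries a single check uses, so the default of 1 is only an assumption.

```python
from urllib.parse import urlencode

# Base URL taken from the reply above; everything else here is illustrative.
PLAGIABOT_API = "https://tools.wmflabs.org/eranbot/plagiabot/api.py"


def suspected_diffs_url(page_title, lang="de"):
    """Build a query URL in the shape quoted in the thread."""
    params = {
        "action": "suspected_diffs",
        "page_title": page_title,
        "lang": lang,
        "report": 1,
    }
    # urlencode escapes the title and keeps the dict's insertion order.
    return PLAGIABOT_API + "?" + urlencode(params)


def estimated_share(checks_per_day, queries_per_check=1, daily_limit=1000):
    """Fraction of the daily limit used (queries_per_check is an assumption)."""
    return checks_per_day * queries_per_check / daily_limit


if __name__ == "__main__":
    print(suspected_diffs_url("Berlin"))
    # About 300 new dewiki articles per day, as estimated in the thread.
    print(estimated_share(300))
```

If a single check turns out to issue several search queries, the share grows proportionally, which is one reason to filter out pages before checking them.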
Possible copyvio tool bug report
I tried to check the page Dorothy Misener Jurney using Earwig's Copyvio Detector with its default settings, to examine URLs listed in the article. It reported about 2.0% violations. HOWEVER it didn't actually check one of the sources cited, https://shsmo.org/manuscripts/descriptions/womenmedia/essays/names/j/jurney/ If I tell it explicitly to do a URL comparison to that citation, I get a > 64% violation rate. I'm working on cleaning up the article, but I'm concerned that the URL didn't get checked initially. Mary Mark Ockerbloom (talk) 02:08, 21 January 2019 (UTC)
- Thanks for the bug report, Mary Mark Ockerbloom. It looks like that URL is causing the tool some trouble. The first time you ran the check, that page timed out before it could return any data, which gets shown as "0%". But when you did the direct comparison, it loaded fine, showing the potential match. Unfortunately there's not much we can do about this kind of situation, though I suppose the tool could indicate that error more clearly. — Earwig talk 03:23, 21 January 2019 (UTC)
- I would strongly encourage a clear and visible distinction between "0%" meaning "No copyvios found" and some other marker to indicate the page could not be examined... Thanks for your work on this tool. I've found it really useful & keep it bookmarked :-) Mary Mark Ockerbloom (talk) 03:38, 21 January 2019 (UTC)
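One way to picture the distinction requested here is to store "could not be examined" separately from a real score of zero. The sketch below is not the tool's actual code; `SourceResult` and `format_result` are hypothetical names, and `None` is used as the assumed marker for a source that timed out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceResult:
    url: str
    # None means the source could not be fetched (e.g. a timeout),
    # which is different from a genuine 0.0 similarity score.
    confidence: Optional[float]


def format_result(result):
    """Render a result so that an unchecked source never looks like 0%."""
    if result.confidence is None:
        return f"{result.url}: not checked (timed out)"
    return f"{result.url}: {result.confidence:.1%}"


if __name__ == "__main__":
    print(format_result(SourceResult("https://example.org/a", None)))
    print(format_result(SourceResult("https://example.org/b", 0.0)))
    print(format_result(SourceResult("https://example.org/c", 0.64)))
```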
Problem in City of Stonnington#Buses and probably other locations
@The Earwig: You appear to have a past connection with the template MetlinkBus, which appears to have been renamed PTVBus in 2015 by yourself - not a problem in itself. In City of Stonnington#Buses there are six bus routes which use this template, with route 734 still working OK but the other 5 routes 624, 612, 623, 767 and 822 no longer working. Earlier this week the PTV put up a new version of their website [1] where a lot of earlier links are no longer working. These routes may have stopped working because of this, or possibly the data changed earlier, as I had not looked at this article before today. Can you be of any assistance in this area? Fleet Lists (talk) 07:18, 25 January 2019 (UTC)
- I think I have solved the problem. I will try to fix it and let you know how I go. Fleet Lists (talk) 07:54, 25 January 2019 (UTC)
- I have made some changes to Module:PTVBus/data which seem to have solved the problem. I found another article which has a large number of this type of error but that will need to wait until another day to fix those. I was surprised to find that that module had not had changes made to it since late 2015.Fleet Lists (talk) 08:15, 25 January 2019 (UTC)
- OK. This was a while ago and I don’t remember the situation, so your guess is as good as mine as to what needs to be done here. Glad to hear you’ve mostly figured it out. — Earwig [alt] talk 14:18, 25 January 2019 (UTC)
Rejected AFC submissions and AFC statistics
Last year a new AFC review result, "rejected", was introduced. It is more severe, more final, than "declined" in that it doesn't give the submitter a path to improve and resubmit the draft. (A random example is User:Naveengrande/sandbox.)
Now that the reject option is being used, questions are arising about when it should be used, how much it's being used, whether it's being used properly, etc.
EarwigBot shows recently rejected submissions the same way as recently declined ones on Template:AFC statistics. It would be useful if one could distinguish the rejects on that page. Perhaps EarwigBot could display them in a different section from "declined", or with "rejected" in the notes column. Is something like that an enhancement you'd be willing to make? --Worldbruce (talk) 15:37, 28 January 2019 (UTC)
- @Worldbruce: Thanks for the suggestion and for letting me know about the new status. I added 'rejected' as a note for the declined section. It will take a while for the whole table to update, but freshly declined submissions should have it starting now. — Earwig talk 03:49, 29 January 2019 (UTC)
The Signpost: 31 January 2019
- Op-Ed: Random Rewards Rejected
- News and notes: WMF staff turntable continues to spin; Endowment gets more cash; RfA continues to be a pit of steely knives
- Discussion report: The future of the reference desk
- Featured content: Don't miss your great opportunity
- Arbitration report: An admin under the microscope
- Traffic report: Death, royals and superheroes: Avengers, Black Panther
- Technology report: When broken is easily fixed
- News from the WMF: News from WMF
- Recent research: Ad revenue from reused Wikipedia articles; are Wikipedia researchers asking the right questions?
- Essay: How
- Humour: Village pump
- From the archives: An editorial board that includes you
definitions.net
Hey,
You might want to look into adding definitions.net onto the Wikipedia mirror list, I've been going through Category:Articles with improper non-free content and quite a few of them, after looking at various archives, appear to be copied from Wikipedia, generating false copyvio reports.
Thanks,
SITH (talk) 16:30, 8 February 2019 (UTC)
- Thanks for the suggestion. Added. — Earwig talk 23:23, 10 February 2019 (UTC)
Copyvio Detector
Hi, I am unable to access the Copyvio Detector. It shows "502 Bad Gateway" and "The server timed out". Please fix it. I think the main problem is that the server is getting slow. Xain36 (talk) 08:18, 16 February 2019 (UTC)
- Please look two threads up. — Earwig talk 17:12, 16 February 2019 (UTC)
Copyvio Detector not working
Hi Ben, the copyvio detector quit working a couple hours ago, with the page failing to load but not timing out. If I let it spin long enough it shows a 502 Bad Gateway. Any assistance you can offer to get it working again would be most appreciated. Thanks, — Diannaa 🍁 (talk) 23:05, 1 February 2019 (UTC)
- It's working again! in fact it's zippy and full of pep. Thank you, — Diannaa 🍁 (talk) 01:04, 2 February 2019 (UTC)
- Well, I see some bizarre errors in the log that I've never seen before, like we're running out of memory. I'll see if I can defend against this for the future. — Earwig talk 01:16, 2 February 2019 (UTC)
- Hi Ben, the copyvio detector is not working. I'm not sure how long it's been down; it failed to load on my first attempt to use it this morning and it's been down for at least half an hour. Any assistance would be appreciated. Thanks, — Diannaa 🍁 (talk) 13:14, 13 February 2019 (UTC)
- I kicked it, think it's OK now. This looks like the same issue as before. Didn't have a chance to investigate then, but I'll try to do it later when I have some free time. — Earwig talk 13:42, 13 February 2019 (UTC)
- Thanks so much Ben. I don't know how I ever got along without this tool, so helpful for copyright cleanup. — Diannaa 🍁 (talk) 13:49, 13 February 2019 (UTC)
- Hi Ben. The page is once again failing to load :/ Could you please take a look? Thanks, — Diannaa 🍁 (talk) 02:36, 15 February 2019 (UTC)
- It looks like the bot is running - do you just mean the webpage? — xaosflux Talk 02:50, 15 February 2019 (UTC)
- There's two different tools. The reason I posted here is because Earwig's copyvio detector tool is not working. It spins for a while and then produces a 502 Bad Gateway. Eran's CopyPatrol is also failing to load; the last time I was able to use the page properly was at around 03:02 UTC. — Diannaa 🍁 (talk) 03:45, 15 February 2019 (UTC)
- This time it’s definitely not my fault! Toolforge has been experiencing an unlikely combination of issues that would bring down most tools using a database for anything. That’s presumably why CopyPatrol was affected too. I’m not sure when things will fully stabilize. I will kick it in a little bit, but I don’t know how long that will last. — Earwig [alt] talk 12:40, 15 February 2019 (UTC)
- Thanks. I have some cases that will be impossible to solve without your tool, and not having it triples the time it takes to do the checks, so anything you can do to keep it working in the interim would be appreciated. — Diannaa 🍁 (talk) 14:39, 15 February 2019 (UTC)
(←) Just following up. Unfortunately, things on Labs are in even worse shape now, and there doesn't seem to be anything I can do to fix it myself. Will continue to keep an eye out, but I think I just have to wait for now. — Earwig talk 03:31, 16 February 2019 (UTC)
- Just a "thanks" for writing and supporting this tool. I turned to it today for a DYK check ... hope it's back soon! ☆ Bri (talk) 17:17, 16 February 2019 (UTC)
- Update: The issues will likely not be resolved until Tuesday at the earliest. — Diannaa 🍁 (talk) 17:34, 16 February 2019 (UTC)
- Well, I rewrote the tool to remove the dependency on the broken part of Toolforge. We seem to be OK for now. Since I'm not sure how this change will affect performance in general, I will continue to monitor things throughout the day. — Earwig talk 19:26, 16 February 2019 (UTC)
Forbidden error on earwig
Hi, I keep getting:
An error occurred while using the search engine (Google Error: HTTP Error 403: Forbidden). Try reloading the page. If the error persists, repeat the check without using the search engine.
When using Earwig's copyvio tool.
Any advice,
RhinosF1(chat)(status)(contribs) 21:48, 24 February 2019 (UTC)
- There's a daily limit on the number of searches with Google that was exceeded. It will reset at midnight. — Earwig talk 22:04, 24 February 2019 (UTC)
- Thanks, RhinosF1(chat)(status)(contribs) 22:12, 24 February 2019 (UTC)
- RhinosF1, I think it's at Midnight Pacific time, where Google's servers are located. — Diannaa 🍁 (talk) 00:55, 25 February 2019 (UTC)
Quote Box
I have only used this tool recently and it seems great. Can I comment that it does not seem to identify content within Template:Quote box in the article compare pane, giving an increased risk of false positives unless the article is checked? If it is not possible to do this, would it be advisable to indicate to users that they need to check this manually? Thank you. Djm-leighpark (talk) 18:27, 25 February 2019 (UTC)
- @Djm-leighpark: That's strange, because I thought it did look inside quote boxes. Do you have an example page? I tried in my sandbox and it seems to work. — Earwig talk 02:28, 26 February 2019 (UTC)
- The 18:48 version of this page ... to be absolutely clear, it matches the text of the quote in red; however, in the left compare pane the user (i.e. the person running the tool) cannot see that it is inside a quote (without looking at the article). The issue is with the quote One of my proudest moments ... America (by P. R. Brown) not being easily identifiable as a quote in the left-hand pane. Hope it makes sense what I am trying to say. Thank you. Djm-leighpark (talk) 03:21, 26 February 2019 (UTC)
- Oh, I see, you're saying that the text inside the quote box is not identified as being part of a quote. That's true. I think this falls under the general disclaimer that all results from the tool need to be manually reviewed. False positives can also come from inline quotes in the article text as well as things like book titles and long proper nouns, and detecting these would be difficult. — Earwig talk 03:48, 26 February 2019 (UTC)
- That's fair enough. I do wonder if the emphasis on the tool initiation page of "Be aware that other websites can copy from Wikipedia, so check the results carefully, especially for older or well-developed articles", without mention of doing a manual check of the results for quotes, can be misleading ... perhaps especially with articles such as Dead to the World Tour and this source: https://www.revolvermag.com/culture/marilyn-mansons-antichrist-superstar-story-behind-album-cover-art. It's just a thought from a user. One other thought would be to change the Submit button from active once the tool is launched ... I've now got used to looking for the spinning working icon from the Chrome browser, but an active-looking Submit button holds my eye and I am so tempted to press it again! Just a couple of thoughts. Thank you. Djm-leighpark (talk) 04:19, 26 February 2019 (UTC)
- Those are reasonable suggestions, thank you. I'll see what I can do. — Earwig talk 02:05, 27 February 2019 (UTC)
Video tutorial regarding Wikipedia referencing with VisualEditor
Hi, I have received a grant from WMF to support production of a video tutorial regarding creating references with VisualEditor. I anticipate that the video will be published in March 2019. If this tutorial is well received then I may produce additional tutorials in the future for English Wikipedia and possibly other projects such as Commons and Spanish Wikipedia. If you would like to receive notifications on your talk page when drafts and finished products from this project are ready for review, then please sign up for the project newsletter.
Regards, --Pine✉ 00:30, 28 February 2019 (UTC)
The Signpost: 28 February 2019
- From the editors: Help wanted (still)
- News and notes: Front-page issues for the community
- Discussion report: Talking about talk pages
- Featured content: Conquest, War, Famine, Death, and more!
- Arbitration report: A quiet month for Arbitration Committee
- Traffic report: Binge-watching
- Technology report: Tool labs casters-up
- Gallery: Signed with pride
- From the archives: New group aims to promote Wiki-Love
- Humour: Pesky Pronouns
Project Tagging based on Category
Hi. I know that quite a few pages that should be tagged with the Children's Lit WikiProject banner lack them. I was wondering if articles lacking the project banner in the following two categories (inclusive) could be tagged: Category:Children's literature and Category:Young adult novels? Best, Barkeep49 (talk) 02:00, 28 December 2018 (UTC)
- From a cursory look, this should be possible. I'll let you know when I start/finish the task, or if I have any questions before I start, probably within the next couple days. — Earwig talk 03:11, 28 December 2018 (UTC)
- Just checking in on this. Thanks and Best, Barkeep49 (talk) 02:19, 13 January 2019 (UTC)
- Apologies for the delay, I had to do some work to migrate the bot to a new backend on Toolforge. I'll try to start this when I come home from work tomorrow. — Earwig talk 07:45, 14 January 2019 (UTC)
- @Barkeep49: Here's the full list of categories the bot will process (all subcategories recursively of those two you mentioned): User:The Earwig/Sandbox/Children's Lit. Can you help me look through this and remove anything that doesn't belong? It seems mostly OK, but there are some things I imagine we don't want to tag, like anything including "video game"... — Earwig talk 07:56, 15 January 2019 (UTC)
- I chopped a few hundred from the list - the project has generally covered derivative properties to some extent, so when that connection felt strong I left it, but when it got too far away from the original book (or if it was not a literary property to begin with), I removed it. I also removed many of the comic/manga categories as only a smaller percentage of those would be covered in our scope - the intended audience would have to be children or young adults, which is not the case for a substantial percentage of comics/manga. Let me know if you have any other questions and thank you for your ongoing help with this. Best, Barkeep49 (talk) 18:04, 15 January 2019 (UTC)
- Excellent, that's exactly what I needed. The bot is running now. — Earwig talk 03:18, 16 January 2019 (UTC)
- Thanks. I'm abashed to admit I already knew this because an article on my watchlist got the banner... Thanks for all your assistance. Best, Barkeep49 (talk) 05:49, 16 January 2019 (UTC)
- @Barkeep49: I paused the task until I get home and can look a bit more carefully. I see we’ve been tagging films based on children’s books (see the bot’s recent contribs); I understand the consideration for derivative works, but do you think the relationships are clear enough in general to tag automatically? — Earwig [alt] talk 22:34, 16 January 2019 (UTC)
Hello! I'm curious as to why Don Paterson has been tagged with the Children's Literature project banner. I don't associate him with children's literature, and nothing in the article or its categories seems to support this. Am I missing something obvious? --Deskford (talk) 20:41, 16 January 2019 (UTC)
- @Deskford: The connection comes from the Costa Book Awards; he is in the category of winners, which is in a category of children’s literary awards. This is an incorrect relationship, as the CBA does not look exclusive to children’s literature. I’ll correct this when I get home. — Earwig [alt] talk 21:41, 16 January 2019 (UTC)
- Ah, that makes sense. Thanks! --Deskford (talk) 21:54, 16 January 2019 (UTC)
- I've recently reverted EarwigBot's edits to Talk:Tommen Baratheon, Talk:Arya Stark, Talk:Bran Stark, and Talk:Rickon Stark, edits that added a WikiProject Children's Literature banner to the talk page. While the characters are children, A Song of Ice and Fire is definitely not children's literature, so I'm wondering why this happened. --TedEdwards 21:19, 16 January 2019 (UTC)
- @TedEdwards: Thank you for pointing that out. This is coming from Category:Child characters in literature, which is in Category:Children's literature, a clearly incorrect relationship. We’ll fix this. — Earwig [alt] talk 21:41, 16 January 2019 (UTC)
- Earwig anything I can do to be of assistance at this point? Best, Barkeep49 (talk) 02:06, 17 January 2019 (UTC)
- @Barkeep49: See my comment above in case it got lost; I think we should be a little more careful with the categories that pertain to derivative works like films. While some of those works might be in scope, there's a high enough false-positive rate that I don't think a bot determination is safe. If we pare down the list a bit more, I'll feel more comfortable restarting the task. I can also have the bot revert its taggings for certain categories that we decide were mistakes (like a couple of the ones mentioned above)—this has happened before, so I'm somewhat used to it and it's not a problem. — Earwig talk 03:25, 17 January 2019 (UTC)
- Just an update that this newsletter has been requested to go out and so hopefully I'll be able to get some help with this update soon. Best wishes, Barkeep49 (talk) 18:17, 13 February 2019 (UTC)
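The mis-taggings above come from walking subcategories recursively: one incorrect parent relationship pulls in everything beneath it. A minimal sketch of that traversal with an exclusion list follows. The category graph is a toy, hand-written dictionary rather than data fetched from Wikipedia, the "awards" and "winners" category names are invented stand-ins for the case described above, and `collect_subcategories` is a hypothetical helper, not EarwigBot's code.

```python
from collections import deque


def collect_subcategories(root, children, excluded=frozenset()):
    """Breadth-first walk of a category tree, skipping excluded branches."""
    seen = {root}
    queue = deque([root])
    while queue:
        category = queue.popleft()
        for sub in children.get(category, []):
            # Skip already-visited categories (cycles) and excluded ones;
            # an excluded category's descendants are never reached through it.
            if sub in seen or sub in excluded:
                continue
            seen.add(sub)
            queue.append(sub)
    return seen


if __name__ == "__main__":
    # Toy graph: parent category -> direct subcategories (illustrative only).
    toy_graph = {
        "Children's literature": [
            "Child characters in literature",
            "Children's literary awards (toy)",
        ],
        "Children's literary awards (toy)": ["Award winners (toy)"],
    }
    print(collect_subcategories("Children's literature", toy_graph))
    print(collect_subcategories(
        "Children's literature",
        toy_graph,
        excluded={"Child characters in literature"},
    ))
```

Excluding a category at traversal time only helps for future runs; pages already tagged through that branch still have to be reverted separately, as discussed above.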
@Barkeep49: So, I finished going through the bot's tagging and have reverted what I consider mistagged (by category, primarily non-written works or people/books with only dubious connections to children). This leaves about 4000 of the original 5000 taggings (for the first half of the category list). While idly spot-checking afterwards, I found unreverted yet questionable examples like Rush Limbaugh and Laura Bush that came from a category I hadn't thought to re-check: American children's writers. The problem is that often cats are used for non-defining classification, which isn't necessarily unreasonable—those people have published books for children—but I think you would agree that they aren't well known enough for that to place them within the project's scope? Maybe I am wrong, but it's enough that I'm nervous to rerun the bot, even on the new doubly reduced list. Hmm... — Earwig talk 04:47, 4 March 2019 (UTC)
- The Earwig, I would agree we shouldn't have Rush Limbaugh and Laura Bush tagged, and the issue of people who've sometimes written for children but not always (e.g. Gaiman) certainly caused concern the first time through. Where does that leave things then? Best, Barkeep49 (talk) 04:53, 4 March 2019 (UTC)
- I'm not sure. Some cats in the list should definitely be fine, if they exclusively contain in-scope works of literature, like Polish children's novels. I don't have a problem running the bot on these. In contrast, I don't feel comfortable running "Works based on"-type categories because these are often in other genres and only tenuously related (and the pages that are in-scope usually fall under another category anyway), so I'll probably remove these. Unfortunately that still leaves about 2/3 of the list. I'm not sure what to do with articles about people, which is a large number of them. I'm wondering if there is a reliable semi-automated test to decide whether a person is in-scope? I'm thinking of looking to see whether the article lead mentions "children", but I'm not sure how well this will work. — Earwig talk 05:05, 4 March 2019 (UTC)
- The Earwig, for the categories which are troublesome, are you able to just have the bot log where it would tag? I would then go through and remove the big red flags. In spot checking the first 50 A's in that category the hit rate was very high (the only possible question marks would be Britt Allcroft, E.J. Altbacker and Aubrey Ankrum, and no clear-cut nos like Limbaugh or Bush). Now that's for everyone, so it includes people already tagged. Presumably the error rate for untagged people would be higher, but in an essential category like American children's writers I really am wondering if it would be within a margin the project would find OK, especially as they will get rated (most of the activity that happens on the project is article assessment at the moment). Best, Barkeep49 (talk) 05:24, 4 March 2019 (UTC)
- I can definitely do that. I'll follow up over the next day or so. — Earwig talk 05:26, 4 March 2019 (UTC)
- Sorry that took so long, Barkeep49. I updated User:The Earwig/Sandbox/Children's Lit with the full list of untagged/unprocessed pages after running the bot through another 50 categories. — Earwig talk 07:49, 17 March 2019 (UTC)
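The "does the lead mention children" test floated in this thread could look roughly like the sketch below. It is only an illustration of the idea, with hypothetical function names, and it treats everything before the first level-2 heading of the wikitext as the lead. As noted above, it is unclear how well such a test would work in practice.

```python
import re

# A level-2 wikitext heading such as "== Career ==" on its own line.
HEADING = re.compile(r"^==[^=].*?==\s*$", flags=re.MULTILINE)
CHILDREN = re.compile(r"\bchildren\b", flags=re.IGNORECASE)


def lead_section(wikitext):
    """Return the text before the first level-2 heading (the assumed lead)."""
    match = HEADING.search(wikitext)
    return wikitext[:match.start()] if match else wikitext


def lead_mentions_children(wikitext):
    """True if the lead contains the word "children" (also "children's")."""
    return CHILDREN.search(lead_section(wikitext)) is not None


if __name__ == "__main__":
    author = "'''Jane Doe''' is an author of children's books.\n\n== Career ==\nText."
    host = "'''John Roe''' is a radio host.\n\n== Books ==\nHe wrote for children."
    print(lead_mentions_children(author))
    print(lead_mentions_children(host))
```

A page that fails the test would not be rejected outright; it could simply be logged for manual review, in line with the logging approach discussed above.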
Nomination for deletion of Template:List of crambid genera
Template:List of crambid genera has been nominated for deletion. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Zackmann (Talk to me/What I been doing) 21:33, 19 March 2019 (UTC)
The Signpost: 31 March 2019
- From the editors: Getting serious about humor
- News and notes: Blackouts fail to stop EU Copyright Directive
- In the media: Women's history month
- Discussion report: Portal debates continue, Prespa agreement aftermath, WMF seeks a rebranding
- Featured content: Out of this world
- Arbitration report: The Tides of March at ARBCOM
- Traffic report: Exultations and tribulations
- Technology report: New section suggestions and sitewide styles
- News from the WMF: The WMF's take on the new EU Copyright Directive
- Recent research: Barnstar-like awards increase new editor retention
- From the archives: Esperanza organization disbanded after deletion discussion
- Humour: The Epistolary of Arthur 37
- Op-Ed: Pro and Con: Has gun violence been improperly excluded from gun articles?
- In focus: The Wikipedia SourceWatch
- Special report: Wiki Loves (50 Years of) Pride
- Community view: Wikipedia's response to the New Zealand mosque shootings
EarwigBot not working
It hasn't edited for 3 days (I noticed it wasn't working when task 3 (creating AfC categories) wasn't running). Just wanted to let you know in case you weren't already aware. Thanks, --DannyS712 (talk) 04:15, 14 April 2019 (UTC)
- Thanks for letting me know. It should be back up now, and I think I've fixed the auto-restart so this should be prevented in the future. — Earwig talk 06:19, 14 April 2019 (UTC)
- @The Earwig: It still hasn't edited yet... --DannyS712 (talk) 06:20, 14 April 2019 (UTC)
- It's not supposed to yet. The AFC status page gets updated hourly, and the category creation runs nightly at 00:00 UTC. — Earwig talk 06:41, 14 April 2019 (UTC)
- Oh, okay. --DannyS712 (talk) 06:43, 14 April 2019 (UTC)
The Signpost: 30 April 2019
- News and notes: An Action Packed April
- In the media: Is Wikipedia just another social media site?
- Discussion report: English Wikipedia community's conclusions on talk pages
- Featured content: Anguish, accolades, animals, and art
- Arbitration report: An Active Arbitration Committee
- Traffic report: Mötley Crüe, Notre-Dame, a black hole, and Bonnie and Clyde
- Technology report: A new special page, and other news
- Gallery: Notre-Dame de Paris burns
- News from the WMF: Can machine learning uncover Wikipedia’s missing “citation needed” tags?
- Recent research: Female scholars underrepresented; whitepaper on Wikidata and libraries; undo patterns reveal editor hierarchy
- From the archives: Portals revisited
ArbCom 2019 special circular
Administrators must secure their accounts
The Arbitration Committee may require a new RfA if your account is compromised.
This message was sent to all administrators following a recent motion. Thank you for your attention. For the Arbitration Committee, Cameron11598 02:49, 4 May 2019 (UTC)
Administrator account security (Correction to Arbcom 2019 special circular)
ArbCom would like to apologise and correct our previous mass message in light of the response from the community.
Since November 2018, six administrator accounts have been compromised and temporarily desysopped. In an effort to help improve account security, our intention was to remind administrators of existing policies on account security — that they are required to "have strong passwords and follow appropriate personal security practices." We have updated our procedures to ensure that we enforce these policies more strictly in the future. The policies themselves have not changed. In particular, two-factor authentication remains an optional means of adding extra security to your account. The choice not to enable 2FA will not be considered when deciding to restore sysop privileges to administrator accounts that were compromised.
We are sorry for the wording of our previous message, which did not accurately convey this, and deeply regret the tone in which it was delivered.
For the Arbitration Committee, -Cameron11598 21:04, 4 May 2019 (UTC)
Question about copyvio detector functioning
Howdy - I just happened upon some startling behaviour in the copyvio detector, and wanted to ask whether this is a known thing or a fluke. Draft:Nathaniel Bartlett comes out squeaky-clean [2], but when running the tool on the identical draft once it was moved to mainspace, it finds the full-page copyvio [3]. The difference here must be the AfC header, I guess... is that known behaviour? If the AfC header has this capacity to throw off copyvio detection, maybe it would be worth thinking about a function to strip it from an article before comparison? After all, AfC is probably one of the heaviest users of the tool - bit of a scary scenario. Cheers --Elmidae (talk · contribs) 22:01, 9 May 2019 (UTC)
- Hi Elmidae. The AfC header does not make a difference here; we already strip out templates from the article text before we start looking for matches. (The exact article text you see on the results page is what we try to find copies of, and in this case, you can see that neither includes the template.) However, there is another difference: the one where we missed the violation has its categories rendered as normal wikilinks (prefixed with colons), and this makes them show up in the article text when normally they wouldn't. Because of an unlucky sequence of events, this is enough for us to fail to find the correct source. If you're interested in a more detailed explanation why, I've written one up below. The main takeaway is that this kind of outcome is always a risk because of how the tool works, though in general it should be uncommon enough that the tool remains useful.
- For the full explanation, I'll need to go into a bit of detail about how the tool finds possible sources. The problem it's trying to solve is that we have a large string of text and we need to query a search engine with that text to search for exact (or very close) matches. We can't paste the entire article into Google, because Google doesn't accept strings that large and it would miss cases where sentences are added or rearranged. Instead, we divide the article into chunks of text (about sentence length, 10-20 words), and search for each chunk independently, the idea being that at least one of them should be a near-verbatim copy of the plagiarized source (if one exists) and will give a hit. However, we can't search for every single chunk in a particular article, because an article might have hundreds of sentence-sized chunks of text, and Google limits the number of searches we can make per day, so we can only make up to 8 searches per article. This means we have the task of selecting about 8 representative sentences from throughout the article in hopes that at least one of them will contain the violation, if one is present. (We do this by picking a sentence from the start of the article, then the end, then the middle, then around the 25% mark, and so on, until we run out of text or reach 8 chunks.) For articles that are heavily copied, the odds of this working out are quite good, but we sometimes get very unlucky, like we did here. Because those wikilinks added text to the end of the article, our algorithm ended up picking 8 chunks, not one of which returned the correct match in Google. (If you're curious what it searched for, I've reproduced it below.)
Extended content
Violation found:
Violation missed:
- Thinking about this further, I believe the tool should have stripped out the disabled category links as well, because despite being "article text", they should not appear in any sources. This is something I can add in the future. However, it's important to keep in mind that because of how the chunking logic works, as long as we don't search for every chunk, and we can't, there's always a chance that we could miss the violation. Something to keep in mind. Thanks. — Earwig talk 04:04, 10 May 2019 (UTC)
- Thank you, that's both informative and interesting! So in essence, it's a bit of potluck of whether a given selection of chunks contains detectable material; and a random frame-shift mutation (e.g. by adding a few lines of category text) may result in a selection that registers entirely differently. That's heuristics for you, I guess :) Cheers --Elmidae (talk · contribs) 14:08, 10 May 2019 (UTC)
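The chunk selection described above (start, then end, then middle, then the 25% mark, and so on, capped at 8 searches) can be illustrated with a minimal Python sketch. This is not the tool's actual code: the function names `position_fractions` and `select_chunks`, the exact order of positions after the 25% mark, and the way a position is rounded to a chunk index are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import count


def position_fractions():
    """Yield relative positions 0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, ...

    Assumed order: start, end, middle, then ever finer subdivisions.
    """
    yield Fraction(0)
    yield Fraction(1)
    for depth in count(1):
        denom = 2 ** depth
        for numer in range(1, denom, 2):
            yield Fraction(numer, denom)


def select_chunks(chunks, max_chunks=8):
    """Pick up to max_chunks chunks spread across the article.

    Hypothetical helper: stops when the search budget is used up
    or when every chunk has already been picked.
    """
    n = len(chunks)
    limit = min(max_chunks, n)
    picked = []
    seen = set()
    for frac in position_fractions():
        if len(picked) >= limit:
            break
        index = int(frac * (n - 1))  # truncate the position to a chunk index
        if index not in seen:
            seen.add(index)
            picked.append(index)
    return [chunks[i] for i in picked]


if __name__ == "__main__":
    article = list(range(100))           # stand-in for 100 sentence-sized chunks
    longer = list(range(104))            # same article plus 4 trailing chunks
    print(select_chunks(article))
    print(select_chunks(longer))
```

With these assumptions, appending only four chunks to a 100-chunk article changes six of the eight selected positions (only the first chunk and the one near the 1/8 mark stay the same), which is the "frame-shift" effect discussed above: a little extra text at the end, such as rendered category links, can move most of the searches onto different sentences.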