User talk:EranBot/Copyright/rc
Improvement requests
April 11, 2016. Please also see the tech community discussion.
Please add suggestions for improvement here. Some already being worked on are:
- Auto archiving of ones that have been addressed
- A third button that returns the item to unfollowed up
- Auto collection of data during the archiving process
- An auto refresh after pressing the button so that the new status is displayed
- Exclude list of facts. Exclude edits that do not contain the top-20 English words by frequency, see [1]. This should exclude basic lists of facts (false positives) that rarely need to use prepositions and conjunctions.
- Patrol draft and user spaces so EranBot can be integrated with current processes in the AfC community (replace CorenSearchBot?)
- Automatically and retroactively remove articles from the list if they have been deleted (e.g., speedy deletion G12s)
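The "exclude list of facts" suggestion could be prototyped roughly as follows. This is only a sketch: the word list is an illustrative set of very frequent English words and may differ from the list linked at [1], and the function name and `min_common` threshold are hypothetical, not part of EranBot.

```python
import re

# Illustrative set of very frequent English words (assumption: the list
# referenced in the suggestion may differ from this one).
COMMON_WORDS = frozenset({
    "the", "be", "to", "of", "and", "a", "in", "that", "have", "i",
    "it", "for", "not", "on", "with", "he", "as", "you", "do", "at",
})

def looks_like_list_of_facts(added_text, min_common=1):
    """Return True if the added text has fewer than `min_common` common-word tokens.

    Such text is probably a bare list of names, dates, or numbers rather
    than running prose, so it could be skipped before the plagiarism check.
    """
    tokens = re.findall(r"[a-z']+", added_text.lower())
    hits = sum(1 for token in tokens if token in COMMON_WORDS)
    return hits < min_common
```

With the default threshold, an edit is excluded only when none of the listed words appears at all, which matches the wording of the suggestion; raising `min_common` would make the filter more aggressive.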
Discussion
- Doc James,
- Archiving: What do you think would be the best approach for archiving:
- Create report pages by day, so that there is no need to archive, e.g. User:EranBot/Copyright/rc/07-04-2015 (where User:EranBot/Copyright/rc shows the current day's page)
- Have User:EranBot/Copyright/rc as a queue of untreated edits, and once they are reported add them to User:EranBot/Copyright/rc/Archive 1 (Archive 2, etc.) where each archive has at most 50/100 edits (similar to talk pages).
- Auto refresh - Currently the page is a huge queue, so there is no efficient/easy way to partially refresh it (only the changed entries). I changed it to hide the report edit once you press TP/FP. I also changed the bot to add section headers between entries so it would be possible later to improve it (load only the relevant section).
- Eran (talk) 19:11, 7 April 2015 (UTC)
- Can we archive all the ones that have been answered after 48 hours? With each archive having 100 items. Doc James (talk · contribs · email) 19:50, 7 April 2015 (UTC)
- The Helicopter entry was True, but has been fixed by editors by now. How to solve the entry? Agree about need to sort entries. TGCP (talk) 14:03, 27 June 2015 (UTC)
- The True positive / False positive is an indication of if the bot did a "good" job or not rather than if the problem has been fixed.
- So the process currently is that concerns start out unflagged. Then, as they are dealt with, they are flagged as either TP or FP. The hope would be to auto archive all the ones that are marked as either TP or FP. But that may still take some time. Doc James (talk · contribs · email) 17:38, 27 June 2015 (UTC)
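The archiving rule discussed above (archive entries 48 hours after they are marked TP or FP, with at most 100 items per archive page) might look something like this in outline. The `Entry` type and its fields are hypothetical and do not reflect how the bot actually stores reports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Entry:
    title: str
    status: Optional[str]            # "TP", "FP", or None if not yet reviewed
    reviewed_at: Optional[datetime]  # when the TP/FP flag was set

def split_for_archiving(entries, now, min_age=timedelta(hours=48), page_size=100):
    """Return (entries to keep in the queue, list of archive pages)."""
    keep, done = [], []
    for entry in entries:
        reviewed = entry.status in ("TP", "FP") and entry.reviewed_at is not None
        if reviewed and now - entry.reviewed_at >= min_age:
            done.append(entry)
        else:
            keep.append(entry)
    # Chunk the archived entries into pages of at most `page_size` items.
    pages = [done[i:i + page_size] for i in range(0, len(done), page_size)]
    return keep, pages
```

Unreviewed entries and entries flagged less than 48 hours ago stay in the queue, so a recent flag can still be seen and corrected before it is archived.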
Flag
How does one use this tool to flag a possible article with problems? Montanabw(talk) 02:46, 12 May 2015 (UTC)
- This tool works by looking at all new edits. One cannot specifically test an article with it. Doc James (talk · contribs · email) 03:56, 12 May 2015 (UTC)
- So it's random? Anything out there better than dup detector.... which becomes unwieldy after about 20 sources...? There is one other, but its results are not easy to interpret... Montanabw(talk) 04:37, 13 May 2015 (UTC)
- Random, no. It basically takes all new edits over a certain size and runs them through turnitin's API. It does this within a few hours of the edits being made so that false positives are kept to a minimum. Doc James (talk · contribs · email) 04:50, 13 May 2015 (UTC)
New pages feed
Special:NewPagesFeed is very functional. Was built by Ryan Kaldari from the WMF. Have asked for his help on this. Doc James (talk · contribs · email) 22:36, 8 June 2015 (UTC)
Does this bot recognize the blockquote command?
After this edit, where a quote was added in quotation marks with a reference, EranBot filed a report. Now I wonder if EranBot should/could detect such an edit and rule it out, because it seems quite clear a quote is added here. -- Mdd (talk) 13:08, 15 June 2015 (UTC)
- Block quotes are a borderline case per Wikipedia:Non-free_content_criteria. If excessive quotes are being used they should be paraphrased / removed. Thus happy with the bot flagging these. Doc James (talk · contribs · email) 23:22, 15 June 2015 (UTC)
- Ok, thanks for your reply, I cannot disagree on that. -- Mdd (talk) 14:32, 16 June 2015 (UTC)
notifications
If you are going to run a bot that notifies editors of problems, perhaps it might be a good idea to actually tell them what the problems are. Despite my many years here, I do make errors from time to time, as does anyone who edits extensively. I am always glad to be notified of anything I have done which might have been an error, but I don't expect to have to chase down on other pages what the error might have been.
If this bot run today was just a test, it accomplished its purpose: it showed that the bot needs fixing. If it's actually in production, it needs fixing all the more urgently. DGG ( talk ) 19:21, 20 June 2015 (UTC)
- (Copied from DGG talk page) DGG, the bot flagged https://wiki.riteme.site/w/index.php?title=Ground-penetrating_radar&diff=667195529&oldid=667195195 I didn't deem it a copyvio, hence I only made you aware of the User:EranBot/Copyright/rc project and not a particular issue. But point taken. --Lucas559 (talk) 19:27, 20 June 2015 (UTC)
Understanding EranBot Output
When you click "compare" in the EranBot list you are directed to an iThenticate report. To understand this output fully, watch this 4 min video: http://www.ithenticate.com/training/dv-walkthrough . --Lucas559 (talk) 22:08, 20 June 2015 (UTC)
Bot Stalled
he:User:ערן The bot might require a reboot. Despite re-caching and a different browser, I cannot get results after June 17. Thanks. --Lucas559 (talk) 16:33, 21 June 2015 (UTC)
Public domain
Does this bot take into account that there are works that are in the public domain and can be used without it being a copy-vio? Ruigeroeland (talk) 06:21, 22 June 2015 (UTC)
- User:Ruigeroeland yes the bot can. We have a whitelist of sites that the bot ignores here User:EranBot/Copyright/Blacklist. Please add sites to it that are mostly content that is okay. Thus the bot can get better over time.
- User:Eran both bots use this whitelist yes? Also can we move this to the name whitelist rather than blacklist as it is actually a whitelist. Doc James (talk · contribs · email) 13:09, 22 June 2015 (UTC)
- Most works found on the Biodiversity Heritage Library [2] are scanned versions of 'out of copyright' material, so I guess this site can be added to the list? Same for the Internet Archive [3] and the African Journal Archive [4] Ruigeroeland (talk) 13:15, 22 June 2015 (UTC)
- User:Eran how are the blacklisted sites [5] excluded from Ithenticate? They frequently appear in the "report" generated from Ithenticate (for example, medlibrary.org or plumb.com). They do not seem to show up as "compare" (Earwig).--Lucas559 (talk) 21:51, 5 July 2015 (UTC)
- This bot should be excluding them. Doc James (talk · contribs · email) 06:13, 6 July 2015 (UTC)
Bot has a date problem
I've only reviewed a few reports but many of the false positives I've seen appear to be coming from partial or complete (unauthorized?) Wikipedia mirrors. I see that the report has dates/times for when the text in question was trawled. Is there a way to compare the date of the offending text and the date that it was added to Wikipedia and automatically remove cases when it was added to Wikipedia first? Would dramatically cut down on the false positive rate. Axem Titanium (talk) 14:46, 26 June 2015 (UTC)
- Specifically looking at this edit from June 16, 2015 [6], the content is copied from another Wikipedia page. Technically User:Sergecross73 should have mentioned this in his edit summary to fulfill our attribution requirements, which would make follow-up easier. Thus technically it is a TP.
- We are building a list of websites the bot ignores here that will improve this issue Doc James (talk · contribs · email) 17:44, 26 June 2015 (UTC)
- Sorry about that, I rarely, if ever, do article splits like this, and didn't realize a note like that was required. Is there anything I should do now? Make a null edit with an edit summary or comment on the talk page or something? Sergecross73 msg me 17:53, 26 June 2015 (UTC)
- It is not a big deal. You can add a comment to the talk page. Doc James (talk · contribs · email) 18:00, 26 June 2015 (UTC)
- It's a false positive because the external website that the bot identified in its report was a mirror of Wikipedia content. The bot did not manage to identify that text was copied from one Wikipedia article to another Wikipedia article at all and that's a failing of the bot, not a true positive. Axem Titanium (talk) 21:38, 26 June 2015 (UTC)
- Yes the bot did not identify the original source of the text agree. Doc James (talk · contribs · email) 22:51, 26 June 2015 (UTC)
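The date comparison proposed at the top of this section could be sketched as below. The `Match` type and its `crawled_at` field are hypothetical stand-ins for whatever dates the report exposes. Note also that a crawl date only shows when a copy was seen, not when it was first published, so a rule like this is a heuristic and could hide a genuine violation whose source was crawled late.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class Match:
    url: str
    crawled_at: datetime  # when the external copy was trawled (hypothetical field)

def drop_probable_mirrors(first_on_wikipedia: datetime, matches: List[Match]) -> List[Match]:
    """Keep only matches whose external copy was seen no later than the
    moment the text first appeared on Wikipedia."""
    return [m for m in matches if m.crawled_at <= first_on_wikipedia]
```

For text copied from one Wikipedia article to another, `first_on_wikipedia` would need to be the date the text first appeared anywhere on Wikipedia, not the date of the flagged edit; finding that date is a separate problem this sketch does not address.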
Does the bot look at categories and freely-licensed tags?
Does the bot look for tags such as {{CC-notice}}, {{dual}}, {{CRS}}, {{CWR}}, {{DANFS}}, {{DNB}}, {{ACMH}}, {{Catholic}}, {{Include-USGov}}, {{USGS}}, {{SmithDGRBM}}, {{Citizendium}}, {{Gray's}}, {{wikisource}}, {{Appletons}}, {{Cite Appletons'}}, {{1911}}, {{InterPro content}}, {{USGovernment}}, {{CongBio}}, {{NIST-PD}}, and {{USGS-gazetteer}}? Does it look at the talk page for templates such as {{Text release}}, {{ConfirmationOTRS}}, {{OTRS pending}}, and {{OTRS permission}}? Does it avoid flagging pages if the talk page or article is in Category:Items pending OTRS confirmation of permission, Category:Items with OTRS permission confirmed, or Category:Articles with imported freely licensed text? --Ahecht (TALK PAGE) 04:00, 3 July 2015 (UTC)
- No, this bot only flags possible issues. It is for humans who are following it up to make the finer judgement calls. It does, however, exclude a lot of known Wikipedia mirrors and some sites that are known to contain PD material, so I guess it does do some of the above but in a slightly different way. Doc James (talk · contribs · email) 04:35, 3 July 2015 (UTC)
Observations
I suggest copying the icons to the top of the page as a legend. While it doesn't take terribly long to figure out what each one means, there's no good reason why you should have to play detective.
On a related note I am troubled that there are only three options. When reviewing an article for a possible match, there are far more than the three options:
- haven't checked it yet
- TP
- FP
It is possible that the 22 different icons used at copyright problems are overkill, but I think some of them ought to be included. For example, concluding that the text matches another source and noting that OTRS is in progress is quite a bit different than identifying simply that yes, there is a copyright violation. I understand this is a work in progress and perhaps I misunderstand the point of this page, but I'm trying to learn and a response to this will help head me in the right direction. --S Philbrick(Talk) 19:44, 3 July 2015 (UTC)
- Yes. We could have a spot for free text. Doc James (talk · contribs · email) 06:02, 4 July 2015 (UTC)
- Yes, I raised this at template talk:plagiabot row2, but I don't think anyone saw it. LeadSongDog come howl! 03:58, 6 July 2015 (UTC)
Could we have a list of articles checked?
Could the page have a list at the top showing all the pages in the current run? I'm mostly interested in pages where I'm familiar with the subject matter, and it's slow to scroll through all the reports. If there were a linked list at the top, I could see if there were any which I wanted to check up on easily, and get to them easily. Argyriou (talk) 18:03, 5 July 2015 (UTC)
- Argyriou it is not obvious, but there is a "[show]" button on the righthand side just before the first entry. It will show all the WikiProjects being tracked. You could then select your topics of interest and see if any copyvios are suspected. Does that help? --Lucas559 (talk) 18:37, 5 July 2015 (UTC)
- I'm not seeing it. The only "[show]" button is at the "Prior content (much not yet reviewed)" box. If I go up to User:EranBot/Copyright, there's a "[show]" button for "Details", but there's no list there, and only a list of medical articles. That said, a list of WikiProjects being tracked isn't nearly as useful as some sort of table of contents for the reports. Argyriou (talk) 21:20, 5 July 2015 (UTC)
- User:Argyriou You need to install the importScript I think by copy and pasting it here Special:MyPage/common.js
- Then the expand button should appear. Doc James (talk · contribs · email) 06:15, 6 July 2015 (UTC)
- I finally got this to work after several attempts. For some reason editing my common.js only worked when I turned off the JavaScript compatibility gadget, then the script itself only worked when I re-enabled it. LeadSongDog come howl! 04:00, 25 July 2015 (UTC)
Blacklist management
We don't seem to have any progress on tools for this. Wasn't the plan that we would be able to semi/auto-harvest FP mirror finds and feed them in to the blacklist? LeadSongDog come howl! 15:15, 6 July 2015 (UTC)
- User:LeadSongDog what do you mean? You mean add a button that automatically adds mirrors to the blacklist? Doc James (talk · contribs · email) 18:05, 9 July 2015 (UTC)
- @Doc James: Well, that might be a bit dangerous, but something that created a sorted list of problem URLs that had been marked as "FP Mirror" would make it a lot easier to manually maintain the blacklist. LeadSongDog come howl! 20:29, 9 July 2015 (UTC)
- User:LeadSongDog Okay, so maybe a button that says "propose for blacklist" which then adds it to a list that humans go over and add as appropriate to the actual list? Doc James (talk · contribs · email) 20:32, 9 July 2015 (UTC)
- Sure, that could work, though I'm not really keen on anything that requires special configuration to use. LeadSongDog come howl! 21:10, 9 July 2015 (UTC)
- Other suggestions? Doc James (talk · contribs · email) 22:19, 9 July 2015 (UTC)
Bot Stalled
he:User:ערן The bot might require a reboot. Despite re-caching and a different browser, I cannot get results after July 3. Thanks. --Lucas559 (talk) 02:35, 9 July 2015 (UTC)
- he:User:ערן I re-started the bot a couple of days ago, but still no new output? Heading over to Hebrew wiki to see if there is an update. --Lucas559 (talk) 22:20, 30 August 2015 (UTC)
On-demand Article Checks
Two users (NeilN and Binksternet) recently requested an on-demand version of the bot. Can we ask the bot to search particular articles that do not have recent changes? These old articles would have lots of potentially mirror content creating false positives, but there might be value and demand for such an option. --Lucas559 (talk) 16:40, 16 September 2015 (UTC)
- Exactly. I believe I could sort through the mirror noise to see whether an older article was copied. Binksternet (talk) 16:46, 16 September 2015 (UTC)
- +1 --NeilN talk to me 16:47, 16 September 2015 (UTC)
- NeilN and Binksternet, I should mention the Earwig tool [7] in case you are not familiar with it. That tool can easily be used for on-demand checks. It uses a different database of sites to check against, so it provides different results than Eranbot (Eranbot includes Earwig results in its output, see the "compare" results). --Lucas559 (talk) 16:56, 17 September 2015 (UTC)
TP for already deleted G12?
I'm frequently seeing cases where the list shows the article title in red, because it has already been deleted G12, e.g. [8]. Is there any point in having human editors mark these TP, or could that be done by bot? LeadSongDog come howl! 16:40, 17 September 2015 (UTC)
- Agree. I do not see a need for humans to check the already deleted articles. I just skip over red links. To keep the bot's output relevant, perhaps on subsequent dumps it could automatically delete redlinks. It would need to look back at its last couple of samples and make those deletions. I would just leave it as is though and focus bot improvements in other directions (e.g., exclude list of facts).--Lucas559 (talk) 17:01, 17 September 2015 (UTC)
Eranbot as a Vandalism Detection Tool
I have recently found copy and paste vandalism on three articles. The vandals simply copy and paste over existing content, nothing remotely clever. In these cases, I wonder if it would be possible to cross-reference the new material with words in the article, using pattern detection software to spot likely vandalism. One instance was paragraphs about penguins added to a list of USA tornados. Surely some algorithm exists that could guesstimate that if the most common words in the added material (penguin(s)) have no match with existing content (tornado(s)), then it can be automatically reverted. Just throwing out an idea. --Lucas559 (talk) 17:25, 9 November 2015 (UTC)
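A very rough sketch of the word-overlap idea above, assuming nothing about how EranBot works internally: the stopword list, the naive plural stripping, and the `top_n` cutoff are all illustrative choices, and a real tool would need far better language handling before anything was reverted automatically.

```python
import re
from collections import Counter

# Small illustrative stopword list (assumption, not taken from the bot).
STOPWORDS = frozenset({
    "the", "and", "are", "was", "were", "that", "this", "with", "for",
    "from", "have", "has", "not", "but", "its", "their", "which",
})

def content_words(text):
    """Lowercase words, minus stopwords and very short words, with a naive
    plural strip so that 'penguins' and 'penguin' count as the same word."""
    words = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in STOPWORDS or len(word) <= 2:
            continue
        words.append(word[:-1] if word.endswith("s") else word)
    return words

def shares_vocabulary(added_text, existing_text, top_n=5):
    """Return True if any of the `top_n` most common content words in the
    added text also occurs somewhere in the existing article text."""
    top = [w for w, _ in Counter(content_words(added_text)).most_common(top_n)]
    existing = set(content_words(existing_text))
    return any(word in existing for word in top)
```

A result of False only means the added text shares none of its most frequent words with the article; legitimate new sections can also score that way, so it is better read as a hint for a human patroller than as proof of vandalism.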
Wikimania
As per the post on my talk page, here are some thoughts on my experiences with the plagiarism bot.
- Right now I am the only person working on the postings. If and when the bot is running continuously, there will be more work than I alone can handle, even if I drop all other Wikipedia activities. We need to attract more people to help out.
- Looking at recent page history, it looks like I am checking the entries at the rate of about 30 per hour. Each editor who commits a copyright violation is notified via {{uw-copyright}} or {{uw-copyright-new}} or personalized message. When something particularly egregious is found, there's a delay while further investigation is done. This might consist of checking the article and its history for further copyright violations, doing revision deletions, blocking users (typically corporate ones) whose usernames violate the username policy, etc.
- In addition to the actual checking of entries on EranBot/Copyright/rc, a lot of time is spent communicating with users who wish to discuss further why their edits are not acceptable, telling them how to get an OTRS ticket, and so on. Some of the worst copyright violators are added to my list of people whose contribs I check daily for further copyright violations. I have four right now that I check daily and several others on a less-frequent rotation.
- Many of the copyright violations are corporate people who don't realize that it's not okay to copy material verbatim from their website to this wiki, and people from India, adding material copied from various websites. Most people who add copyvio have no idea that it's not okay to do so. — Diannaa (talk) 21:46, 20 March 2016 (UTC)
Backlog
It might or might not be a good idea, but has anyone thought of mass mailing all holders of the Rollback and Reviewer rights to enlist their help? I'm pretty sure that most of the thousands of those rights holders are no longer active, but it might achieve some results.
I will be at Wikimania again this year and I hope to be facilitating a discussion on article quality control. --Kudpung กุดผึ้ง (talk) 20:13, 27 April 2016 (UTC)
- User:Kudpung I will be giving a talk on the bot Jun 24th at 14:00 at Wikimania.[9] Doc James (talk · contribs · email) 02:18, 18 May 2016 (UTC)
Wireframes: Copyright detection suggestions looking for feedback
The Community Tech team at the Wikimedia Foundation was asked to help out with copyright detection in last year's wishlist survey. There are suggestions on Phabricator that would benefit from your feedback, if you'd be willing to go there and tell us what works and what doesn't work. /Johan (WMF) (talk) 16:14, 13 May 2016 (UTC)
- Just realized this makes it sound like the Community Tech team is doing all the work, when User:ערן should get the credit, of course. /Johan (WMF) (talk) 16:28, 13 May 2016 (UTC)
Adding UNESCO Biosphere content to whitelist
Hello! UNESCO has released the text of their Biosphere Reserve Directory pages under CC-BY-SA so that new articles can be created from them. I just created a page (Champlain-Adirondack Biosphere Reserve) according to the instructions from WikiProject UNESCO Create Biosphere Reserve Wikipedia Articles from UNESCO Descriptions. I noticed that the bot picked it up for possible copyright violation. To encourage others to participate in UNESCO's project without being flagged, could all of the biosphere reserve directory pages be added to your whitelist? Perhaps you could contact @User:John Cummings with questions (and John, have you had this come up before?) Thanks! --Lange.lea (talk) 16:54, 21 May 2016 (UTC)
- @Doc James: and @Lange.lea:, thanks very much for this, I didn't see this in any guidance but assumed it may get flagged by a bot :) --John Cummings (talk) 17:32, 21 May 2016 (UTC)
- Great! Thanks so much! --Lange.lea (talk) 11:48, 22 May 2016 (UTC)
Feedback on https://tools.wmflabs.org/plagiabot/
I am using a MacBook Pro, running Chrome on Mavericks. On my display, possibly because I have embiggened my fonts, the data in the "Diff" field is overlapping with that in the "Editor" field. Also, on the "compare" section, the side-by-side is not happening; the box on the right is empty and the data that should be in it is below the left hand box and pretty much unreadable, because the material underneath it is showing through. Also, page only lists 48 reports, where User:EranBot/Copyright/rc has 240. — Diannaa (talk) 02:37, 15 June 2016 (UTC)
- There's too much data to support very small screen sizes (or very large fonts). Maybe you can try reducing your font size a bit? Yes, it currently only lists 50 records at a time. We will add a "Load more" button at the bottom which will fetch another 50 results and so on. The loading takes some time because we fetch a lot of data. --NKohli (WMF) (talk) 04:08, 15 June 2016 (UTC)
- I made a fix. Take another look. -- NKohli (WMF) (talk) 04:42, 15 June 2016 (UTC)
- Right now I am the only person working on these bot reports. Anything you can do to make it better/easier would be great. It really needs to be as functional as possible on as many displays as possible if we hope to attract more patrollers. I think if the "Editor" column were moved over half an inch to the right that would solve the problem I am having with overlap. There appears to be 1.5 inches of margin over on the right that maybe could be reduced?
The "compare" boxes are now side by side and are not spilling out all over the place, thanks for that. Could they also be made taller, so that more of the content could be viewed at one time? That would make it easier to see the big picture and decide whether or not the copyvio material can be extracted or if the article has to be nominated for deletion. Right now I am getting a box about 2.5 inches tall, which is difficult to work with compared to https://tools.wmflabs.org/copyvios/, where the content fills the available space top to bottom (about 5.5 inches). — Diannaa (talk) 14:09, 15 June 2016 (UTC)
- I usually work on the oldest reports first, not the newest ones. Right now I cannot use https://tools.wmflabs.org/plagiabot/ to access the material I want to review, since there are 281 reports in the queue and I'm only seeing the newest 50. It's very rare that I am able to get the work caught up to that degree, because we are currently receiving nearly 100 reports per day, and on most days I don't have time to assess 100 reports. A "load more" button is great but if there's 300 reports in the queue I will have to hit it six times to access the oldest material, which is what I want to look at.
Reports that have been assessed will need to be removed from the feed and archived. Links to archives will have to be offered on the page somewhere. — Diannaa (talk) 14:29, 15 June 2016 (UTC)
- On getting more people to work here, I think that we just need to do a better job publicizing it. I frankly didn't know this existed until I asked Diannaa if there was anything I could do to help with all the copyvios she was complaining about having to deal with. —Compassionate727 (T·C) 18:05, 15 June 2016 (UTC)
- It's still very much a work-in-progress! We were planning to spread word when we reached a milestone, which we did this week. There's a comment about it here: Wikipedia_talk:Copyright_problems#new_CopyPatrol_tool_is_live. If you could let more people know about it, that would be awesome. Please use the tool on https://tools.wmflabs.org/copypatrol/ and not the one on https://tools.wmflabs.org/plagiabot/ as that will be retired pretty soon. -- NKohli (WMF) (talk) 19:12, 15 June 2016 (UTC)
- Am giving a talk about the tool in a couple of days at Wikimania.
- Can we get a button that switches it around and shows the items in reverse order? Should be fairly easy to do and would solve User:Diannaa's issue.
- Doc James (talk · contribs · email) 20:17, 15 June 2016 (UTC)
- We just added new filters for triage state, which should help in a couple of ways. The default is to show open cases, so there are fewer cases on the page to scroll through. You can also choose "Page fixed cases" and "No action needed cases", which serve as the archives. There's also an option for showing all cases, if you need the entire history. What do you think? -- DannyH (WMF) (talk) 18:14, 16 June 2016 (UTC)
- Good. My only complaint now is that the tool doesn't sync very quickly with the page. —Compassionate727 (T·C) 19:04, 16 June 2016 (UTC)
- Very quickly with Copyright/rc page? Sync, as in? Showing up the same records should be near instantaneous but you won't see the reviews made on this page appear on the tool because the page does not write back changes to the database. -- NKohli (WMF) (talk) 19:38, 16 June 2016 (UTC)
- @NKohli (WMF): I was thinking of the tool syncing with the User:EranBot/Copyright/rc, but I assume the inverse is also true. This seems like it'll be a problem if we get more people working on this, as we'll start checking things that other people have already checked. —Compassionate727 (T·C) 15:37, 17 June 2016 (UTC)
- The idea is that once Copy Patrol is ready to be used full-time, people will just use that tool, instead of the on-wiki EranBot page. Do you think there would be a need to keep both tools active? -- DannyH (WMF) (talk) 17:57, 17 June 2016 (UTC)
- Greetings from Golden, B.C. I am examining the https://tools.wmflabs.org/copypatrol/?filter=open page.
- Hi, @Ninja Diannaa: I'll respond in-line. -- Danny
- The times listed for the diffs do not match the time of the actual edit. For example, Simla, West Bengal is listed at 2016-06-17 12:06, but the edit took place at 00:09, June 17, 2016. The times need to match.
- Thanks for pointing that out -- we just made a ticket for it. (T138098)
- Draft:PNTSDF is listed as PNTSDF. The Draft prefix needs to be included. — Ninja Diannaa (Talk) 02:11, 17 June 2016 (UTC)
- That's being worked on right now. (T137858)
- The problem I was experiencing with the data in the "Diff" field overlapping with that in the "Editor" field is not happening on the Toshiba laptop. Either someone has fixed it, or this issue is only happening on the Mac (I can't check this because I left my "good" computer at home). — Ninja Diannaa (Talk) 03:40, 17 June 2016 (UTC)
- Yes, that's been fixed. :)
- There's no way to go back and make a correction if I accidentally tick the wrong box when marking my assessment. — Ninja Diannaa (Talk) 03:53, 17 June 2016 (UTC)
- That's also being worked on now. You'll be able to tick the box again, and a little modal will pop up asking if you're sure that you want to undo your rating. (T134597)
- I always open the page history, but there's no link for that. — Ninja Diannaa (Talk) 04:20, 17 June 2016 (UTC)
- That's coming soon. (T137909)
- Articles that have been deleted should be automatically removed from the queue, as there's no longer a need to assess them for copyvio.
- That's a good idea; I made a ticket. For now, while we're still in development, we're going to keep the redlinked items, because they're an easy way for Niharika and Leon to test that the system is working. :) We'll remove them once the main dev work is over. (T138099)
- Reversing the order of the queue at https://tools.wmflabs.org/copypatrol/ will only help if the deleted articles are removed from the queue, as there's hundreds of them in the archives at User:EranBot/Copyright/rc/ that were never marked as assessed because at the time it seemed pointless to do so. — Ninja Diannaa (Talk) 13:20, 17 June 2016 (UTC)
- The deleted articles will be removed. We have to figure out a way to filter out the really old items that have already been checked using the on-wiki tool. I hope these answers were helpful; let me know what else we can do. :) -- DannyH (WMF) (talk) 17:57, 17 June 2016 (UTC)
- We also need to filter out the really old unchecked items from 2015 (batches 1 through 16) if we are going to be able to reverse the order of the queue. No one is going to have time to go back and check those old cases. — Ninja Diannaa (Talk) 05:34, 21 June 2016 (UTC)
- I'm glad you said that, it'll make things a little easier. :) -- DannyH (WMF) (talk) 16:01, 21 June 2016 (UTC)
Button for "Article Was Deleted"
Hi, would it be possible to add a button for when the article has been deleted for non-copyvio reasons (A7, etc.) prior to being evaluated? Not having the sorcery needed to see deleted revs, I don't really want to categorize the article as CV/non-CV, so having a third button would allow that particular entry to still be closed out. Could manually edit the wikitext I suppose, but since we've got buttons...
Thanks! CrowCaw 17:36, 27 August 2016 (UTC)
- We're not actually working from this page any more, as it has been superseded by https://tools.wmflabs.org/copypatrol. At that location, deleted articles are removed from the queue by a bot. — Diannaa (talk) 19:37, 27 August 2016 (UTC)
- Ah thanks! CrowCaw 20:31, 27 August 2016 (UTC)