User:WP News 1.0 bot
This user account is a bot operated by Araesmojo (talk). It is used to make repetitive automated or semi-automated edits that would be extremely tedious to do manually, in accordance with the bot policy. The bot is currently inactive but retains the approval of the community. Administrators: if this bot is malfunctioning or causing harm, please block it.
Summary
User page to support the potential creation of a bot that reviews news sites, summarizes the news they contain, groups the stories by article category, and creates summarized link groups so that editors can quickly review recent news about a topic.
Creator-Proposer
Reason for Creation
To support the rapid and efficient summarization and survey of the news stories created each day, for use by Portal:Current_events. The Covid-19 epidemic has created an unusual opportunity: editors are now looking at a fairly wide-ranging selection of news sources for current events. However, they focus almost entirely on articles related to the epidemic. With a little support, they would also be able to gather stories, with numerous supporting links, on other topics (the US election, military issues, the recent Muslim issues in Europe, etc.). They would also be able to gather these stories from a broader perspective. Example: the Muslim issue is primarily covered by European news, yet Arabic news sites often have their own view.
Belief in Validity of Concept
- Bots have already been used to scrape data sources outside Wikipedia
- Bot does not "deep dive" into news sites
- Bot only skims top level pages
- Bot does not read articles
- Bot only touches each site once per day
- Bot does not "mass edit"
- Bot creates a single page each day
- Bot only summarizes links and stories
- Bot is not among the Frequently denied bots
- Bot does not appear to violate any of the known bot policies
Possible Name Conflict
- A bot account named "User:NewsBot" appears to have already claimed a similar name
- Has no user page or information.
- Appears to have gone nowhere
- Seems to just be a placeholder
Operation
General Guidelines
- Bot is run once each day
- Bot accesses each news site once
- Bot explores only the top-level front page, not full article text
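The guidelines above could be enforced with a small visit log. The following is only a sketch under stated assumptions: `DailyVisitLog` and `plan_visits` are hypothetical names that do not come from the bot itself, and no network access is performed here.

```python
from datetime import date
from urllib.parse import urlparse


class DailyVisitLog:
    """Remembers which hosts were fetched on which day (hypothetical helper)."""

    def __init__(self):
        self._visited = {}  # maps a date to the set of hostnames fetched that day

    def may_visit(self, url, today):
        host = urlparse(url).netloc
        return host not in self._visited.get(today, set())

    def record(self, url, today):
        host = urlparse(url).netloc
        self._visited.setdefault(today, set()).add(host)


def plan_visits(front_pages, log, today):
    """Return the front pages not yet fetched today, marking them as visited."""
    planned = []
    for url in front_pages:
        if log.may_visit(url, today):
            log.record(url, today)
            planned.append(url)
    return planned
```

Calling `plan_visits` a second time on the same day returns an empty list, which is one way to guarantee that each site is touched at most once per daily run.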
Method of operation
- Explore news sites
  - While (news sites remain unexplored)
    - Open news site from list below in User:WP_News_1.0_bot#Current_News_Sites_of_Interest
    - Explore top links for news site
    - Summarize major topics into line items with links to stories
  - While (news sites remain unexplored)
- Perform summary operations
  - Add links to single page on Wikipedia
  - Summarize headlines into word tags or phrases
  - Calculate frequency content on word use (e.g., Covid-19 would be a high-frequency tag on most days of 2020)
  - Create list of clickable tags to allow editors to explore sites with common stories
  - Compare links to links from prior day
  - Create map with clickable regions and bubbles showing relative increase in "new" stories
- Finish bot operations and clean up any memory or other resident data as necessary
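The core of the steps above (collect front-page links, count word tags, compare with the prior day) might be sketched as follows. This is a minimal illustration using only the Python standard library. The names `HeadlineLinkParser`, `extract_links`, `tag_frequencies`, `new_links`, and the `STOPWORDS` set are assumptions made for the example, and it handles single-word tags only, not phrases.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Assumed, deliberately tiny stopword list; a real run would need a fuller one.
STOPWORDS = {"the", "a", "an", "in", "of", "to", "and", "for", "on", "as", "at"}


class HeadlineLinkParser(HTMLParser):
    """Collects (href, link text) pairs from anchor tags on a front page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((self._href, text))
            self._href = None


def extract_links(html):
    """Return all (href, headline) pairs found in the given HTML string."""
    parser = HeadlineLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


def tag_frequencies(headlines):
    """Count lowercase word tags across headlines, skipping stopwords."""
    counts = Counter()
    for headline in headlines:
        words = re.findall(r"[a-z0-9][a-z0-9-]*", headline.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


def new_links(today, yesterday):
    """Return today's links whose href did not appear yesterday."""
    seen = {href for href, _ in yesterday}
    return [(href, text) for href, text in today if href not in seen]
```

The output of `tag_frequencies` could feed the clickable tag list, and the length of `new_links` per site could drive the relative bubble sizes on the map.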
User Interaction
- Users do not interact directly with bot
- Users interact only with objects created by bot
- Interaction with tags
  - User clicks tag
  - Frame populates showing links to news sites with relevant articles
  - Bot does not explore articles itself
- Interaction with map
  - User clicks on bubble
  - Frame populates showing links to new articles from "today"
  - Possibly expand into sub-frame with sub-hierarchy based on high frequency tags
  - Bot does not explore articles itself
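One way the tag interaction could be backed is a precomputed index from each tag to the links whose headlines mention it, so that a click only looks up stored links and never triggers new crawling. The sketch below assumes a hypothetical `build_tag_index` function and uses simple case-insensitive substring matching, which a real bot might need to refine.

```python
from collections import defaultdict


def build_tag_index(links, tags):
    """Map each tag to the (href, headline) pairs whose headline mentions it."""
    index = defaultdict(list)
    for href, headline in links:
        lowered = headline.lower()
        for tag in tags:
            # Assumption: a plain substring match is good enough for a first pass.
            if tag.lower() in lowered:
                index[tag].append((href, headline))
    return dict(index)
```

The frame shown after a click would then simply render `index[tag]` as a list of links.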
Current News Sites of Interest
- Summarized from Portal:Current_events
- Other news site suggestions are welcome
- Sites that need Google Translate to function will likely be lowest priority
https://www.nst.com.my/ (note, www.nst.com is a security camera company) - appears to be Malaysia
https://sea.mashable.com/ (also Malaysia)
https://www.aa.com.tr/en (Turkish)
https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fwww.nrc.nl%2F
https://www.rte.ie/ (Ireland - possible issues with cookies)
https://www.ft.com/ (Financial Times - possible issues with cookies)
https://www.washingtonpost.com/
https://en.dailypakistan.com.pk/
https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fnews.kompas.com%2F (Indonesia)
https://www.newindianexpress.com/
https://www.canberratimes.com.au/
https://www.thestar.com/ (Canada)
https://www.trtworld.com/ (Turkey)
https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fwww.ilmessaggero.it%2F (Italy)
https://www.stcatharinesstandard.ca/
https://www.goal.com/en (Soccer / Futbol news)
https://tolonews.com/ (Afghanistan)
https://www.todayonline.com/ (Singapore)
https://english.radio.cz/ (Czech)
https://bnr.bg/en (Bulgaria)
https://english.alarabiya.net/News (Middle East)
https://today.rtl.lu/ (Luxembourg)
https://www.dw.com/en/ (Deutsche Welle, Germany)
https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fwww.beritasatu.com%2F (Indonesia)
https://www.channelnewsasia.com/ (Singapore)
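Several entries above are wrapped in a Google Translate URL, and such sites are meant to be lowest priority. A sketch of how the list might be ordered accordingly is given below. The function names `unwrap_translate_url` and `prioritize` are assumptions for illustration, and the code relies only on the `u` query parameter visible in the URLs above.

```python
from urllib.parse import parse_qs, urlparse


def unwrap_translate_url(url):
    """Return (original_url, needs_translation) for an entry in the site list."""
    parsed = urlparse(url)
    if parsed.netloc == "translate.google.com":
        target = parse_qs(parsed.query).get("u")
        if target:
            # parse_qs already percent-decodes the wrapped address.
            return target[0], True
    return url, False


def prioritize(sites):
    """Order sites so that those needing translation come last (stable sort)."""
    return sorted(sites, key=lambda url: unwrap_translate_url(url)[1])
```

Because Python's `sorted` is stable, sites keep their original relative order within the translated and untranslated groups.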