User:JScherer-WMF/drafts/sandbox/2024-hackathon-research

Session with MP

  • Sat down at a table and I walked him through a bunch of ideas.
  • He was referred to me because he's something of an infobox expert.
  • Can we change the reading level of AI summaries? Tune it for grade 3? Or grade 5? Or ESL?
  • Does the summary regenerate over time, or is it generated once?
  • What if it only summarizes intro paragraphs?
  • Big question was curation. Can we have a "community check" affordance or some other way for communities to check these things?
  • Infoboxes were always meant to traverse content.
  • Any kind of automation in this space will be controversial and need to be "99.9%" of the way there before it's ready for primetime.
  • We could integrate this into suggested edits?
  • We could show only links that are already within the article, because we know those are appropriate, but then they would violate style rules against repeating the same link too many times in one article.

Focus group

  • Search in article
    • Replicates a browser Ctrl+F function but for less techy folks.
    • Pretty familiar feature in Reddit, Twitter, and elsewhere. The trick is to turn it on at the right times and to clearly indicate whether we're searching within an article or across all of wiki.
    • Native has already done this. Let's ask them about adoption to build a hypothesis.
    • It would be best if it were a fuzzy search of some kind rather than an exact text match.
  • AI summaries
    • We would need to be ultra confident before it goes live. We would need fail safes.
    • "Any chance" that it could introduce inaccuracies would make it non-viable.
    • Editors like Vera already use GPT in their workflows, for example, by copying news articles into it, but they're still reading the source and the output of GPT before it goes anywhere.
    • Obv. would need an "AI generated" badge of some kind
    • Could be a check and commit workflow for editors rather than auto publishing.
    • Human verified check.
    • Text might be too long to be CC-0
    • Language barriers are always a consideration
    • When would it regenerate? Every click? Every certain number of edits? On a fixed schedule? If so, what interval?
    • It would need to get re-checked.
    • How might we "open the query" a bit?
    • If we nail community-curated but machine-generated content, there are so many incredible opportunities, from flash cards to integrations with other systems.
      • Make a quiz
      • convo UI
  • Text to speech
    • A massive win for a11y
    • People might do visual things while listening, like browsing images, but probably wouldn't read a whole article while also listening to it.
    • "I don't want to hear a robot"
    • Could even simulate voices of celebrities eventually. Imagine the Greta Garbo article read in a simulation of her voice.
  • Related in infoboxes
    • It looks like it's curated because everything in the content area is curated, but it's not. It's kind of a lie in that sense.
    • It would cause an "uproar"
    • Give me something more than just another link. There are already so many other links. What about images, for example?
    • Maybe triggering the function would make it clear that it's not curated.
    • We could do a "find related" button.
    • Would love to see something like wiki galaxy and have interactive data visualization.
    • See Dan Andrescu
    • Adding 4-5 more things to click on feels redundant.
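
Sketch for the fuzzy "search in article" idea from the focus group: a rough illustration of matching near-misses (typos, partial words) instead of exact text only. It uses Python's standard difflib; the function name fuzzy_find, the 0.75 cutoff, and the word-by-word comparison are assumptions made for illustration, not how any existing search feature works.

```python
import difflib
import re


def fuzzy_find(query, article_text, cutoff=0.75):
    """Return (word, offset) pairs for words that approximately match query.

    Hypothetical helper: compares the query to each word in the article and
    keeps words whose similarity ratio is at or above the cutoff.
    """
    query = query.lower()
    matches = []
    for m in re.finditer(r"\w+", article_text):
        word = m.group(0)
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge
        ratio = difflib.SequenceMatcher(None, query, word.lower()).ratio()
        if ratio >= cutoff:
            matches.append((word, m.start()))
    return matches


if __name__ == "__main__":
    text = "Greta Garbo was a Swedish-American actress."
    # A misspelled query still finds the intended word.
    print(fuzzy_find("actres", text))
```

The offsets could drive in-page highlighting, which would also help signal that the search is scoped to the article rather than all of wiki. A real version would need multi-word queries and language-aware tokenization.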

Reflections

  • Search in general is a place where people are expecting machine-generated recommendations. The IA doesn't imply curation.