Wikipedia talk:Large language models/Archive 7


LLM Experiment - Sources and unsourced information helper

I conducted an experiment to see if LLMs (ChatGPT-4) could identify sourced and unsourced information in a Wikipedia article. The experiment can be viewed step-by-step here. The Weather Event Writer (Talk Page) 03:05, 21 September 2023 (UTC)

Hello WeatherWriter and thanks for sharing your experiment. I think there is some potential in using LLMs to help with verification. One note regarding your experiment: the amount of history of which ChatGPT is aware is limited. This is the case even if you tell it to "Keep track of this source". So as the chat grows longer, it will forget the earlier steps in the exchange, which contain the text of the sources. But it does not tell you this and may attempt to answer your question nonetheless, probably with hallucinations. This issue becomes more serious with very long sources or very many sources. One way to solve this problem is to restrict oneself to the verification of one claim and one source at a time and to start a new chat for each new claim/source. Another issue is that your process only tells you which sources verify a claim but does not cite the passage that verifies it. So you would have to trust that it is not a hallucination instead of being able to check it for yourself.
For a user script that implements a similar idea, see User:Phlsph7/SourceVerificationAIAssistant. Phlsph7 (talk) 08:17, 21 September 2023 (UTC)
Great work! I did some follow-up work with the 32k version of GPT-4. I could fit the texts of all of the RS in and pose a single prompt to find unsupported material. See User_talk:WeatherWriter/LLM_Experiment_1 for more details. Nowa (talk) 07:38, 22 September 2023 (UTC)
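For anyone who wants to experiment further, the one-claim-one-source approach described above can be reduced to a small script. The following is a minimal sketch only, assuming the OpenAI Python client; the function name, prompt wording, and model name are illustrative and not part of either experiment. It checks a single claim against a single source per request, so earlier sources cannot fall out of the context window, and it asks the model to quote the passage it relies on so an editor can verify the answer against the source.

# Minimal sketch: verify one claim against one source per request, so
# earlier sources cannot fall out of the model's context window.
# Assumes the OpenAI Python client (openai >= 1.0) with an API key in the
# OPENAI_API_KEY environment variable; the model name is illustrative.
from openai import OpenAI

client = OpenAI()

def verify_claim(claim: str, source_text: str, model: str = "gpt-4") -> str:
    """Ask whether source_text supports claim; request a verbatim quote of
    the supporting passage so a human can confirm it is not a hallucination."""
    prompt = (
        "Does the following source support the following claim? "
        "Answer YES or NO. If YES, quote the exact passage that supports it.\n\n"
        f"Claim: {claim}\n\nSource:\n{source_text}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the answer as deterministic as possible
    )
    return response.choices[0].message.content

# Usage: loop over (claim, source) pairs, one call each, then check that the
# quoted passage actually appears in the source before trusting the verdict.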

The redirect Wikipedia:CHATGPT has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2023 October 2 § Wikipedia:CHATGPT until a consensus is reached. - CHAMPION (talk) (contributions) (logs) 10:22, 2 October 2023 (UTC)

Information page

Based on the discussion above, #RfC: Is this proposal ready to be promoted?, it is clear that this page will not be promoted either to policy or to guideline. However, the RfC didn't really specify what would happen to the page if it isn't promoted. In the absence of such clarity, I am considering swapping out the {{Proposal}} template with {{Information page}}. (An information page is a kind of project-space page that "intends to describe some aspect(s) of Wikipedia's norms, customs, technicalities, or practices", and it doesn't require much consensus for a page to be labelled as such.) Would there be any objection to that? It would be a softer template than something like {{failed}}, and I'm not sure if we have enough consensus to use {{supplement}}. Mz7 (talk) 03:43, 7 October 2023 (UTC)

I would object because this page doesn't describe any aspect of Wikipedia's norms, customs, etc. I would suggest it be labeled an essay (not a failed proposal), because it conveys the opinion of some editors on an issue relating to Wikipedia, and that's what an essay is. Levivich (talk) 05:34, 7 October 2023 (UTC)
I have no objection to {{essay}}. Mz7 (talk) 19:53, 7 October 2023 (UTC)
 Done: switched {{proposal}} to {{essay}} in the absence of objections. [1] The "essay" label doesn't really require much consensus to use, anyway, and as indicated above, it doesn't seem like there is consensus to use a different label for this page. Mz7 (talk) 21:37, 9 October 2023 (UTC)
Essay should be fine. Cheers, · · · Peter Southwood (talk): 12:11, 12 October 2023 (UTC)
I would prefer {{failed proposal}} to essay as long as the page remains substantially unchanged, since it was drafted as policy, is worded like one, and risks being brought up as semi-authoritative in the same way many essays are. If someone decides to substantially rewrite it into a more typical essay format, no objection to the essay tag being brought back. DFlhb (talk) 19:54, 14 October 2023 (UTC)
I took a first step to make it more essay-like by using less authoritative language. I tried to keep the changes to a minimum: most changes involve replacing expressions like "must" with "should". Phlsph7 (talk) 07:52, 15 October 2023 (UTC)

Paper

Good paper on the topic of LLM-assisted writing and the kind of tools we might like to create, if there are any software folk here. https://arxiv.org/pdf/2309.15337.pdf Talpedia 14:56, 9 November 2023 (UTC)

Copyright on that seems to be 2018, btw. - Dank (push to talk) 15:07, 9 November 2023 (UTC)
Hmm... odd, the paper says "7 Sep 2023" in the margin and uses GPT-4 (which was released this year). Talpedia 18:02, 9 November 2023 (UTC)

AI being used by reliable sources

@ActivelyDisinterested has suggested a list in the Wikipedia: project namespace to collect notes on sources that we have considered reliable in the past, but which seem to be using LLM-generated content. Please see Wikipedia talk:Reliable sources#Use of AI content generation by previously reliable sources if you're interested in this subject and/or would like to help compile the list. WhatamIdoing (talk) 22:29, 1 December 2023 (UTC)

 You are invited to join the discussion at Wikipedia:Templates for discussion/Log/2023 December 4 § Template:OpenAI. –Novem Linguae (talk) 05:32, 4 December 2023 (UTC)

 You are invited to join the discussion at Wikipedia:Templates for discussion/Log/2023 December 4 § Template:AI-generated notification. –Novem Linguae (talk) 05:32, 4 December 2023 (UTC)

Drafts for 15,616 genes, only 9% wrong

As a demonstration of the potential for AI to impact scientific practice, we use WikiCrow to generate draft articles for the 15,616 human protein-coding genes that currently lack Wikipedia articles, or that have article stubs. WikiCrow creates articles in 8 minutes, is much more consistent than human editors at citing its sources, and makes incorrect inferences or statements about 9% of the time, a number that we expect to improve as we mature our systems.

https://www.futurehouse.org/wikicrow

Welp. Sandizer (talk) 18:41, 13 December 2023 (UTC)

Estimating risk

Raw ChatGPT assist

Question to ChatGPT: "write a list of activities that can be assisted by LLMs, to write encyclopedias, in order of riskiness". ChatGPT's response appeared to have errors, e.g. drafting is likely riskier than language refinement, but ChatGPT said the opposite. "Assisting in the generation of citations and references" is almost hallucinatory?

ChatGPT response: Sure, it's important to note that language models like GPT-3.5, often referred to as large language models (LLMs), can be powerful tools but may not always provide accurate or up-to-date information. The riskiness of using LLMs for certain activities can vary, and it's crucial to verify the information obtained from these models. Here's a list of activities related to writing encyclopedias, ordered from lower to higher perceived risk:
  • Gathering Basic Information: Summarizing established facts and information. Compiling lists of well-known data.
  • Creating Outlines: Generating initial outlines for encyclopedia articles. Providing structure and key points for various topics.
  • Generating Drafts: Assisting in drafting sections of encyclopedia entries. Offering ideas and context for historical events or scientific concepts.
  • Language Refinement: Improving the readability and coherence of drafted content. Suggesting alternative phrasing for clarity.
  • Explaining Concepts: Providing explanations for complex concepts in simple terms. Offering background information on specific subjects.
  • Citing Sources and References: Assisting in the generation of citations and references. Helping with proper attribution of information.
  • Fact-Checking: Cross-referencing information with reliable sources. Verifying the accuracy of historical or scientific details.
  • In-Depth Research Assistance: Assisting in more comprehensive research on specific topics. Offering insights into less well-known or niche subjects.
  • Handling Controversial Topics: Providing information on controversial issues with caution. Giving balanced perspectives and highlighting multiple viewpoints.
  • Ethical Considerations: Assessing and addressing ethical concerns in content creation. Ensuring that information presented is unbiased and fair.
  • Up-to-Date Information: Verifying the recency of information to ensure it is current. Handling topics that may evolve rapidly with care.
  • Legal and Sensitivity Issues: Avoiding the generation of content that may violate laws or ethical guidelines. Exercising caution when dealing with sensitive or potentially harmful topics.
Remember that while LLMs can be valuable tools, they are not infallible, and human oversight is essential. Critical evaluation, fact-checking, and ensuring compliance with ethical and legal standards should always be part of the content creation process.

Grateful for corrections. This bot-assisted list has been revised by a human; it estimates riskiness from least to most.

  • For talk pages
    • Gathering Basic Information. Summarizing established facts and information. Compiling lists of well-known data.
    • Generating outlines for articles. Providing structure and key points.
    • Offering ideas and context for historical events or scientific concepts.
    • In-Depth Research Assistance: Assisting in more comprehensive research on specific topics. Offering insights into less well-known or niche subjects.
  • Main space
    • Spell-checking
    • Language Refinement: Improving readability and coherence of drafted content. Suggesting alternative phrasing for clarity. Copyedit, reformulate, simplify.
    • Explaining Concepts: Providing explanations for complex concepts in simple terms. Offering background information on specific subjects.
    • Suggest images
    • Suggest wikilinks
    • Generating summaries e.g. lead summaries or other summaries
    • Suggest expansion
  • Higher risk:
    • Generating Drafts: Assisting in drafting sections of entries.
    • Fact-Checking: Cross-referencing information with reliable sources. Verifying the accuracy of historical or scientific details.
    • Up-to-Date Information: Verifying the recency of information to ensure it is current. Handling topics that may evolve rapidly with care.
    • Handling Controversial Topics: Providing information on controversial issues with caution. Giving balanced perspectives and highlighting multiple viewpoints.
    • Citing Sources and References: Assisting in the generation of citations and references. Helping with proper attribution of information.
    • Ethical Considerations: Assessing and addressing ethical concerns in content creation. Ensuring that information presented is unbiased and fair.
    • Legal and Sensitivity Issues

Tom B (talk) 16:23, 20 December 2023 (UTC)

 You are invited to join the discussion at Wikipedia talk:WikiProject AI Cleanup § Proposal: adopting WP:LLM as this WikiProject's WP:ADVICEPAGE. QueenofHearts 21:37, 9 January 2024 (UTC)

Someone implemented a full-on article generator, and Anthropic gave it an award

Blog post description, GitHub repo, based on this Stanford work, which also has a repo, and a live working demo.

Have people noticed those kinds of articles? The outline structure is more distinctive and flamboyant than what we usually see from human editors. 141.239.252.245 (talk) 05:43, 26 April 2024 (UTC)

Just saw this. Taking a look now. Thanks for posting. Nowa (talk) 12:07, 9 November 2024 (UTC)
I took a look at the tool and, in its current state, it is unsuitable for drafting Wikipedia articles. The big problem is that the text in the drafted articles is not necessarily supported by the cited references. Here is an example. See my comments at the bottom of the article. Another problem is that the LLM engages in wp:synth, where it takes two facts stated in the references and extrapolates a conclusion from them. You can see an example of synth here. Again, scroll to my comments at the bottom. Nowa (talk) 23:24, 10 November 2024 (UTC)
Having used the app a bit more, I find it's actually a helpful "search engine". The references are relevant to the topic of concern. So it might be useful for gathering references for a Wikipedia article. Here is an example. Nowa (talk) 21:48, 11 November 2024 (UTC)