Qwen

From Wikipedia, the free encyclopedia

Qwen (also called Tongyi Qianwen) is a family of large language models developed by Alibaba. In July 2024, it was ranked as the top Chinese language model on some benchmarks and third globally, behind models from Anthropic and OpenAI.[1]

History

Alibaba first launched a beta of Qwen in April 2023 under the name Tongyi Qianwen.[2] The model was publicly released in September 2023 after receiving approval from the Chinese government.[3] Qwen 7B was open sourced in August 2023,[5] and the 72B and 1.8B models followed in December 2023.[4] In June 2024 Alibaba launched Qwen 2, and in September 2024 it released some of its models as open source while keeping its most advanced models proprietary.[6][7] It has also released several other model types, such as Qwen-Audio, Qwen2-Math, and Qwen-VL for vision.[8] In total, it has released more than 100 models as open source, and its models have been downloaded more than 40 million times.[7][9]

Fine-tuned versions of Qwen have been developed by enthusiasts, such as "Liberated Qwen", developed by San Francisco-based Abacus AI, which attempts to remove all safety guardrails from the model.[10] In November 2024, QwQ-32B-Preview, a model focusing on reasoning similar to OpenAI's o1, was released as open source under the Apache 2.0 License.[11] QwQ has a 32,000-token context length and performs better than o1 on some benchmarks.[12]

Architecture

Qwen 1 was based on the Llama architecture developed by Meta AI, with various modifications.[13] Qwen 2 employs a mixture of experts.[14]
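
As a rough illustration of the mixture-of-experts idea mentioned above, the following sketch routes one input vector to the highest-scoring experts and combines their outputs with softmax gate weights. It is a generic top-k routing example, not Qwen's actual implementation: the function names (`make_expert`, `moe_forward`), the single-linear-map experts, the dimensions, and the choice of `top_k=2` are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical expert: a single random linear map from dim to dim."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, router_weights, experts, top_k=2):
    """Route x to the top_k experts and mix their outputs.

    router_weights holds one row per expert; the dot product of a row
    with x is that expert's routing score.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Keep only the top_k highest-scoring experts (sparse activation).
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalise the selected scores so the gate weights sum to 1.
    gates = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen, gates


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_experts = 4, 8
    experts = [make_expert(dim, rng) for _ in range(n_experts)]
    router = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
    out, chosen, gates = moe_forward([1.0, 0.5, -0.5, 2.0], router, experts)
    print("experts used:", chosen)
    print("gate weights:", gates)
```

Only `top_k` of the experts run for a given input, which is why a mixture-of-experts model can have many more parameters in total than it uses for any single token.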

References

  1. ^ Jiang, Ben (11 July 2024). "Alibaba's open-source AI model tops Chinese rivals, ranks 3rd globally". South China Morning Post.
  2. ^ Chiang, Sheila (11 April 2023). "Alibaba to roll out its rival to ChatGPT across all its products". CNBC.
  3. ^ Jiang, Ben (13 September 2023). "Alibaba opens Tongyi Qianwen model to public as new CEO embraces AI". South China Morning Post.
  4. ^ Fan, Feifei (1 December 2023). "Alibaba unveils new Tongyi Qianwen AI language model". China Daily.
  5. ^ Ye, Josh (3 August 2023). "Alibaba rolls out open-sourced AI model to take on Meta's Llama 2". Reuters.
  6. ^ Jiang, Ben (7 June 2024). "Alibaba says new AI model Qwen2 bests Meta's Llama 3 in tasks like maths and coding". South China Morning Post.
  7. ^ a b Kharpal, Arjun (19 September 2024). "China's Alibaba launches over 100 new open-source AI models, releases text-to-video generation tool". CNBC.
  8. ^ Franzen, Carl (8 August 2024). "Alibaba claims no. 1 spot in AI math models with Qwen2-Math". VentureBeat.
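  9. ^ "Alibaba accelerates AI push by releasing new open-source models, text-to-video". Reuters. 19 September 2024.
  10. ^ Mims, Christopher (19 April 2024). "Here Come the Anti-Woke AIs". The Wall Street Journal.
  11. ^ 故渊 (28 November 2024). "阿里通义千问 QwQ 登场:开源 AI 推理新王,MATH 测试超 OpenAI o1 模型" [Alibaba's Tongyi Qianwen QwQ debuts: new king of open-source AI reasoning, surpasses OpenAI's o1 model on the MATH test]. IT之家 (in Chinese). www.ithome.com.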
  12. ^ Wiggers, Kyle (27 November 2024). "Alibaba releases an 'open' challenger to OpenAI's o1 reasoning model". TechCrunch.
  13. ^ "Qwen Technical Report". 28 September 2023.
  14. ^ "Qwen2 Technical Report". 10 September 2024.