ComfyUI
Original author(s) | comfyanonymous |
---|---|
Initial release | January 16, 2023[1] |
Repository | github |
Written in | Python |
License | GPLv3[2] |
Website | www |
ComfyUI is an open-source, node-based program that allows users to generate images from text prompts. It uses free diffusion models such as Stable Diffusion as the base model for image generation, combined with other tools such as ControlNet and LCM low-rank adaptation (LoRA). Each tool is represented by a node in the program.
History
ComfyUI was released on GitHub in January 2023. According to its creator, comfyanonymous, a major goal of the project was to improve on the user interface design of existing software.[3] The creator had been involved with Stability AI, but by 3 June 2024 that involvement had ended and the creator, together with the other core developers, had formed an organization called Comfy Org.[4] In July 2024, Nvidia announced support for ComfyUI within its RTX Remix modding software.[5] In August 2024, support was added for the Flux diffusion model developed by Black Forest Labs, and Comfy Org joined the Open Model Initiative created by the Linux Foundation.[6][7] As of September 2024, the project had 50.6k stars on GitHub.[8]
Features
ComfyUI's main feature is that it is node-based.[9][10] Each node has a function such as "load a model" or "write a prompt".[11] The nodes are connected to form a control-flow graph called a workflow.[12] Workflows commonly consist of tens of nodes, forming a complex directed acyclic graph.[12] Node types include loading a model, specifying prompts, samplers, VAE decoders, face restoration and upscaling models, LoRAs, embeddings, and ControlNets.[13] A default node group is included with the program.[11] When a prompt is queued, a highlighted frame appears around the currently executing node, starting from "load checkpoint" and ending with the final image and its save location.[11]
Workflows can be saved to a file, allowing users to reuse them and share them with other users.[13][14][15] Workflow files use the JSON format, and a workflow can also be embedded in the images it generates.[16] Users have also created custom extensions to the base system, which are exposed as new nodes,[13][17] such as the extension for AnimateDiff, which aims to create videos.[18][19] ComfyUI has been described as more complex than other diffusion UIs such as Automatic1111.[20][21]
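The idea of a workflow as a directed acyclic graph stored as JSON can be illustrated with a minimal Python sketch. The schema below is an assumption made for illustration, not ComfyUI's documented file format, and the node type names (`LoadCheckpoint`, `Prompt`, `Sampler`, `VAEDecode`, `SaveImage`) and helper functions are likewise hypothetical. The sketch saves a small workflow to JSON, loads it back, and derives an order in which the nodes could run so that every node comes after the nodes it depends on.

```python
import json
from graphlib import CycleError, TopologicalSorter

# Hypothetical, simplified schema (an assumption, not ComfyUI's real format):
# each node id maps to a node type and its inputs. An input written as
# [source_node_id, output_index] is a link to another node's output;
# any other value is a literal setting.
workflow = {
    "1": {"type": "LoadCheckpoint", "inputs": {"name": "model.safetensors"}},
    "2": {"type": "Prompt", "inputs": {"text": "a cat", "clip": ["1", 1]}},
    "3": {"type": "Prompt", "inputs": {"text": "blurry", "clip": ["1", 1]}},
    "4": {"type": "Sampler", "inputs": {"model": ["1", 0], "positive": ["2", 0],
                                        "negative": ["3", 0], "steps": 20}},
    "5": {"type": "VAEDecode", "inputs": {"samples": ["4", 0], "vae": ["1", 2]}},
    "6": {"type": "SaveImage", "inputs": {"images": ["5", 0]}},
}


def is_link(value):
    """Return True if an input value refers to another node's output."""
    return (isinstance(value, list) and len(value) == 2
            and isinstance(value[0], str) and isinstance(value[1], int))


def dependencies(wf):
    """Map each node id to the set of node ids it reads from."""
    return {node_id: {v[0] for v in node["inputs"].values() if is_link(v)}
            for node_id, node in wf.items()}


def execution_order(wf):
    """List node ids so that each node appears after all of its inputs."""
    return list(TopologicalSorter(dependencies(wf)).static_order())


def is_acyclic(wf):
    """A valid workflow must not contain a cycle between nodes."""
    try:
        execution_order(wf)
        return True
    except CycleError:
        return False


# Round-trip through JSON, as when a workflow is saved and shared.
restored = json.loads(json.dumps(workflow))
order = execution_order(restored)
print([restored[node_id]["type"] for node_id in order])
```

In this sketch the checkpoint loader always comes first and the image-saving node last, matching the execution order described above; the relative order of the two prompt nodes is not fixed, because neither depends on the other.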
LLMVision extension compromise
In June 2024, a hacker group called "Nullbulge" compromised a ComfyUI extension by adding malicious code to it.[22] The compromised extension, called ComfyUI_LLMVISION, was hosted on GitHub and was used to integrate the interface with the AI language models GPT-4 and Claude 3. Nullbulge published on its website a list of hundreds of ComfyUI users' login details across multiple services, while users of the extension reported receiving numerous login notifications. vpnMentor conducted security research on the extension and claimed it could "steal crypto wallets, screenshot the user’s screen, expose device information and IP addresses, and steal files that contain certain keywords or extensions".
Nullbulge's website claimed that the group targeted users who had committed "one of our sins", which included AI-art generation, art theft, promoting cryptocurrency, and any other kind of theft from artists, such as from Patreon. The group described itself as "a collective of individuals who believe in the importance of protecting artists' rights and ensuring fair compensation for their work" and stated its belief that "AI-generated artwork is detrimental to the creative industry and should be discouraged".[22]
References
- comfyanonymous. "Initial commit". GitHub. Retrieved 10 July 2024.
- comfyanonymous. "LICENSE". GitHub. Retrieved 10 July 2024.
- comfyanonymous (18 May 2023). "ComfyUI is now 4 months old!". ComfyUI blog. Retrieved 11 July 2024.
- "ComfyUI 作者团队成立 Comfy Org- DoNews快讯". DoNews (in Chinese).
- Harper, Christopher (4 July 2024). "Nvidia's RTX Remix goes open source – chipmaker adds Rest API to interface with ComfyUI for AI remastering or generating new graphics in real time". Tom's Hardware. Retrieved 11 July 2024.
- 田口, 和裕 (7 August 2024). "画像生成AI「Stable Diffusion」の代替に? 話題の「FLUX.1」を試した (1/7)". ASCII.jp (in Japanese).
- Wheatley, Mike (12 August 2024). "Linux Foundation's latest initiative aims to promote 'irrevocable' open-source AI models". SiliconANGLE.
- comfyanonymous. "ComfyUI". GitHub. Retrieved 10 July 2024.
- Zhu, Andrew (2024). Using Stable Diffusion with Python: Leverage Python to control and automate high-quality AI image generation using Stable Diffusion. Packt Publishing. ISBN 978-1835084311. Quote: "ComfyUI is a node-based UI that utilizes Stable Diffusion. It allows users to construct tailored workflows, including image post-processing and conversions. It is a potent and adaptable graphical user interface for Stable Diffusion, characterized by its node-based design."
- 故渊 (25 November 2023). "7 年老显卡 GTX 1080 能跑,图片生成视频模型 Stable Video Diffusion 更新 - IT之家". ithome (in Chinese).
- 田口, 和裕. "画像生成AI「Stable Diffusion」使い倒すならコレ! 「ComfyUI」基本の使い方 (1/3)". ASCII.jp (in Japanese).
- Xue, Xiangyuan; Lu, Zeyu; Huang, Di; Ouyang, Wanli; Bai, Lei (2 September 2024). "GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI". arXiv:2409.01392 [cs.CL].
- Gal, Rinon; Haviv, Adi; Alaluf, Yuval; Bermano, Amit H.; Cohen-Or, Daniel; Chechik, Gal (2024). "ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation". arXiv:2410.01731 [cs.CL].
- 白鲸出海 (23 May 2024). "一家成都游戏公司,做出了一款千万月访问量的AI图像产品-36氪". 36氪 (in Chinese).
- 田口, 和裕 (27 March 2024). "Macで始める画像生成AI 「Stable Diffusion」ComfyUIの使い方 (3/5)". ASCII.jp (in Japanese).
- しらいはかせ (18 December 2023). "画像生成AIを使い倒す!「Stability Matrix」で使えるWebUIを紹介【生成AIストリーム】". Impress Watch (in Japanese).
- 机器之心 (16 November 2023). "当韩国女团BLACKPINK进军二次元,清华叉院AI神器原来还能这么玩-36氪". 36氪 (in Chinese).
- 新, 清士. "アニメの常識、画像生成AIが変える可能性「AnimateDiff」のすごい進化". ASCII.jp (in Japanese).
- Guo, Yuwei; Yang, Ceyuan; Rao, Anyi; Liang, Zhengyang; Wang, Yaohui; Qiao, Yu; Agrawala, Maneesh; Lin, Dahua; Dai, Bo (May 2024). "AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning". International Conference on Learning Representations. arXiv:2307.04725.
- Phoenix, James; Taylor, Mike (2024). "AUTOMATIC1111 Web User Interface". Prompt engineering for generative AI: future-proof inputs for reliable AI outputs at scale (First ed.). Beijing Boston: O'Reilly. ISBN 978-1098153434. Quote: "Advanced users may also want to explore ComfyUI, as it supports more advanced workflows and increased flexibility (including image-to-video), but we deemed this too complex for the majority of use cases, which can easily be handled by AUTOMATIC1111."
- Pérez-Colado, Iván J.; Freire-Morán, Manuel; Calvo-Morata, Antonio; Pérez-Colado, Víctor M.; Fernández-Manjón, Baltasar (8 May 2024). "AI As Yet Another Tool in Undergraduate Student Projects: Preliminary Results". 2024 IEEE Global Engineering Education Conference (EDUCON). pp. 1–7. doi:10.1109/EDUCON60312.2024.10578883. ISBN 979-8-3503-9402-3.
- Maiberg, Emanuel (11 June 2024). "Hackers Target AI Users With Malicious Stable Diffusion Tool on GitHub to Protest 'Art Theft'". 404 Media. Retrieved 14 June 2024.