Draft:SynthesizerV
Synthesizer V, also referred to as SynthV, is a vocal synthesis engine developed by Dreamtonics. A Technical Preview was announced on August 8, 2018, and distribution began on August 19, 2018.[1][2][3] The full Production Release was officially released on December 24, 2018.[4] Synthesizer V Studio was unveiled on June 25, 2020, in a press release alongside the voice databases Kotonoha Akane & Aoi and Saki.[5] Support for Synthesizer V AI was added to Synthesizer V Studio in an update on December 25, alongside an update for Saki known as Saki AI.
Original author(s) | Kanru Hua |
---|---|
Developer(s) | Dreamtonics |
Initial release | August 19, 2018 |
Stable release | Synthesizer V Studio 1.11.2 Update / September 12, 2024 |
Operating system | Microsoft Windows, macOS, Linux |
Available in | Japanese, English, Spanish, Chinese |
Type | Voice synthesizer software |
License | Proprietary |
Website | dreamtonics |
Voice databases that were not recorded with the AI method are known as "Standard voice databases", which are generally recorded at the Dreamtonics studio in Tokyo.[6][7][8]
History
Kanru Hua stated on Twitter that the first line of code, the "proto-proto libllsm" written in March 2015, eventually became part of Synthesizer V.[9][10]
Development of Synthesizer V began in 2017. Kanru Hua released a demo[11] using three vocals tentatively known at the time as ENG-F1, JA-F1 and MAN-M1. Synthesizer V made its official debut on December 1, 2017.[12]
In February 2018, Kanru Hua posted a listening test to receive feedback on a new singing pitch model for Synthesizer V.[13] In August 2018, Kanru Hua released a "Technical Preview" version of Synthesizer V.[14]
On August 20, 2018, Kanru Hua released a survey asking for user feedback on the Technical Preview to be used for future improvements.[15]
On December 24, 2018, Dreamtonics released the Production Release edition of Synthesizer V for sale,[16] with substantial improvements over the Technical Preview edition.[17]
In March 2019, Kanru Hua opened applications for licensed purchasers of Synthesizer V to test the macOS edition early.[18] This version was officially released on March 12, 2019,[19] and all voices released up to that point were given macOS versions.[20]
On May 31, 2019, Kanru Hua, with Dreamtonics, announced in a tweet that he was accepting applications for C++ software engineers to work on the next iteration of Synthesizer V,[21] which later became known as Synthesizer V Release 2 or "SVR2".[22] In 2020, this became known as Synthesizer V Studio.
On April 9, 2020, it was announced that the second generation of Synthesizer V would be released soon and that a demo of Chiyu using the new engine was forthcoming.[23] The demo was released on April 11.[24] Synthesizer V Studio Pro and Synthesizer V Studio Basic were formally announced on June 26 by AH-Software in a press release, along with the voice databases Kotonoha Akane & Aoi and Saki.[25]
Animen's ANiCUTE store for international customers opened on July 12, and Synthesizer V Studio Pro, with the voice databases Genbu and AiKO, became available for purchase on July 15.[26] Special discounts were offered to VIP members and purchasers of the first-generation Synthesizer V editor.
On August 2, Dreamtonics opened beta-test applications for the VST and Audio Units versions of Synthesizer V Studio to anyone who had purchased the software.[27]
On February 4, 2022, AH-Software stated that sales of the Synthesizer V series had far exceeded expectations in the year since it gained AI support.[28]
On February 28, 2023, Dreamtonics announced that Synthesizer V Studio would soon add Cantonese Chinese as its fourth supported language. This would allow the engine to support both voice libraries dedicated to the language as well as Cross-lingual Singing Synthesis. The company also announced the future support of rap vocals, showing a demo of a new male vocalist rapping in Mandarin Chinese and English. Support for Japanese rap was expected in the future.[29][30]
On March 2, Dreamtonics responded to fans' concerns about the implementation of Cantonese Chinese, noting that it was checking and fixing the issues in the demonstration clips reported by the user base. It also noted that Synthesizer V Studio supported lyric input in Jyutping, the Cantonese romanization scheme introduced in 1993. Jyutping is not equivalent to the X-SAMPA phonetic scheme shown above the lyric notes in the editor, and the X-SAMPA transcription of a Chinese character is likewise not equivalent to its Pinyin reading.[31]
On March 15, after receiving feedback on bringing the song more in line with Cantonese songwriting habits, Dreamtonics replaced the Bilibili version of the debut video with one that implemented corrections to the demos of the male vocal and Feng Yi.[32][33]
The rap feature for English and Mandarin Chinese and Cantonese Chinese Cross-lingual Singing Synthesis were planned for full implementation in version 1.9.0, and a beta version was released on April 18. Dreamtonics mentioned that, after receiving valuable feedback, it had focused on refining pronunciation for a better user experience. According to Dreamtonics, when the language is set to Cantonese, all Chinese lyrics are sung with Cantonese pronunciation, and users can correct misread lyrics by typing the romanized form in Jyutping directly. Although the phoneme set is largely based on that of Mandarin Chinese, several phonemes unique to Cantonese were incorporated.[34][35]
Development of Neural Networks
Following the Synthesizer V Studio 1.2.0 Update on February 19, 2021, Kanru Hua announced on his personal Twitter account a thread about how optimized neural network inference works in the recent Synthesizer V updates.[36]
The following day, Kanru Hua elaborated on the subject. Synthesizer V Studio 1.2 uses JIT-compiled, quantized, sparse matrix-vector multiplication (MVM) kernels.[37] In his own words, an "artificial neural network boils down to a bunch of really simple arithmetic operations, e.g. a + b * x1 + c * x2 + ... But when you (purposefully) compose millions of these together, they can be turned into really complicated machines."[38]
He notes that to build a voice, he picks values for "a, b and c" that best represent the voice and then plugs them into millions of equations. These "a, b, c" values are called parameters. Rather than writing each equation by hand, linear algebra is used: matrices and vectors are notations that express simple math in large batches. Many neural network models are built from matrix-matrix multiplication. In the case of Synthesizer V, the bottleneck is matrix-vector multiplication, used mainly in the network that generates waveform samples, known as the "neural vocoder".[39] One challenge he notes is that the network is not only large but also needs to "run tens of thousands of times per second" to synthesize high-quality audio in real time.[40]
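The relationship between a single hand-written equation and a matrix-vector product can be sketched in a few lines of Python. This is only an illustration of the notation described above, not Dreamtonics code; the function name `affine` and all numeric values are hypothetical.

```python
def affine(weights, bias, x):
    """Return bias + weights * x, computing one output per matrix row."""
    return [
        b + sum(w * xj for w, xj in zip(row, x))
        for row, b in zip(weights, bias)
    ]

# One equation written by hand: a + b*x1 + c*x2 (values are made up)
a, b, c = 0.5, 2.0, -1.0
x1, x2 = 3.0, 4.0
by_hand = a + b * x1 + c * x2

# The same equation expressed as a one-row matrix times a vector
as_matrix = affine([[b, c]], [a], [x1, x2])

# Adding rows to the matrix adds equations without writing new code
two_outputs = affine([[b, c], [1.0, 1.0]], [a, 0.0], [x1, x2])
```

Each row of the matrix holds the parameters of one equation, so a large network layer becomes a single call rather than millions of separately written expressions.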
Because of this, modern CPUs are needed, as they run at several "gigacycles" per second (GHz). This is on a similar order of magnitude to the number of operations per second required above, but he notes that the margin is very tight. Not all CPU cycles can do "useful work", which presents additional challenges.[41] The goal is to make the MVM operation run as fast as possible on modern CPUs.[42]
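The tightness of that margin can be illustrated with a rough budget calculation. The sketch below is an assumption-laden example: the function `cycles_per_operation` is hypothetical, and the clock speed, run rate and operation count are placeholder values chosen for illustration, not figures published by Dreamtonics.

```python
def cycles_per_operation(clock_hz, runs_per_second, ops_per_run):
    """Return how many CPU cycles are available for each arithmetic operation."""
    return clock_hz / (runs_per_second * ops_per_run)

# All three numbers are illustrative placeholders, not measured values.
budget = cycles_per_operation(
    clock_hz=3_000_000_000,  # a 3 GHz core
    runs_per_second=24_000,  # "tens of thousands" of network runs per second
    ops_per_run=100_000,     # multiply-adds per run (hypothetical)
)
```

With these placeholder numbers the budget comes to only slightly more than one cycle per operation, which shows why wasted cycles matter so much.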
The following day, he elaborated on the use of sparse matrix-vector multiplication (SpMVM) in Synthesizer V's neural networks.[43] Kanru Hua states that, of the millions of parameters, many are redundant and can be thrown out without hurting sound quality, which results in what is called a sparse matrix.[44][45] Parameters that are truly important cannot be thrown away, and if too many parameters are removed, the quality eventually drops: "The synthesized voice will sound more and more like from a walkie-talkie until it completely degrades into noise."[46]
The goal is to remove the parameters that contribute least, carefully, and to remove as many as possible without hurting quality. If done properly, typically over three-fourths of the parameters are thrown out.[47]
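One common way to decide which parameters contribute least is magnitude pruning, which zeroes the entries with the smallest absolute values. The source does not say which criterion Synthesizer V uses, so the following is a minimal sketch under that assumption; the function names and the weight values are hypothetical.

```python
def prune_by_magnitude(matrix, keep_fraction):
    """Zero every entry except the largest-magnitude `keep_fraction` of them.

    Ties at the threshold are all kept, so slightly more entries may survive.
    """
    magnitudes = sorted((abs(v) for row in matrix for v in row), reverse=True)
    keep = max(1, int(len(magnitudes) * keep_fraction))
    threshold = magnitudes[keep - 1]
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in matrix]

def sparsity(matrix):
    """Return the fraction of entries that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for v in row if v == 0)
    return zeros / total

# Made-up weights: two large entries and six small ones
weights = [[0.9, -0.05, 0.02, 0.4], [0.01, -0.7, 0.03, 0.04]]
pruned = prune_by_magnitude(weights, keep_fraction=0.25)
removed = sparsity(pruned)
```

Here three-fourths of the entries are removed and only the two largest survive. In practice the pruned network would also need to be checked for audible quality loss, which this sketch does not attempt.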
When executing the sparse neural network, the program needs to skip the parameters that were removed, and this skipping adds overhead that is sometimes expensive. Even so, sparsity can boost speed to as much as four times the initial speed.[48] "Going sparse is an effective way to compress a neural network. If done right, it can still speed up execution by a few times, although this would require highly optimized code for SpMVM."[49]
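The skipping and its overhead can be seen in a small sparse matrix-vector product. The sketch below uses the compressed sparse row (CSR) layout, a widely used general-purpose format; the source does not state which layout Synthesizer V's JIT-compiled kernels use, so this choice and the names `to_csr` and `spmv` are assumptions for illustration.

```python
def to_csr(matrix):
    """Store only the non-zero entries, with their column indices and row boundaries."""
    values, columns, row_starts = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                columns.append(j)
        row_starts.append(len(values))
    return values, columns, row_starts

def spmv(csr, x):
    """Multiply a CSR matrix by the vector x, touching only stored entries."""
    values, columns, row_starts = csr
    y = []
    for i in range(len(row_starts) - 1):
        acc = 0
        for k in range(row_starts[i], row_starts[i + 1]):
            # The extra lookup through columns[k] is the cost of skipping zeros
            acc += values[k] * x[columns[k]]
        y.append(acc)
    return y

dense = [[2, 0, 0, 1], [0, 0, 3, 0]]
csr = to_csr(dense)
result = spmv(csr, [1, 2, 3, 4])
```

Only three of the eight entries are multiplied, but each one requires an indirect index lookup, which is why the speed-up is smaller than the fraction of parameters removed would suggest.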
Over the following three days, Kanru Hua posted three additional threads elaborating on the intricacies of developing the neural networks.[50][51][52] After the matrices are made sparse, the parameters are quantized to integers, and values are scaled down before the MVM so that the result stays in range. It was noted that if an addition or multiplication goes out of range, the value can wrap back to the lower end of the range, resulting in an overflow, and the synthesized voice could sound like a mistuned radio or be complete noise.[53]
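The wraparound described above can be reproduced with plain integers. The following sketch assumes 8-bit quantized values and a 16-bit signed accumulator; these bit widths, the scale factor and the function names are assumptions chosen for illustration, since the source does not specify them.

```python
def quantize(values, scale, bits=8):
    """Map real values to signed integers of the given width, clamping at the ends."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [max(lo, min(hi, round(v / scale))) for v in values]

def wrap(value, bits=16):
    """Emulate two's-complement wraparound of a signed integer of the given width."""
    span = 1 << bits
    return (value + span // 2) % span - span // 2

def dot_wrapping(a, b, bits=16):
    """Dot product whose products and running sum wrap like fixed-width integers."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap(acc + wrap(x * y, bits), bits)
    return acc

weights = quantize([1.5625, 1.5625, 1.5625, 1.5625], scale=1 / 64)  # each becomes 100
inputs = [100, 100, 100, 100]

overflowed = dot_wrapping(weights, inputs)       # true sum 40000 does not fit in 16 bits
scaled_inputs = [x // 2 for x in inputs]         # scale the inputs down first
in_range = dot_wrapping(weights, scaled_inputs)  # true sum 20000 fits
```

The unscaled product wraps around to a large negative number, which in audio would appear as a sudden jump in the waveform, while scaling the inputs down beforehand keeps the sum inside the representable range.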
He states that the neural networks used in Synthesizer V AI come in many different sizes; some can be made sparse and some cannot. The software is able to run on every x86 CPU since the Pentium 4 processors of 2004.[54]
External links
- Official website
- SynthesizerV Wiki
Category:2018 software
Category:Music production software
Category:Japanese inventions
- ^ "Review: Dreamtonics Synthesizer V Studio Pro". 25 July 2024.
- ^ "index - powered by h5ai v0.29.2 (https://larsjung.de/h5ai/)". resource.dreamtonics.com. https://resource.dreamtonics.com/download/. Archived from the original on 2021-01-12 (https://web.archive.org/web/20210112022759/https://resource.dreamtonics.com/download/). Retrieved 2021-01-10.
- ^ https://twitter.com/khuasw/status/1027354095227523072?s=20
- ^ https://twitter.com/khuasw/status/1077189890180149248?s=20
- ^ https://twitter.com/ahsoft/status/1276351253073629185
- ^ https://twitter.com/OfficialVolor/status/1392293297847037954
- ^ "Q&A Livestream 1 - Synthesizer V SOLARIS project | Eclipsed Sounds". YouTube. 22 May 2021.
- ^ Cite error: The named reference ":2" was invoked but never defined. - ^ "Synthesizer V 官方網站". 15 April 2020. Archived from the original on 2021-05-31. Retrieved 2021-04-11.
- ^ https://twitter.com/khuasw/status/1112006856283570177?s=20
- ^ "SoundCloud - Hear the world's sounds".
- ^ https://twitter.com/khuasw/status/936463709203062784?s=20
- ^ https://twitter.com/khuasw/status/969065287155888128?s=20
- ^ https://twitter.com/khuasw/status/1027354095227523072?s=20
- ^ https://twitter.com/khuasw/status/1031446551732707328?s=20
- ^ https://twitter.com/khuasw/status/1077189890180149248?s=20
- ^ https://twitter.com/khuasw/status/1078547103561920512?s=20
- ^ https://twitter.com/khuasw/status/1103616971822661632?s=20
- ^ https://twitter.com/khuasw/status/1105509415162007552?s=20
- ^ https://twitter.com/khuasw/status/1110499169486045184?s=20
- ^ https://twitter.com/khuasw/status/1134381595270320129?s=20
- ^ https://twitter.com/khuasw/status/1154988155256262657?s=20
- ^ "Sina Visitor System".
- ^ https://www.bilibili.com/video/BV1ba4y1x7pg
- ^ "新世代歌声合成ソフトウェアが登場!「Synthesizer Vシリーズ」 2020年7月30日発売|AHS(AH-Software)".
- ^ https://twitter.com/dreamtonics_en/status/1283386080251830272
- ^ https://twitter.com/dreamtonics_en/status/1289851729668747271
- ^ "待望の男性歌声データベース2種類がついに登場!『Synthesizer V AI Ryo』 『Synthesizer V AI Kevin』本日発売開始|AHS(AH-Software)".
- ^ "「歌声技术」Synthesizer V AI 技术预览:粤语与说唱合成 (2023)_哔哩哔哩_bilibili".
- ^ "Technical Demo - Cantonese Singing Synthesis (And More!)". YouTube. 28 February 2023.
- ^ "动态-哔哩哔哩".
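- ^ http://www.bilibili.com/video/BV1zs4y1f7QJ - "On March 15, Dreamtonics replaced the test track of the Synthesizer V AI Cantonese singing synthesis technical preview with a version more in line with Cantonese songwriting habits. We thank all creators for their concern and encouragement. Dreamtonics will continue to release more information about Cantonese singing synthesis and cross-lingual synthesis, so please stay tuned."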
- ^ "动态-哔哩哔哩".
- ^ "Synthesizer V Studio 1.9.0b1 Update: Rap, Cantonese and More | Dreamtonics株式会社". 18 April 2023.
- ^ "Dreamtonics Synthesizer V".
- ^ https://twitter.com/khuasw/status/1362799523156746240
- ^ https://twitter.com/khuasw/status/1363069116467138564
- ^ https://twitter.com/khuasw/status/1363073616821116931
- ^ https://twitter.com/khuasw/status/1363073620445192195
- ^ https://twitter.com/khuasw/status/1363073621741084680
- ^ https://twitter.com/khuasw/status/1363073622860992520
- ^ https://twitter.com/khuasw/status/1363073624010235905
- ^ https://twitter.com/khuasw/status/1363431631759925259
- ^ https://twitter.com/khuasw/status/1363431631759925259
- ^ https://twitter.com/khuasw/status/1363431633404108803
- ^ https://twitter.com/khuasw/status/1363431634670817287
- ^ https://twitter.com/khuasw/status/1363431636012986371
- ^ https://twitter.com/khuasw/status/1363431637506170882
- ^ https://twitter.com/khuasw/status/1363431640085635075
- ^ https://twitter.com/khuasw/status/1363828518686138375
- ^ https://twitter.com/khuasw/status/1364164654231031815
- ^ https://twitter.com/khuasw/status/1364774324599615494
- ^ https://twitter.com/khuasw/status/1364164660132372481
- ^ https://twitter.com/khuasw/status/1364774332874985474