[Free/Paid] A summary of each company's speech synthesis engines: which software uses which engine
Jan. 17, 2021
A great deal of text-to-speech software is available today.
However, when you listen to the voices of different text-to-speech software,
you may think, "Huh? Isn't this voice the same as in that other software?"
In fact, text-to-speech software requires an underlying speech synthesis engine.
So, even if the software name is different, if the speech synthesis engine is the same, the voice sounds the same as well.
This time, we will introduce speech synthesis engines that can be used for free and speech synthesis engines that must be purchased.
Some of you will realize, "Oh, so that software used this synthesis engine!"
Please look forward to it!
Free speech synthesis engines
Free text-to-speech software mainly uses one of the following speech synthesis engines:
- AquesTalk
- Open JTalk
AquesTalk, developed by AQUEST, Inc.
Software that reads aloud in the so-called "Yukkuri" ("slowly") voice is
all built on AquesTalk.
Typical examples are Bouyomi-chan (Stick Reading) and SofTalk.
Because it makes it easy to create synthetic speech from text, it is used in a variety of situations, from personal use to commercial products.
In addition to serving as the base for SofTalk and Bouyomi-chan (Stick Reading), it is also used for sampling the UTAU default voice. It is also used as the guidance voice in home appliances such as telephones.
AquesTalk was first released on May 25, 2006, after a development period of less than two years (AquesTalk official site).
The sound source is a purely synthetic voice with no human behind it, created by manually tuning parameters rather than relying on recordings.
In January 2010, AquesTalk2, the successor to AquesTalk, was announced.
It supports a wide range of platforms, including Windows, Mac OS X, WinCE, and smartphones such as iPhone and Android. More recently, a standalone microchip (hardware) called AquesTalk pico has appeared.
Quotation source: Encyclopedia of Nico Nico
API usage licenses and development libraries are sold separately.
For details, check the company website.
Open JTalk is a Japanese text-to-speech synthesis system developed at the Tokuda-Lee Laboratory, Nagoya Institute of Technology.
It is open source distributed under the modified BSD license.
"Open JTalk" is used by textbooks. Once you hear it, you will probably say, "I've heard that before."
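For readers who want to try Open JTalk directly, here is a minimal sketch of how its command line could be assembled from Python. It only builds and prints the command, so nothing needs to be installed to run it. The function name and every path are placeholders made up for illustration. The options shown (-x for the dictionary directory, -m for the HTS voice file, -ow for the output WAV file) are the ones we understand the open_jtalk command to accept, so please check them against your installed version.

```python
import shlex


def build_open_jtalk_command(dic_dir, voice_path, wav_path, text_path):
    """Return the argument list for one open_jtalk run (nothing is executed)."""
    return [
        "open_jtalk",
        "-x", dic_dir,      # directory containing the dictionary
        "-m", voice_path,   # HTS voice file (.htsvoice)
        "-ow", wav_path,    # WAV file to write the synthesized speech to
        text_path,          # text file containing the sentence to read aloud
    ]


# All paths below are hypothetical placeholders; replace them with your own.
cmd = build_open_jtalk_command(
    "/path/to/dic", "/path/to/voice.htsvoice", "out.wav", "input.txt"
)
print(shlex.join(cmd))
```

Once Open JTalk, a dictionary, and a voice file are actually installed, the same list could be passed to subprocess.run(cmd, check=True) to produce the WAV file.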
Speech synthesis engines that can be used for a fee
The main paid speech synthesis engines are:
- IBM: Watson Text to Speech
- Google: Text to Speech
- Amazon: Polly
- Microsoft: SAPI5
Many of them offer attractive plans, such as free usage up to tens of thousands of characters.
Demos of the engines above are provided on each company's website, where you can play and listen to the voices.
Speech synthesis engines are very difficult to handle
This time I introduced speech synthesis engines.
By using a speech synthesis engine, you can make your own text-to-speech software or customize it the way you want.
However, when you try to use one, you will find that it is provided as an API, so it is difficult to set up unless you can program.
API is an abbreviation for "Application Programming Interface" and refers to "a mechanism for sharing software functions": a program specialized for a certain function is made available so that other programs can use it. If frequently used functions are prepared as APIs, there is no need to program them from scratch. You can use the API as needed to develop efficiently.
In the case of a Web API, the program is published on the Web and is used by calling it from the outside. Web APIs are published in various fields, and many of them are available free of charge.
For example, if you can get the latest information from other companies' websites using an API, you can add new functions to your website or application and improve the service. In recent years, the level required of smartphone apps has increased, so it is common to use Web APIs in app development.
Quotation source: internet academy
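To give a feel for what "calling a Web API from the outside" looks like, here is a minimal sketch in Python. It only builds the HTTP request and does not send it. The URL, the JSON field names ("text", "voice"), the voice name, and the Bearer-style API key are all made up for illustration. Each real service (Watson, Google, Polly, and so on) defines its own endpoint, parameters, and authentication, so always follow that service's documentation.

```python
import json
import urllib.request

# Hypothetical endpoint: NOT the URL of any real speech synthesis service.
API_URL = "https://api.example.com/v1/synthesize"


def build_tts_request(text, voice, api_key):
    """Build (but do not send) a POST request asking a service to read text aloud."""
    # The field names "text" and "voice" are assumptions for this sketch.
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed authentication scheme
        },
        method="POST",
    )


request = build_tts_request("Hello", "example-voice", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

With a real service, the request would then be sent with urllib.request.urlopen(request) and the audio data in the response saved to a file. How the audio is returned differs from service to service.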
Companies that offer paid text-to-speech software have either developed their own speech synthesis engines or use the paid speech synthesis engines introduced this time.
"In that case, why not just make a speech synthesis engine yourself?"
you might think, but this is not an easy task.
It requires many researchers and developers, and a laborious process that takes money and effort.
At the very least, it is difficult for an individual; it is work best undertaken at the scale of a company or research institution.
So, if you find the API difficult to use, paid text-to-speech software is the easier and more intuitive choice.
There are many types of text-to-speech software available today, from free to paid.
I'm sure you can find your favorite software.
Check this article for more details!
I hope this article helps you.
I'm looking forward to seeing you again.