Complete Guide to Using VOICEVOX! Detailed Explanation from Features to Commercial Use of Free AI Text-to-Speech Software
Jan. 26, 2026

VOICEVOX is speech synthesis software that you install and use on a Windows, Mac, or Linux PC.
It uses AI-based speech synthesis to turn entered text into read-aloud audio, and it is widely used by video producers and content creators.
A key feature is the ability to read text in the voice of "Zundamon," a character popular on YouTube and Nico Nico Douga.
This article explains what you need to know to start using VOICEVOX: its features, how to install it on Windows, how to use it, and precautions for commercial use.
- Thinking of using VOICEVOX
- Looking for a way to read text aloud
If either applies to you, use this article as a reference to find the speech synthesis software or reading method that suits you.
【Free / Commercial Use OK】Recommended Latest AI Text-to-Speech Service
There is a free reading service recommended for you if you want to create read-aloud audio!
That is the latest AI service, "Ondoku".
"Ondoku" is a reading service that can be used for free.
It can be used in any environment, from Windows, Mac, and Linux to iPhone and Android smartphones.
Usage is easy: just enter your text!
It can be used immediately without installation and allows for comfortable reading regardless of your PC's specifications.
Furthermore, "Ondoku" can be used commercially even on the free plan!
Why not try making a video for free using the clear and realistic audio of "Ondoku"?
What is the speech synthesis software VOICEVOX? AI reading software explained
First, we will briefly explain VOICEVOX.
What kind of AI reading software is VOICEVOX?

VOICEVOX is text-to-speech software built on deep learning.
By entering Japanese text, you can use AI to read it aloud in character voices.
VOICEVOX is free-to-use software and is also available for commercial use.
However, appropriate credit notation is required for commercial use.
It can be used for personal projects, for monetized videos posted on YouTube or Nico Nico Douga, and within companies. In every case, you must also follow the terms of use for the characters themselves, such as "Zundamon," "Shikoku Metan," and "Kasukabe Tsumugi."
High-quality speech synthesis technology using AI
AI speech synthesis technology has been advancing very rapidly in recent years.
VOICEVOX is one program that has adopted this technology, so it reads with a more natural voice than conventional, mechanical-sounding reading software.
The editing screen lets you adjust intonation character by character. This takes time and effort, but it makes highly expressive audio possible.
An emotional expression function is also included, allowing for changes in tone according to emotions such as joy, anger, or sadness.
Using character voices like "Zundamon," it is possible to produce audio content that attracts the listener's interest.
Multi-platform support
VOICEVOX supports three operating systems: Windows, Mac, and Linux.
Speech synthesis software that supports Linux is particularly rare, making it one of the few options for users producing videos or audio content in a Linux environment.
What are the features of VOICEVOX?

Capable of reading with many character voices
One of the features of VOICEVOX is the availability of various character voices with distinct personalities.
The most famous character is "Zundamon."
This character is provided as part of the Tohoku Zunko Project and is characterized by a cute, high-pitched voice.
In addition to Zundamon, many characters with different voice qualities and personalities, such as "Shikoku Metan," "Kasukabe Tsumugi," and "Namine Ritsu," are included.
Each character has a detailed profile, including age, height, and personality, so you can select a character that matches the setting and tone of your video or content.
In VOICEVOX, characters are released in groups of several at a time, so they are categorized by release period, such as "1st Generation" or "2nd Generation."
VOICEVOX Nemo, without character settings, also released
"VOICEVOX Nemo," released in November 2023, is a voice library without character settings.
It was developed with use in business scenes and educational settings in mind, featuring a calm voice quality adaptable to a wide range of situations.
Unlike regular VOICEVOX voices with strong character colors, it is suitable for more formal purposes such as corporate presentations, educational content, and official announcements.
VOICEVOX Nemo also has multiple voice qualities available, and you can choose from both male and female voices.
It is possible to choose the most suitable voice according to the content and target audience.
Equipped with emotional expression and customization functions
VOICEVOX provides a function to reflect 8 types of emotional styles in the audio.
Styles such as "Sweet," "Tearful," "Scared," and "Whisper" are available, but the styles that can be used are determined by the character.
Additionally, you can adjust accent, intonation, and length as voice parameters, either individually or in combination.
There are also functions to change the emotion or adjust the reading of specific parts within a sentence, allowing for fine-tuned adjustments in expression.
By pre-registering the reading of proper nouns or technical terms using the accent dictionary function, you can also reduce reading errors.
Singing voice synthesis function also introduced
In January 2024, a singing voice synthesis function was added to VOICEVOX, making it possible to have characters sing songs.
The "Humming" function supports 29 types of characters (as of June 2025).
With the humming function, you can generate audio that sounds like a character singing along to a melody.
The "Song" function currently supports only Namine Ritsu, but it allows for more full-fledged singing voice synthesis.
【Commercial OK!】Recommended AI speech synthesis software you can use for free right now
There is a recommended reading method for those looking for speech synthesis software.
It is the AI speech synthesis web app "Ondoku"!
"Ondoku" is an AI speech synthesis service that can be used for free.
Since it is a web app used from a browser, you can easily read text from any environment, whether it's Windows, Mac, Linux, or a smartphone.
Create realistic and easy-to-understand reading audio with the latest AI
"Ondoku" is a reading service that synthesizes audio with the latest AI.
It can generate clear and realistic audio that sounds as if a real narrator or voice actor is reading it.
There are 16 types of voices available for Japanese reading on "Ondoku".
With male, female, and children's voices, there are voices suitable for a wide range of situations, from business to hobby use.
It can be useful for various purposes such as YouTube, Instagram, and TikTok videos, or in-store announcements.
"Ondoku" also allows you to adjust the pitch of the sound and read conversations using multiple voices!
You can generate the audio you want and create attractive video audio.
Speech synthesis service you can use right now without installation
In order to use VOICEVOX, it is necessary to download the software from the official website and perform installation work.
To install and use the software, knowledge about PCs such as Windows or Mac is required.
Also, when installing VOICEVOX for the first time, you need to download a file of about 1.5GB.
- Using a low-performance or old PC
- Home internet connection is slow
- Can only use tethering
In such cases, installing VOICEVOX is often difficult, and "Ondoku" is recommended instead!
Using "Ondoku" is very simple.
"Ondoku" can be used immediately if you have a web browser, so no complicated setup work is required at all.
You can create audio immediately just by opening the top page, so even beginners can use it with confidence.

Since no download is required, you can complete generating and downloading audio while you would otherwise still be installing VOICEVOX.
"Ondoku" supports multiple languages. Easily create YouTube videos for foreign audiences
VOICEVOX is reading software designed specifically for Japanese.
As a result, sentences containing loanwords or English are sometimes read unnaturally.
"Ondoku" supports a total of 48 languages, including Japanese, English, Korean, Chinese, Spanish, Vietnamese, and more!
【Ondoku】Listen to voice types and sample audio for supported languages | Text-to-speech software Ondoku
Here we will introduce Ondoku's supported languages and sample audio.
Reading foreign languages is also natural, and it can be utilized for producing multilingual content on YouTube.
In addition, it can be used in various situations, such as foreign language announcements for stores and facilities, multilingual educational content, and presentation materials for global companies.
Convenience during commercial use
When using VOICEVOX for commercial purposes, credit notation is always required.
Therefore, in cases where credit notation is difficult, such as in-store broadcasts, it is common to handle it by reading the credit within the audio, such as "Presented by Kasukabe Tsumugi from VOICEVOX."
However, in official corporate videos or product introduction videos, displaying credits is sometimes inappropriate, or difficult because of design constraints.
In such cases, "Ondoku" is recommended!
"Ondoku" is OK for commercial use, and credit notation is not required if you use a paid plan.
It can be freely utilized for any commercial purpose, such as use in companies, product sales, and monetization of YouTube.
Since commercial use is possible with only credit notation even on the free plan, you can first try it for free before considering a paid plan.
Why not experience "Ondoku," which can be used for free first?
With the AI reading service "Ondoku" which can be used for free, there is no need for large file downloads or tedious installation work!
When you want to create audio, you can immediately create reading audio with the latest AI.
Why not try AI text reading with "Ondoku" yourself?
Detailed explanation of VOICEVOX installation method 【Windows 11】
Next, we will explain the installation method and usage of VOICEVOX.
In order to use VOICEVOX, it is necessary to download the software from the official website and install it.
First, we will explain the VOICEVOX installation procedure using a Windows PC as an example.
※Explained using Windows 11 24H2.
VOICEVOX download and installation procedure

First, access the VOICEVOX official website and download the installation file.
On the download screen, you can select:
- Windows: GPU version or CPU version
- Mac: Intel version or Apple Silicon version
- Linux: GPU version or CPU version
This time, we will select the Windows version.

Also, for the Windows version, you can choose between the installer version and the ZIP version, but normally it is fine to download the installer version.
※Depending on the security settings of Windows or your web browser, the download may be blocked, in which case you should permit the download and save it.
Once downloaded, start the installation on Windows 11.

Double-click the downloaded file to start the setup wizard.

Click "Next" to start downloading the setup files.

※Since a total of about 1.5GB of files are downloaded, it may take some time depending on your connection speed.
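As a rough guide to how long this step takes, the estimate can be worked out from the file size and your line speed. The sketch below is only an approximation: it assumes decimal units (1 GB = 1000 MB), a steady connection, and no protocol overhead, and the function name is ours.

```python
def download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Estimate download time in minutes for a file of `size_gb` gigabytes."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    return megabits / speed_mbps / 60  # seconds -> minutes


# About 1.5 GB is downloaded during setup; try two example line speeds.
for speed in (10, 100):
    print(f"{speed} Mbps: about {download_minutes(1.5, speed):.0f} min")
```

Under these assumptions, 1.5GB takes roughly 20 minutes at 10 Mbps and roughly 2 minutes at 100 Mbps. Real downloads are usually somewhat slower.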


Once the download is finished, the setup wizard will resume.
Select the installation user.

Select the installation destination folder.
Usually, it is installed in the Program Files folder of Windows.

Click "Install" to begin the installation.

Wait a while, and the installation will be completed.

Launching VOICEVOX
Launch VOICEVOX.
If you checked "Run VOICEVOX" at the end of the installation, it will launch automatically after the installation is complete.
You can also launch it from the Windows Start menu or the desktop shortcut.

Initial setup of VOICEVOX
When you launch VOICEVOX for the first time, you need to agree to the terms of use.

If there are no problems, click "Agree and start using."
The introduction screen for additional characters will open, so click "Complete."

The consent screen for collecting software usage data will open.

Click "Allow" or "Deny."
With this, the initial setup is complete.
The operation screen for entering text and reading audio will open.

Basic usage of VOICEVOX
Now VOICEVOX has been successfully installed on your Windows PC.
Next, we will explain basic usage for actually converting text into audio.
How to use basic audio generation functions
When you open the VOICEVOX screen, character icons and a text input field are displayed.

Click the text input field (the part with the green underline).
It will switch to the editing screen.

Enter your text.

Press the play button at the bottom left of the screen to play the audio and check the generated result.
Clicking the "+" on the text input screen allows you to add a new text input field.

Clicking the character icon opens the character selection menu.

By default, "Shikoku Metan" is displayed, but you can change it to your preferred character, such as "Zundamon" or "Kasukabe Tsumugi," by clicking.

By assigning different characters to multiple text lines, you can also create audio in a conversational format.

However, to use this function effectively, it is necessary to understand the differences in voice quality for each character and select appropriate combinations.
In addition, since detailed adjustment work such as conversation tempo and pauses is required, a certain amount of time and experience is needed until you get used to it.
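The same text-to-audio flow can also be scripted instead of clicked through. The sketch below assumes that the VOICEVOX engine started by the app is reachable at its default local address, http://127.0.0.1:50021, and that it offers the `audio_query` and `synthesis` endpoints. The helper names are ours, and the speaker IDs in the usage comment are placeholders, so check the IDs your own installation reports before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the VOICEVOX engine is listening on its default local address.
ENGINE_URL = "http://127.0.0.1:50021"


def build_audio_query_url(text: str, speaker: int, base_url: str = ENGINE_URL) -> str:
    """Return the URL that asks the engine to analyse `text` for one voice style."""
    params = urllib.parse.urlencode({"text": text, "speaker": speaker})
    return f"{base_url}/audio_query?{params}"


def synthesize(text: str, speaker: int, base_url: str = ENGINE_URL) -> bytes:
    """Return WAV bytes for `text`. Requires a running engine."""
    # Step 1: text -> audio query (JSON holding accent phrases and voice parameters).
    request = urllib.request.Request(
        build_audio_query_url(text, speaker, base_url), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        query = json.load(response)

    # Step 2: audio query -> WAV data.
    request = urllib.request.Request(
        f"{base_url}/synthesis?speaker={speaker}",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Usage (needs the engine running; speaker IDs 3 and 2 are placeholders):
#   conversation = [(3, "こんにちは"), (2, "よろしくお願いします")]
#   for index, (speaker, line) in enumerate(conversation):
#       with open(f"line_{index}.wav", "wb") as f:
#           f.write(synthesize(line, speaker))
```

Giving each line its own speaker ID mirrors assigning a different character to each text field in the editor.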
How to use VOICEVOX's audio export function
Next, we will explain how to use the audio export function.
How to use "Export audio separately"
To export audio, select "File" → "Audio Export."

Select the export destination folder.

Then, the audio will be exported separately for each line.
How to use "Connect and export audio"
In VOICEVOX, you can also connect audio and export it.
Selecting "File" → "Connect and Export Audio" will open the file saving screen.

Enter the file name and save.
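If you have already exported each line separately and later want a single file, the clips can also be joined outside VOICEVOX. This is a minimal sketch using Python's standard `wave` module. It assumes every clip shares the same channel count, sample width, and sampling rate, and the function name is ours.

```python
import io
import wave


def join_wavs(wav_blobs: list[bytes]) -> bytes:
    """Concatenate WAV files that share one audio format into a single WAV."""
    if not wav_blobs:
        raise ValueError("need at least one WAV")
    out = io.BytesIO()
    expected = None
    with wave.open(out, "wb") as writer:
        for blob in wav_blobs:
            with wave.open(io.BytesIO(blob), "rb") as reader:
                fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
                if expected is None:
                    # The first clip decides the output format.
                    expected = fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError("WAV formats differ; convert the clips first")
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()


# Usage: read the separately exported files and write one combined file.
#   clips = [open(name, "rb").read() for name in ("001.wav", "002.wav")]
#   open("combined.wav", "wb").write(join_wavs(clips))
```

The file names in the usage comment are placeholders for whatever names your export produced.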

How to use emotional styles and parameter adjustment
As an advanced usage, VOICEVOX also has a function to change expressions using emotional styles.
It varies by character, but emotional styles such as "Normal," "Sweet," "Tsun-tsun," "Sexy," "Whisper," "Soft Whisper," "Excited," and "Tearful" are available.

Changing the emotional style gives the same text a completely different impression, so you can choose a style to suit the content and direction of your material.
To change the emotional style, hover your mouse cursor over the ">" on the right side of the character selection menu.
Choices will be displayed, so click to select.
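Because the available styles differ by character, it can help to list them programmatically. The sketch below assumes the local engine returns, from its `speakers` endpoint, a JSON list in which each character has a `name` and a `styles` list of `name`/`id` pairs. The sample data is illustrative only: the English names and the ID numbers are placeholders, not values read from a real installation.

```python
def styles_by_character(speakers: list[dict]) -> dict[str, dict[str, int]]:
    """Map each character name to a {style name: style id} dictionary."""
    return {
        speaker["name"]: {style["name"]: style["id"] for style in speaker["styles"]}
        for speaker in speakers
    }


# Illustrative sample in the assumed response shape; names and IDs are placeholders.
SAMPLE = [
    {"name": "Zundamon", "styles": [{"name": "Normal", "id": 3}, {"name": "Sweet", "id": 1}]},
    {"name": "Shikoku Metan", "styles": [{"name": "Normal", "id": 2}]},
]

styles = styles_by_character(SAMPLE)
for character, available in styles.items():
    print(character, "->", ", ".join(available))
```

With a running engine, the parsed JSON from the endpoint would take the place of `SAMPLE`.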
Additionally, you can individually adjust three parameters: accent, intonation, and length.
Accent editing:

Intonation editing:

Length editing:

Each item can be switched at the bottom left of the screen.
Each can be adjusted for every single sound, allowing for reading with more realistic pronunciation.
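The per-sound length adjustment can also be made in a script. The sketch below assumes the audio query produced by the local engine stores each sound (mora) under `accent_phrases` → `moras`, with `vowel_length` and `consonant_length` fields in seconds. It stretches every sound by one factor, which is cruder than the editor's per-sound sliders, and the function name is ours.

```python
import copy


def scale_mora_lengths(query: dict, factor: float) -> dict:
    """Return a copy of an audio query with every mora length multiplied by `factor`."""
    adjusted = copy.deepcopy(query)  # leave the caller's query untouched
    for phrase in adjusted["accent_phrases"]:
        for mora in phrase["moras"]:
            mora["vowel_length"] *= factor
            # Vowel-only moras have no consonant, so the field may be None.
            if mora.get("consonant_length") is not None:
                mora["consonant_length"] *= factor
    return adjusted


# Hand-written query fragment in the assumed shape, for illustration only.
query = {
    "accent_phrases": [
        {
            "accent": 1,
            "moras": [
                {"text": "コ", "consonant": "k", "consonant_length": 0.1,
                 "vowel": "o", "vowel_length": 0.2, "pitch": 5.5},
                {"text": "ン", "consonant": None, "consonant_length": None,
                 "vowel": "N", "vowel_length": 0.1, "pitch": 5.4},
            ],
        }
    ]
}

slower = scale_mora_lengths(query, 2.0)
print(slower["accent_phrases"][0]["moras"][0]["vowel_length"])
```

The adjusted query would then be sent for synthesis in place of the original one.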
【Important】About commercial use of VOICEVOX: Confirmation of terms is important

Caution is needed regarding terms when using VOICEVOX for commercial purposes.
It is necessary to correctly understand the terms of use and perform appropriate credit notation.
From here, we will explain important points for using VOICEVOX commercially in the correct way.
Basic commercial use rules
When using audio generated by VOICEVOX, appropriate credit notation is required for both commercial and non-commercial use.
When noting credit, it is necessary to make it clear that VOICEVOX was used and which character was used.
For example, if Zundamon's voice is used, it should be noted as "VOICEVOX: Zundamon," and if Shikoku Metan's voice is used, it should be noted as "VOICEVOX: Shikoku Metan."
When used for videos such as on YouTube, it should be stated in the video summary section or within the video.
When audio is the only way to give credit, such as with phone audio, insert a spoken credit like "Using Kasukabe Tsumugi from VOICEVOX" within the audio.
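For a video that uses several characters, the credit lines can be generated from the list of voices used. This small sketch follows the "VOICEVOX: character name" format described above. The function name is ours, and you should still confirm the exact wording each character's terms require.

```python
def credit_lines(characters: list[str]) -> list[str]:
    """Return one "VOICEVOX: <name>" line per distinct character, in first-use order."""
    unique = dict.fromkeys(characters)  # dict keys keep insertion order and drop repeats
    return [f"VOICEVOX: {name}" for name in unique]


# Characters in the order their lines appear in the script.
used = ["Zundamon", "Shikoku Metan", "Zundamon"]
print("\n".join(credit_lines(used)))
```

The printed lines can be pasted into the video summary section.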
Usage restrictions by VOICEVOX character
Different terms of use are set for each character in VOICEVOX.
Some characters have special restrictions.
While many characters allow commercial use with appropriate credit notation, there are also characters with restrictions on commercial use.
Characters related to the Tohoku Zunko Project (Zundamon, Tohoku Kiritan, Tohoku Itako, etc.) are basically available for commercial use, but use in political content or adult-oriented content is prohibited.
For some characters, use on affiliate sites or as the voice of an original character may be prohibited.
If you plan on commercial use, it is important to check the individual terms of use for the character you plan to use in advance.
When using standing illustrations in videos, pay attention to the illustration license as well
Furthermore, when a video uses standing illustrations (tachie) of characters like "Zundamon," pay attention to the terms of use for the illustrations.
Illustrations have their own terms of use, separate from those of VOICEVOX and of the character.
Check the terms of use for the illustrations in advance as well.
Licenses are complex, so prior confirmation is very important
A point that requires particular attention in the commercial use of VOICEVOX is that different license systems exist for each character.
Since usage conditions vary greatly by character, if you use multiple characters, it is necessary to check the terms of use for each individually.
In particular, note that the terms of use differ between Tohoku Zunko Project-related characters and other original characters.
Also, when new characters are added in the future, there is a possibility that different terms of use will be set.
When considering use in a company, confirmation by the legal department may be necessary, and it may take time until use can start.
Because of this complex license system, confirming the terms of use can become a major burden if you want to use it for business purposes or commercial use such as YouTube monetization.
Why not try free reading with "Ondoku"?
So far, we have explained the features of VOICEVOX, the installation method on Windows, and how to use it in detail.
However, downloading and installation take time, and the terms of use can be complex and hard to follow.
In such cases, the easy-to-use and multifunctional speech synthesis service "Ondoku" is recommended!
"Ondoku" is an online AI reading service that can be used for free.
Amazingly, you can synthesize up to 5,000 characters for free just by registering your email address!
It can be used from Windows, Mac, Linux, and also from smartphones, and can be used for any purpose such as business, education, and entertainment.
Since commercial use is OK, it also supports YouTube monetization!
Usage is also very simple!
Since it can be used immediately from a browser without installation, you can create high-quality audio as soon as you think of it!
Even if you are currently downloading the VOICEVOX installation file, why not experience "Ondoku" in the meantime?
Multilingual reading possible with high-quality AI voices
The 16 types of Japanese voices in "Ondoku" are rich in variety, including voices for men, women, and children!
Of course, it also supports conversation reading using multiple voices.
"Ondoku" supports about 50 languages, including Japanese, English, Chinese, Korean, Spanish, French, and German.
Since you can produce global content, you can increase YouTube views by targeting the world.
Available right now without installation or download!
Using "Ondoku" is very simple.
Once you open the top page, just enter the text and press the read button!
A natural audio file is generated in seconds.
Since you can use speech synthesis for up to 5,000 characters for free, why not experience "Ondoku" first?
Why don't you create realistic reading audio with the latest AI that is OK for commercial use?
In this article, we explained the features of VOICEVOX, the installation method on Windows, and how to use it in detail.
The greatest charm of VOICEVOX is the ability to create videos using famous characters, starting with "Zundamon."
However, there are also difficult points in usage, such as downloading, installation, and terms of use.
If you want to read text aloud using the latest AI, the web service "Ondoku" is also recommended.
Why don't you also create video audio with "Ondoku", which can be used immediately without downloading?
■ AI voice synthesis software "Ondoku"
"Ondoku" is an online text-to-speech tool that can be used with no initial costs.
- Supports approximately 50 languages, including Japanese, English, Chinese, Korean, Spanish, French, and German
- Available from both PC and smartphone
- Suitable for business, education, entertainment, etc.
- No installation required, can be used immediately from your browser
- Supports reading from images
To use it, simply enter text or upload a file on the site. A natural-sounding audio file will be generated within seconds. You can use voice synthesis up to 5,000 characters for free, so please give it a try.
Email: ondoku3.com@gmail.com
"Ondoku" is a Text-to-Speech service that anyone can use for free without installation. If you register for free, you can get up to 5000 characters for free each month.
