How to use SSML tags with Ondoku's multilingual text-to-speech? How to use the <lang> tag for multilingual audio

Dec. 29, 2025



The multilingual reading feature isn't working well for me~

Are you having trouble getting multilingual text, such as mixed Japanese and English, read aloud correctly?

The multilingual feature of Ondoku is a convenient function that allows you to read aloud multiple languages using a single type of voice (speaker).

Because the same voice is used even when the language changes, you can synthesize speech in multiple languages that sounds consistent from start to finish.

However, you might encounter issues such as:

  • Sentences that use many different languages are not read well
  • English words are pronounced as if they were written in Katakana

Sometimes the AI gets confused when determining the language and fails to read it correctly...

Don't worry!

In such cases, you can just use "SSML tags"!

By using SSML tags, you can specify which part of the sentence should be read in which language, so the switch happens exactly where you want it.

In this article, we will explain how to solve the problem of "pronunciation not working well" in the Ondoku multilingual reading feature using SSML tags!

What can you do with Multilingual Reading × SSML tags?


If you are using the multilingual feature (multilingual reading function) of Ondoku, we highly recommend utilizing SSML tags!

By combining Ondoku's multilingual reading feature with SSML tags, expressions that were previously impossible become achievable.

The biggest advantage is that you can freely mix and read multiple languages within a single text.

Ondoku's multilingual reading feature can handle sentences like this:

Please listen to the following English sentence. My name is Yuki and I'm a high school student.

If the sentence is simple enough to distinguish between the Japanese and English parts, the AI can identify the languages and read them correctly.

However, it may not read complex sentences well.

For example, if many languages are used as in this example, the AI may not be able to identify the languages correctly.

Example using many languages:

Japanese "Konnichiwa" is "Hello" in English, "Bonjour" in French, "Guten Tag" in German, and "你好" in Chinese.

Also, when reading a sentence where English is mixed into Japanese, the English part might end up with Katakana pronunciation.

Example of Katakana pronunciation:

Banana is pronounced as "banana" in English.

If it doesn't read well, you could read each language separately and connect the audio files, but the editing work is difficult.

But it's okay!

In such cases, you can read smoothly by using SSML tags!

When reading the same sentences using SSML tags, both sentences can be read with the correct pronunciation like this:


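<speak>
Japanese "Konnichiwa" is <lang xml:lang="en-US">Hello</lang> in English,
<lang xml:lang="fr-FR">Bonjour</lang> in French,
<lang xml:lang="de-DE">Guten Tag</lang> in German,
and <lang xml:lang="zh-CN">你好</lang> in Chinese.
</speak>

<speak>
Banana is pronounced as <lang xml:lang="en-US">banana</lang> in English.
</speak>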

By giving instructions to the AI with SSML tags, such as "from here in English" or "from here in French," you can create audio with perfect intonation, eliminating the need for editing work to join separate audio files.

Now, let's look at specifically how to use SSML!

Basic Writing of the SSML <lang> Tag: How to Specify Languages


Using SSML tags is very simple.

Just sandwich the text you want to specify between "tags."

How to Write the SSML <lang> Tag to Specify a Language

To specify a language using SSML tags, first, wrap the entire text in <speak> tags.


<speak>The text you want to read here</speak>

Next, wrap the part where you want to specify the language in <lang> tags.


<speak><lang xml:lang="language code">The text you want to read here</lang></speak>

As a specific example, if you want the word "Hello" to be read in American English, you would write it as follows:


<speak><lang xml:lang="en-US">Hello</lang></speak>

By writing it this way, the AI understands, "I should read this part with an American English pronunciation."

If you want to switch languages in the middle of a sentence, just insert this tag at the point where you want to read in a different language.
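
If you prepare your text with a script, the two wrapping steps above can be sketched in Python as follows. This is only a minimal illustration and not part of Ondoku itself: the function names lang_tag and speak are hypothetical helpers made up for this example.

```python
from xml.sax.saxutils import escape, quoteattr


def lang_tag(text: str, code: str) -> str:
    """Wrap text in a <lang> tag for the given language code (e.g. "en-US")."""
    # escape() protects characters such as & and < that would break the markup.
    return f"<lang xml:lang={quoteattr(code)}>{escape(text)}</lang>"


def speak(body: str) -> str:
    """Wrap an already tagged body in the outer <speak> tags."""
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    # Hypothetical usage: the word "Hello" read in American English.
    print(speak(lang_tag("Hello", "en-US")))
```

The printed result is the same one-line SSML as the "Hello" example, ready to paste into the text box.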

What is a Language Code? Basic Knowledge for Switching Multiple Languages

The parts like en-US or ja-JP used inside the tags are called "Language Codes."

They consist of a combination of "language" and "region." For English, American English is "en-US" and British English is "en-GB."

By using different codes even for the same language, you can accurately specify the accents and pronunciations unique to each country.

The main language codes are as follows:

Language               Language Code
Japanese               ja-JP
English (US)           en-US
English (UK)           en-GB
French                 fr-FR
German                 de-DE
Spanish                es-ES
Italian                it-IT
Russian                ru-RU
Chinese (Simplified)   zh-CN
Korean                 ko-KR

Once you get used to it, you can try various languages by just changing this code part.

However, to prevent mistakes in the beginning, we recommend copying and using the templates introduced next.

[Copy & Paste OK] 10 Popular Languages! SSML Template List

We have compiled tags for major languages used around the world.

You can easily read in multiple languages by simply copying the SSML tags from this table and pasting them into the Ondoku text box!

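Language               SSML Tag for Copying
Japanese               <lang xml:lang="ja-JP">Text here</lang>
English (US)           <lang xml:lang="en-US">Text here</lang>
English (UK)           <lang xml:lang="en-GB">Text here</lang>
French                 <lang xml:lang="fr-FR">Text here</lang>
German                 <lang xml:lang="de-DE">Text here</lang>
Spanish                <lang xml:lang="es-ES">Text here</lang>
Italian                <lang xml:lang="it-IT">Text here</lang>
Russian                <lang xml:lang="ru-RU">Text here</lang>
Chinese (Simplified)   <lang xml:lang="zh-CN">Text here</lang>
Korean                 <lang xml:lang="ko-KR">Text here</lang>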

Since English codes are separated into US and UK, you can clearly express the differences in accents.

The tags won't work if even a single character is wrong, so we recommend copying them from this table!
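
If you write tags by hand, a quick check before pasting can catch a missing closing tag or a mistyped code. The following Python sketch uses only the standard library; check_ssml is a hypothetical helper, and KNOWN_CODES contains only the ten codes listed in this article (an assumption made for the example, since other codes may also be valid).

```python
import xml.etree.ElementTree as ET

# Only the ten codes from the table in this article (assumption for this sketch).
KNOWN_CODES = {
    "ja-JP", "en-US", "en-GB", "fr-FR", "de-DE",
    "es-ES", "it-IT", "ru-RU", "zh-CN", "ko-KR",
}

# ElementTree expands the xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def check_ssml(ssml: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError as err:
        # Typical causes: a missing closing tag or an unescaped & character.
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != "speak":
        problems.append("outermost tag is not <speak>")
    for element in root.iter("lang"):
        code = element.get(XML_LANG)
        if code not in KNOWN_CODES:
            problems.append(f"unexpected language code: {code!r}")
    return problems


if __name__ == "__main__":
    print(check_ssml('<speak><lang xml:lang="en-UK">Hello</lang></speak>'))
```

This only checks that the markup is well-formed and that the codes appear in the table. It does not guarantee how any particular voice will pronounce the text.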

Practical! Multilingual Reading Examples & Usage by Scene

Now that you know how to write SSML tags, let's look at specific examples and usage to see how they are useful in actual scenes!

[Free] Explaining How to Create Multilingual Audio with Ondoku

To create multilingual audio with Ondoku, first open the Ondoku top page.


Then enter the text in the text box.

This time, we will use the example sentence introduced at the beginning, which is difficult to read with the multilingual feature alone.

It uses Japanese, English, French, German, and Chinese.

Japanese "Konnichiwa" is "Hello" in English, "Bonjour" in French, "Guten Tag" in German, and "你好" in Chinese.

Next, add SSML tags to the text.

This text has two characteristics:

  • The main language is Japanese
  • Foreign languages are used only in parts

When the main language is clearly identifiable like this, you only need to add SSML tags to the parts where other languages are used.

(If this method doesn't work, see "How to Add SSML Tags to Sentences Where the Main Language is Hard to Determine" later in this article.)

This time, we will enter SSML tags for the four languages other than Japanese:

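  • English: <lang xml:lang="en-US">Text here</lang>
  • French: <lang xml:lang="fr-FR">Text here</lang>
  • German: <lang xml:lang="de-DE">Text here</lang>
  • Chinese: <lang xml:lang="zh-CN">Text here</lang>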

It will look like this when the SSML tags are entered:


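<speak>
Japanese "Konnichiwa" is <lang xml:lang="en-US">Hello</lang> in English,
<lang xml:lang="fr-FR">Bonjour</lang> in French,
<lang xml:lang="de-DE">Guten Tag</lang> in German,
and <lang xml:lang="zh-CN">你好</lang> in Chinese.
</speak>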

*You can also enter the SSML tags in a text editor like Notepad beforehand and then copy and paste them.
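
If you would rather generate the tagged text than type it, the tagging step above can be sketched in Python like this. It is a rough illustration under stated assumptions: build_ssml and SEGMENTS are hypothetical names, and the main-language parts are left untagged just as described above.

```python
from xml.sax.saxutils import escape

# (text, language code) pairs; None means "leave untagged" (the main language).
SEGMENTS = [
    ('Japanese "Konnichiwa" is ', None),
    ("Hello", "en-US"),
    (" in English, ", None),
    ("Bonjour", "fr-FR"),
    (" in French, ", None),
    ("Guten Tag", "de-DE"),
    (" in German, and ", None),
    ("你好", "zh-CN"),
    (" in Chinese.", None),
]


def build_ssml(segments) -> str:
    """Join the segments, wrapping only the tagged ones in <lang>."""
    parts = []
    for text, code in segments:
        safe = escape(text)  # protect & and < in the text itself
        if code is None:
            parts.append(safe)
        else:
            parts.append(f'<lang xml:lang="{code}">{safe}</lang>')
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(build_ssml(SEGMENTS))
```

Keeping the sentence as a list of pairs makes it easy to change one language code later without touching the rest of the text.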

When you enter this into the text box, it looks like this:


Using Generative AI Services to Enter SSML Tags is Also Recommended

But entering so many tags is too much work!

Don't worry!

By using generative AI services such as ChatGPT, Gemini, or Claude, you can easily enter SSML tags!

Adding SSML tags with a generative AI service is very simple. Just send a prompt like this:

Please add SSML lang tags for each language.

(Text you want to read here)

By giving instructions like this, you can automatically insert SSML tags throughout the text.


If you want to make corrections, such as "I want to read in British English instead of American English," you can say:

Please change American English to British English.

The AI will immediately correct <lang xml:lang="en-US"> to <lang xml:lang="en-GB">.
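
The same correction can also be made without an AI service, because it is a plain text replacement. Here is a minimal Python sketch; swap_language is a hypothetical helper name used only for this example.

```python
def swap_language(ssml: str, old: str, new: str) -> str:
    """Replace every xml:lang="old" attribute with xml:lang="new"."""
    # Matching the whole attribute avoids touching the same letters in the text.
    return ssml.replace(f'xml:lang="{old}"', f'xml:lang="{new}"')


if __name__ == "__main__":
    before = '<speak><lang xml:lang="en-US">Hello</lang></speak>'
    print(swap_language(before, "en-US", "en-GB"))
```

Tags for other languages are left exactly as they were, so only the American English parts switch to British English.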

Points for Choosing a Multilingual Reading Voice

Select "Multilingual" from the language options.


Next, choose the voice (speaker).


Since this is a sentence where foreign languages are included within Japanese, I chose the Japanese voice "Masaru(ja)".

You can listen to samples of multilingual AI voices in this article.

Please take a look.

Now the preparation for reading is complete.


Click "Read" to start the speech synthesis.

Speech synthesis is completed in just a few seconds.

When the reading process is finished, the screen switches and the audio player is displayed.


In this way, the languages used in the sentence were automatically identified and read aloud.

This completes the flow of reading multilingual text with Ondoku's multilingual feature!

Click "Download" to save the audio file in MP3 format.

The multilingual feature (multilingual reading function) can be conveniently used in various situations, such as language learning materials, YouTube videos for overseas audiences, and announcement broadcasts for inbound tourists.

Why don't you try creating audio using Ondoku's multilingual feature for free?

How to Add SSML Tags to Sentences Where the Main Language is Hard to Determine

The examples used so far were texts where Japanese was the main language, so we were able to read them well by adding <lang> tags to the other parts like English or French.

However, in texts where it is unclear which language is the main one, such as a vocabulary list for language learning, the AI may not be able to identify the languages correctly.

In such cases, please add <lang> tags to every part of the text, including the Japanese parts.

For example:

料理に関する英単語集 (English vocabulary list related to cooking)
キッチン Kitchen
レシピ Recipe
フライパン Frying pan
包丁 Knife
調味料 Seasoning

If you want to read this text:


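<speak>
<lang xml:lang="ja-JP">料理に関する英単語集</lang><break time="1s"/>
<lang xml:lang="ja-JP">キッチン</lang><lang xml:lang="en-US">Kitchen</lang><break time="1s"/>
<lang xml:lang="ja-JP">レシピ</lang><lang xml:lang="en-US">Recipe</lang><break time="1s"/>
<lang xml:lang="ja-JP">フライパン</lang><lang xml:lang="en-US">Frying pan</lang><break time="1s"/>
<lang xml:lang="ja-JP">包丁</lang><lang xml:lang="en-US">Knife</lang><break time="1s"/>
<lang xml:lang="ja-JP">調味料</lang><lang xml:lang="en-US">Seasoning</lang><break time="1s"/>
</speak>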

By adding SSML tags to both the Japanese and English parts like this, you can read it correctly.

You Can Also Adjust the "Pause" in Audio with Ondoku's SSML Feature!

In the "English vocabulary list related to cooking" example introduced above, I actually used not only <lang> tags but also the SSML <break time="1s"/> tag.

This is an SSML tag for adjusting the "pause" in the audio.

By using this tag, you can read text even more naturally with Ondoku.
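
For long vocabulary lists, typing both kinds of tags on every line gets tedious, so here is a small Python sketch that generates them. Please treat it as an illustration only: vocabulary_ssml is a hypothetical helper, and placing one pause after each entry is an assumption about where the <break> tags go.

```python
from xml.sax.saxutils import escape

# Japanese word and its English counterpart, taken from the vocabulary list example.
VOCABULARY = [
    ("キッチン", "Kitchen"),
    ("レシピ", "Recipe"),
    ("フライパン", "Frying pan"),
    ("包丁", "Knife"),
    ("調味料", "Seasoning"),
]


def vocabulary_ssml(title: str, pairs, pause: str = "1s") -> str:
    """Tag every Japanese and English part, with a pause after each line."""
    brk = f'<break time="{pause}"/>'
    lines = [f'<lang xml:lang="ja-JP">{escape(title)}</lang>{brk}']
    for japanese, english in pairs:
        lines.append(
            f'<lang xml:lang="ja-JP">{escape(japanese)}</lang>'
            f'<lang xml:lang="en-US">{escape(english)}</lang>'
            f"{brk}"
        )
    return "<speak>\n" + "\n".join(lines) + "\n</speak>"


if __name__ == "__main__":
    print(vocabulary_ssml("料理に関する英単語集", VOCABULARY))
```

Changing the pause argument, for example to "2s", adjusts every pause in the list at once.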

This article explains how to adjust the pause in audio using SSML tags, so please take a look.

Also, general usage of SSML tags in Ondoku is explained in this article.

Please take a look.

Why Not Try Ondoku's Multilingual Reading Feature?

In this article, we explained SSML tags that can be used in Ondoku's multilingual reading.

Just by using the "lang tag" introduced in this article, the scope of Ondoku's utility expands instantly!

  • Add English phrases to YouTube video narrations
  • Create authentic listening materials for English or other foreign languages
  • Create multilingual announcement broadcasts for stores

Depending on your ideas, you can create various multilingual content.

We hope that Ondoku will be useful for your activities, such as multilingual YouTube videos or broadcasts in stores and facilities!

■ AI voice synthesis software "Ondoku"

"Ondoku" is an online text-to-speech tool that can be used with no initial costs.

  • Supports approximately 50 languages, including Japanese, English, Chinese, Korean, Spanish, French, and German
  • Available from both PC and smartphone
  • Suitable for business, education, entertainment, etc.
  • No installation required, can be used immediately from your browser
  • Supports reading from images

To use it, simply enter text or upload a file on the site. A natural-sounding audio file will be generated within seconds. You can use voice synthesis up to 5,000 characters for free, so please give it a try.

The text-to-speech software "Ondoku" reads text aloud with AI voices. If you sign up for free, you can convert up to 5,000 characters per month from text to speech for free. You can easily download MP3s, and commercial use is also possible. Try Ondoku now.
HP: ondoku3.com
Email: ondoku3.com@gmail.com