ChindaTTS

Natural Thai-English Text-to-Speech, now live

Production neural TTS for Thai + English — voice assistants, IVR, announcements and narration. Natural Thai prosody, correct numbers and dates, clean long-form, and voice cloning, behind an API that is drop-in compatible with iApp TTS v3.

🔊 Hear it live at one.iapp.co.th

Talk to us about a pilot

What it does

🎙️ 2 natural voices

Kaitom (flagship) and Kaimook (female) — each with a distinct, natural central-Thai delivery.

🎭 8 speaking styles

Neutral, friendly, cheerful, calm, serious, sad, excited and empathetic — pick the tone per request.
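As a rough illustration of picking a voice and tone per request, the sketch below checks a request against the two voices and eight styles listed above. The field names (text, voice, style) and the lowercase identifiers are assumptions made for this example only; the real request schema is whatever the iApp TTS v3 API defines.

```python
# Hypothetical request builder. Field names and lowercase IDs are assumptions,
# not the documented ChindaTTS / iApp TTS v3 schema.
VOICES = {"kaitom", "kaimook"}
STYLES = {"neutral", "friendly", "cheerful", "calm",
          "serious", "sad", "excited", "empathetic"}


def build_request(text: str, voice: str = "kaitom", style: str = "neutral") -> dict:
    """Return a payload dict after checking voice and style against the fixed menus."""
    voice, style = voice.lower(), style.lower()
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {sorted(VOICES)}")
    if style not in STYLES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(STYLES)}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"text": text, "voice": voice, "style": style}
```

Validating on the client side like this fails fast on a misspelled style instead of spending a synthesis call on it.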

🌏 Thai + English code-switch

Mix Thai and English in one sentence. Pure-English passages auto-switch to a native-English mode, per sentence.
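To make the per-sentence switch concrete, here is a minimal sketch of one way such a decision could be made: a sentence with no Thai characters is treated as pure English, anything else as Thai or mixed. This is an assumption about the general technique, not a description of how ChindaTTS decides; the function name, the mode labels and the naive sentence splitter are all hypothetical.

```python
import re

THAI = re.compile(r"[\u0E00-\u0E7F]")            # Thai Unicode block
SENTENCE_END = re.compile(r"(?<=[.!?])\s+|\n+")  # naive splitter; Thai often has no end punctuation


def sentence_modes(text: str) -> list[tuple[str, str]]:
    """Label each sentence 'english' if it has no Thai characters, else 'thai-mixed'."""
    labelled = []
    for sentence in SENTENCE_END.split(text):
        sentence = sentence.strip()
        if not sentence:
            continue
        mode = "thai-mixed" if THAI.search(sentence) else "english"
        labelled.append((sentence, mode))
    return labelled
```

With this rule, "Welcome to our service." would be labelled english, while a sentence that mixes Thai with English words such as "check balance" would stay thai-mixed.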

🔢 Reads everything correctly

Numbers, dates, currency, percentages, phone/ID runs, abbreviations, emails, URLs and ALL-CAPS emphasis — spoken the way a person would.
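For readers unfamiliar with why number reading is hard in Thai, the sketch below spells out small integers using the standard reading rules: 10 is "sip" rather than "one ten", 20 is "yi sip", and a trailing 1 after higher digits is read "et". It only illustrates the kind of normalization involved and is not ChindaTTS's normalizer; the function name is hypothetical and the range is limited to 0 to 999,999.

```python
DIGITS = ["ศูนย์", "หนึ่ง", "สอง", "สาม", "สี่", "ห้า", "หก", "เจ็ด", "แปด", "เก้า"]
PLACES = ["", "สิบ", "ร้อย", "พัน", "หมื่น", "แสน"]  # ones up to hundred-thousands


def read_thai_number(n: int) -> str:
    """Spell out an integer 0 <= n < 1,000,000 in Thai words."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch covers 0..999,999 only")
    if n == 0:
        return DIGITS[0]
    s = str(n)
    words = []
    for i, ch in enumerate(s):
        d, place = int(ch), len(s) - 1 - i
        if d == 0:
            continue                       # zeros are silent inside a number
        if place == 1 and d == 1:
            words.append("สิบ")            # 10 is "sip", not "nueng sip"
        elif place == 1 and d == 2:
            words.append("ยี่สิบ")          # 20 is "yi sip"
        elif place == 0 and d == 1 and len(s) > 1:
            words.append("เอ็ด")           # trailing 1 is "et" after higher digits
        else:
            words.append(DIGITS[d] + PLACES[place])
    return "".join(words)
```

Dates, currency and phone or ID runs each need their own rules on top of this (a phone number, for instance, is read digit by digit), which is why a dedicated normalization step matters.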

🧬 Voice cloning

Clone a voice from ~10–20 seconds of consented audio. Stateless — the sample is used once and never stored.

⚡ Streaming, long-form

First audio in under ~1 second, and full-length narration up to ~100 seconds per request — no run-on, no cut-off.

Hear it for yourself

Real, unedited ChindaTTS samples. For the full interactive demo — every voice, every style, and live voice cloning — visit the showcase at one.iapp.co.th.

The main voices

Kaitom (flagship)

Kaimook (female)

Numbers, dates and money — read correctly

Thai + English code-switching

Speaking styles — same engine, different delivery

Cheerful

Calm

Empathetic

ChindaTTS Prime (optional)

ChindaTTS Prime adds a speech-recognition gate that re-rolls a garbled or cut-off take. It selects a better take and never alters the audio, so the sound is the same; the gain is stability on hard inputs, especially expressive styles.
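The gate described above can be sketched as a generate, transcribe and score loop. This is a minimal illustration under stated assumptions, not Prime's actual implementation: the synthesize and transcribe callables, the 10% threshold and the three-take limit are all hypothetical, and a real gate would also normalize the text (numbers, abbreviations) before comparing.

```python
from typing import Callable


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance divided by reference length."""
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, 1):
        cur = [i]
        for j, h in enumerate(hypothesis, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / max(len(reference), 1)


def gated_synthesis(text: str,
                    synthesize: Callable[[str], bytes],
                    transcribe: Callable[[bytes], str],
                    max_cer: float = 0.10,   # assumed threshold
                    max_takes: int = 3) -> bytes:  # assumed retry budget
    """Re-roll until a take passes the gate; return the best take unmodified."""
    if max_takes < 1:
        raise ValueError("max_takes must be at least 1")
    best_audio, best_score = b"", float("inf")
    for _ in range(max_takes):
        audio = synthesize(text)
        try:
            score = cer(text, transcribe(audio))
        except Exception:
            return audio  # recognizer unavailable: behave like the standard path
        if score < best_score:
            best_audio, best_score = audio, score
        if score <= max_cer:
            break
    return best_audio
```

Because the loop only chooses among takes, a request that passes on the first attempt pays just for one extra recognition pass, while hard inputs pay for additional synthesis attempts.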

Metric (CER unless noted; lower is better)	ChindaTTS	ChindaTTS Prime
Everyday text	3.2%	3.2%
Numbers / dates / currency	1.8%	1.4%
Code-switch	7.9%	7.6%
Expressive styles	8.3%	3.9%
Bad-take rate (styles)	15.0%	6.7%
Worst-case p90 (styles)	31.8%	11.4%

Everyday text already sits at the recognizer noise floor; Prime's value shows up on expressive styles, where it cuts the bad-take rate by more than half (15.0% to 6.7%). The cost is about 9% more processing time on easy text, about 2 GB more VRAM and a second container. If the recognizer is down, Prime falls back to standard ChindaTTS.
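The size of the improvement on expressive styles can be checked directly from the table. The short calculation below uses only the published figures; the function name is ours.

```python
# Values copied from the table above: (ChindaTTS, ChindaTTS Prime), in percent.
results = {
    "Expressive styles CER": (8.3, 3.9),
    "Bad-take rate (styles)": (15.0, 6.7),
    "Worst-case p90 (styles)": (31.8, 11.4),
}


def relative_reduction(base: float, prime: float) -> float:
    """Fractional reduction when going from the base figure to the Prime figure."""
    return (base - prime) / base


for name, (base, prime) in results.items():
    print(f"{name}: {relative_reduction(base, prime):.0%} lower with Prime")
```

The three reductions come out at roughly 53%, 55% and 64%, so the largest gain is in the worst-case tail rather than the average.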

Speed (per GPU, ~6–7× real-time)

Request	Audio	Processing
~100 chars	~9 s	~1.5 s
~300 chars	~25 s	~4 s
~1000 chars	~90 s	~14 s

Streaming delivers first audio in under ~1 second; throughput scales by adding GPUs.
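The real-time factor in the heading follows from the table: audio duration divided by processing time. The sketch below reproduces that arithmetic and adds a rough batch-sizing estimate. The gpus_needed helper and its steady 24-hour load are assumptions for illustration, not a vendor sizing guide, and it does not model concurrent low-latency streams.

```python
import math

# (characters, audio seconds, processing seconds) from the table above; all approximate.
rows = [(100, 9, 1.5), (300, 25, 4), (1000, 90, 14)]


def real_time_factor(audio_s: float, processing_s: float) -> float:
    """Seconds of audio produced per second of processing."""
    return audio_s / processing_s


def gpus_needed(audio_hours_per_day: float, rtf: float = 6.0) -> int:
    """GPUs for a daily batch, assuming each GPU runs flat out for 24 hours."""
    return max(1, math.ceil(audio_hours_per_day / (24 * rtf)))


for chars, audio_s, proc_s in rows:
    print(f"{chars} chars: {real_time_factor(audio_s, proc_s):.2f}x real-time")
```

The three rows give factors of about 6.0, 6.25 and 6.43, consistent with the stated ~6–7× range. Under the assumptions above, a batch of 500 audio hours per day would need 4 GPUs at the conservative 6× figure.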

Run / deployment

Requirements: one current-generation GPU (~8 GB VRAM, or ~10 GB with Prime; 12–16 GB recommended), 16 GB RAM, Linux, and one container (plus one for Prime). Available as a managed API or on-premise. Production-ready for pilots: voices, cloning, streaming, the iApp v3 API and Prime are all live and tested.

Good to know

Thai + Latin-script only; other scripts (CJK, Arabic) are dropped.
Up to ~1,200 characters (~100 s) per request; split longer text into multiple requests or run it as a batch job.
Tone is set via the fixed style menu; rate via the speed parameter.
One stream per GPU; concurrency scales with more GPUs.
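Since longer text has to be split across requests, a client needs some chunking step. The sketch below is one simple approach, assuming the ~1,200-character figure is a hard limit: it packs whole sentences greedily and only hard-splits a sentence that is itself too long. The function name and the naive sentence splitter are hypothetical.

```python
import re

MAX_CHARS = 1200  # approximate per-request limit stated above


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chars characters."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+|\n+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is over the limit on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the audio concatenated in order. Note that the hard-split fallback can cut through a word, which matters for Thai because words are not separated by spaces; a production splitter would prefer clause or word boundaries.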

Ready to give your product a Thai voice?

Hear every voice and style on the live showcase, then talk to us about a pilot.

🔊 Open the live showcase

Contact sales
