Introducing ChindaTTS — Production Thai-English Text-to-Speech, Now Live

Today we are launching ChindaTTS, our production neural text-to-speech engine for Thai and English. It speaks natural Thai with correct prosody, reads numbers, dates, money, and code-switched text the way a person would, handles long-form narration cleanly, and can clone a voice from a short consented sample. All of this sits behind an API that is drop-in compatible with iApp TTS v3.
👉 Hear every voice and style live at one.iapp.co.th — our new One iApp showcase for Text, Voice and Vision.
Why ChindaTTS
General-purpose TTS has always treated Thai as an afterthought — wrong tones, mangled final consonants, robotic delivery, and a complete inability to mix Thai with English. ChindaTTS was built for Thai from the ground up. The result is speech that sounds like a real Thai speaker, even when the sentence is full of numbers, English loanwords, and brand names.
What it does
- Thai + English + code-switch in one request. The text frontend reads numbers, dates, currency, percentages, phone/ID runs, abbreviations, emails, URLs, ALL-CAPS emphasis, and English tech loanwords naturally.
- Correct central-Thai accent. Pure-English passages automatically switch to a native-English mode, per sentence.
- Long-form narration spoken in full, with no run-ons and no cut-offs, up to ~100 seconds of audio per request.
- 3 voices (Kaitom, Kaimook, Kai Daeng) × 8 speaking styles (neutral, friendly, cheerful, calm, serious, sad, excited, empathetic).
- Voice cloning from ~10–20 seconds of consented audio; stateless — used once, never stored.
- Streaming — first audio in under ~1 second; drop-in iApp v3 endpoint.
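Because a single request tops out at roughly 100 seconds of audio, a very long script may need to be split on the client before it is sent. The following is a minimal sketch of one way to do that, not part of the ChindaTTS API: the `split_for_tts` helper and its `max_chars` budget are hypothetical, and the default of 1000 characters is an assumption based on the published figure of ~1000 characters for ~90 seconds of audio. It breaks only at whitespace, on the assumption that spaces in Thai text usually fall at phrase or sentence boundaries.

```python
import re

def split_for_tts(text: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack whitespace-delimited pieces into chunks of at most max_chars.

    Hypothetical client-side helper; the budget is an assumption, not an API limit.
    """
    pieces = re.findall(r"\S+\s*", text.strip())
    chunks: list[str] = []
    current = ""
    for piece in pieces:
        # A single unbroken run longer than the budget has to be hard-split.
        while len(piece.rstrip()) > max_chars:
            if current:
                chunks.append(current.rstrip())
                current = ""
            chunks.append(piece[:max_chars])
            piece = piece[max_chars:]
        # Start a new chunk when adding this piece would exceed the budget.
        if current and len((current + piece).rstrip()) > max_chars:
            chunks.append(current.rstrip())
            current = ""
        current += piece
    if current.strip():
        chunks.append(current.rstrip())
    return chunks

script = "สวัสดีครับ " * 50          # 50 short phrases separated by spaces
chunks = split_for_tts(script, max_chars=100)
print(len(chunks), max(len(c) for c in chunks))
```

Each chunk can then be synthesized as its own request and the audio concatenated in order. Splitting at spaces rather than at a fixed offset keeps each chunk ending on a natural pause.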
Hear it for yourself
These are real, unedited ChindaTTS samples. For the full interactive demo — every voice, every style, and live voice cloning — visit one.iapp.co.th.
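The three main voices
Kaitom (flagship): [audio sample]
Kai Daeng (male): [audio sample]
Kaimook (female): [audio sample]
Numbers, dates and money: read correctly
[audio sample]
Thai + English code-switching
[audio sample]
Speaking styles (same engine, different delivery)
Cheerful: [audio sample]
Calm: [audio sample]
Empathetic: [audio sample]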
ChindaTTS Prime (optional)
ChindaTTS Prime adds a speech-recognition gate that detects a garbled or cut-off take and re-rolls it. The gate only selects a better take; it never alters the audio. The voice sounds the same, and the gain is stability on hard inputs, especially expressive styles.
| Metric — CER (lower is better) | ChindaTTS | ChindaTTS Prime |
|---|---|---|
| Everyday text | 3.2% | 3.2% |
| Numbers / dates / currency | 1.8% | 1.4% |
| Code-switch | 7.9% | 7.6% |
| Expressive styles | 8.3% | 3.9% |
| Bad-take rate (styles) | 15.0% | 6.7% |
| Worst-case p90 (styles) | 31.8% | 11.4% |
Everyday text already sits at the recognizer noise floor, so Prime's value shows up on expressive styles: CER drops from 8.3% to 3.9%, and the bad-take rate falls from 15.0% to 6.7%, a cut of more than half.
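The idea of a recognition gate can be sketched as a best-of-N loop: synthesize a take, transcribe it, score the transcript against the input text, and re-roll if the score is poor. The code below is an illustration under stated assumptions, not Prime's actual implementation. The source does not say how Prime scores takes; CER is used here because it is the metric reported in the table. The `synthesize` and `transcribe` callables, the `max_cer` threshold of 0.10, and the limit of 3 takes are all hypothetical.

```python
from typing import Callable, Tuple

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance divided by reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, h in enumerate(hypothesis, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1] / len(reference)

def gated_synthesis(
    text: str,
    synthesize: Callable[[str], object],   # hypothetical TTS call
    transcribe: Callable[[object], str],   # hypothetical ASR call
    max_cer: float = 0.10,                 # assumed acceptance threshold
    max_takes: int = 3,                    # assumed re-roll budget
) -> Tuple[object, float]:
    """Return the best take seen; the audio itself is never modified."""
    best_audio, best_score = None, float("inf")
    for _ in range(max_takes):
        audio = synthesize(text)
        score = cer(text, transcribe(audio))
        if score < best_score:
            best_audio, best_score = audio, score
        if score <= max_cer:
            break  # good enough, stop re-rolling
    return best_audio, best_score

# Stub demo: the "audio" is just a string, and the first take is cut off.
takes = iter(["hello wor", "hello world"])
audio, score = gated_synthesis("hello world", lambda t: next(takes), lambda a: a)
print(audio, score)
```

Because the gate only chooses among takes, a take that already passes costs one extra transcription and nothing more, which is consistent with everyday text showing no change between the two columns.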
Speed (per GPU, ~6–7× real-time)
| Request | Audio | Processing |
|---|---|---|
| ~100 chars | ~9 s | ~1.5 s |
| ~300 chars | ~25 s | ~4 s |
| ~1000 chars | ~90 s | ~14 s |
Streaming delivers first audio in under ~1 second, and throughput scales by adding GPUs.
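The real-time factor in the heading follows directly from the table: divide seconds of audio produced by seconds of processing. The short sketch below reproduces that arithmetic and adds a rough capacity estimate. The `gpus_needed` helper is hypothetical and assumes throughput scales linearly with GPU count, which the text above states only in general terms.

```python
import math

# (characters, audio seconds, processing seconds), copied from the table above
ROWS = [(100, 9.0, 1.5), (300, 25.0, 4.0), (1000, 90.0, 14.0)]

def speedup(audio_s: float, proc_s: float) -> float:
    """Seconds of audio produced per second of processing."""
    return audio_s / proc_s

def gpus_needed(audio_s_per_wall_s: float, per_gpu_speedup: float) -> int:
    """GPUs required to sustain a given audio demand, assuming linear scaling."""
    return math.ceil(audio_s_per_wall_s / per_gpu_speedup)

speedups = [speedup(a, p) for _, a, p in ROWS]
for (chars, a, p), s in zip(ROWS, speedups):
    print(f"~{chars} chars: {a:.0f} s audio / {p} s processing = {s:.2f}x real-time")

# Example: sustaining 20 seconds of audio per wall-clock second,
# sized against the slowest row as a conservative estimate.
print(gpus_needed(20, min(speedups)))
```

Sizing against the lowest measured speedup leaves some headroom, since the longer requests in the table run slightly faster relative to real time.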
Where it fits
ChindaTTS fits voice assistants and IVR, in-app and announcement narration, accessibility, e-learning, notifications, and any product that needs a natural Thai voice that also handles English without breaking stride.
Try ChindaTTS
- 🔊 Listen to the live showcase: one.iapp.co.th
- 📄 Product page & specs: iapp.co.th/products/chindaTTS
- 📬 Talk to us about a pilot: sale@iapp.co.th
ChindaTTS is production-ready for pilots today — voices, cloning, streaming, the iApp v3 API, and Prime are all live and tested. We can't wait to hear what you build with it.