Thai Voice Cloning is Here — Clone Any Thai Voice in 10 Seconds

April 28, 2026 · 5 min read

CEO @ iApp Technology

Image: Thai Voice Cloning - Kaitom Voice V3

We're excited to announce Thai Voice Cloning, the newest capability inside our Kaitom Voice V3 TTS API. Hand the model a clean 8–12 second Thai voice clip plus its literal transcript, and the API will speak any Thai text you want — in that exact voice.

This is the first production-grade Thai voice cloning API built specifically for the Thai language.

Why Thai Voice Cloning Matters

General-purpose voice cloning tools have been available for English for some time, but they handle Thai poorly: wrong tones, mispronounced final consonants, robotic prosody, and no ability to handle Thai-English code-mixing.

Kaitom Voice V3's cloning endpoint was trained on Thai natively. That means:

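Correct Thai tones, including those written with the tone marks ไม้เอก, ไม้โท, ไม้ตรี, and ไม้จัตวา (mai ek, mai tho, mai tri, mai chattawa), preserved as the speaker produces them.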
Final consonants pronounced naturally instead of dropped.
Numbers, dates, currency read out in Thai conventions automatically.
Mixed Thai–English content handled in a single utterance.

You get a voice that actually sounds like the person you cloned — not a robot wearing their hat.

How It Works

Figure: Thai voice cloning pipeline (iApp Technology)

The reference clip captures the speaker's timbre and prosody; the V3 engine applies that to whatever Thai text you provide.

Try It in 60 Seconds

The fastest path is the in-browser demo — record yourself, type Thai text, hear it back in your own voice:

👉 Open the interactive demo

API Quick Start

cURL

curl -X POST 'https://api.iapp.co.th/v3/store/audio/tts/clone' \
    --header 'apikey: YOUR_API_KEY' \
    --form 'text=สวัสดีครับ วันนี้ทดสอบการโคลนเสียงด้วย AI' \
    --form 'speed=1.0' \
    --form 'ref_text=ฮัลโหล สวัสดีครับ ผมชื่อไข่ต้ม' \
    --form 'ref_audio=@reference.wav' \
    --output 'output.pcm'

Python

import requests

url = "https://api.iapp.co.th/v3/store/audio/tts/clone"
headers = {"apikey": "YOUR_API_KEY"}

# Send the reference clip and both text fields as multipart form data.
with open("reference.wav", "rb") as ref:
    files = {"ref_audio": ref}
    data = {
        "text": "สวัสดีครับ วันนี้ทดสอบการโคลนเสียงด้วย AI",
        "ref_text": "ฮัลโหล สวัสดีครับ ผมชื่อไข่ต้ม",
        "speed": "1.0",
    }
    r = requests.post(url, headers=headers, data=data, files=files)

# Stop on an HTTP error so that an error body is not saved as audio.
r.raise_for_status()

# The body is raw PCM, not a playable WAV file.
with open("output.pcm", "wb") as f:
    f.write(r.content)

The response is raw signed 16-bit little-endian PCM, mono, 24 kHz, streamed as the audio is generated. To play it, wrap it in a 44-byte WAV header on the client, or convert it with ffmpeg:

ffmpeg -f s16le -ar 24000 -ac 1 -i output.pcm output.wav
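
If ffmpeg is not available, the header can also be added in Python. The following is a minimal sketch that uses only the standard-library `wave` module and assumes the response body is exactly the 16-bit mono 24 kHz PCM described above. The function name `pcm_to_wav_bytes` is illustrative and not part of the iApp API.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000   # per the response format stated above
SAMPLE_WIDTH_BYTES = 2    # signed 16-bit samples
CHANNELS = 1              # mono


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Return the raw PCM wrapped in a standard 44-byte WAV header."""
    if len(pcm) % SAMPLE_WIDTH_BYTES:
        # A trailing odd byte is half of a truncated sample; drop it.
        pcm = pcm[:-1]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()
```

To use it, read `output.pcm` in binary mode, pass the bytes to `pcm_to_wav_bytes`, and write the result to `output.wav`.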

Request Fields

Field	Type	Required	Notes
`text`	string	yes	Thai text to synthesize
`ref_text`	string	yes	Literal transcript of `ref_audio` (word-for-word, not a description)
`ref_audio`	file	yes	WAV or MP3, 8–12 s of clean mono Thai speech
`speed`	float	no	`0.8`–`1.2`, default `1.0`
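
Checking these constraints on the client before sending can save a round trip. The sketch below mirrors the table above; the function name and the wording of the messages are assumptions, and the checks the server actually performs are not documented in this post.

```python
from pathlib import Path

SPEED_MIN, SPEED_MAX = 0.8, 1.2       # range from the table above
ALLOWED_SUFFIXES = {".wav", ".mp3"}   # accepted ref_audio formats


def validate_clone_request(text: str, ref_text: str, ref_audio_path: str,
                           speed: float = 1.0) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    if not text.strip():
        problems.append("text is required")
    if not ref_text.strip():
        problems.append("ref_text is required")
    # Only the file extension is checked here, not the audio content.
    if Path(ref_audio_path).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("ref_audio must be a WAV or MP3 file")
    if not SPEED_MIN <= speed <= SPEED_MAX:
        problems.append(f"speed must be between {SPEED_MIN} and {SPEED_MAX}")
    return problems
```

Returning a list instead of raising on the first failure lets a form or CLI report every problem at once.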

Tips for a Great Clone

The clone quality is bounded by the reference clip. A few minutes of curating pays off:

Record clean audio. No background music, no traffic noise, no overlapping voices.
Use a single speaker. Don't include "uh", coughs, or interjections from someone else.
Match the transcript exactly. ref_text must say what ref_audio says, character-for-character. Mismatches cause prosody drift and speed-up artifacts.
Stay in the 8–12 s window. Shorter clips lose timbre; clips longer than 15 s are silently trimmed, so the trailing portion of ref_text no longer matches the audio.
Use one consistent speaking style. If the reference is calm and the target text is shouted, expect inconsistent results.
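
The length and channel tips can be checked automatically before upload. This is a rough sketch for uncompressed PCM WAV files only, using the standard-library `wave` module; the thresholds come from the tips above, while the function name and warning text are illustrative. It cannot detect noise, extra speakers, or transcript mismatches.

```python
import wave

MIN_SECONDS, MAX_SECONDS = 8.0, 12.0   # recommended window
TRIM_LIMIT_SECONDS = 15.0              # audio beyond this is trimmed


def check_reference_wav(source) -> list[str]:
    """Inspect a WAV file (path or binary file object) and return warnings."""
    with wave.open(source, "rb") as wav:
        seconds = wav.getnframes() / wav.getframerate()
        channels = wav.getnchannels()
    warnings = []
    if channels != 1:
        warnings.append(f"expected mono audio, found {channels} channels")
    if seconds < MIN_SECONDS:
        warnings.append(f"clip is {seconds:.1f} s; short clips may lose timbre")
    elif seconds > TRIM_LIMIT_SECONDS:
        warnings.append(f"clip is {seconds:.1f} s; audio past 15 s is trimmed")
    elif seconds > MAX_SECONDS:
        warnings.append(f"clip is {seconds:.1f} s; 8-12 s is recommended")
    return warnings
```

An empty list means the clip passed these basic checks; listening to the clip is still the best test of audio quality.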

What You Can Build

Personalized Audiobooks

Authors can narrate their own books in Thai, then clone the voice to extend coverage to chapters they didn't have time to record.

Brand Voice at Scale

Record a 10-second sample of your brand voice talent once, then synthesize unlimited Thai marketing videos, IVR prompts, and product demos in that exact voice.

Voice Preservation

Help patients with degenerative speech conditions preserve their voice — record a sample now, retain the ability to "speak" later.

Localization for Thai Markets

Bring international content to Thai audiences with the same speaker the audience already recognizes. No casting calls, no studio days.

Accessibility & E-Learning

Generate audio versions of educational content using a teacher's own voice, complete with correct Thai number, date, and currency reading.

Pricing

Voice cloning costs 1 IC per 400 characters, the same rate as the default Kaitom voice.

Endpoint	Method	Cost
`/v3/store/audio/tts` (default Kaitom voice)	POST	1 IC per 400 chars
`/v3/store/audio/tts/clone` (Thai voice cloning)	POST	1 IC per 400 chars
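
For budgeting, the rate can be turned into a quick estimate. The sketch below rests on two assumptions that this post does not confirm: that a partial block of 400 characters is billed as a full IC, and that characters are counted as Unicode code points (so Thai vowel and tone marks each count as one). Treat the result as an approximation and check actual usage against your account.

```python
import math

CHARS_PER_IC = 400  # rate from the pricing table above


def estimate_ic(text: str) -> int:
    """Estimate the IC cost of synthesizing `text`.

    ASSUMPTIONS (unconfirmed): partial blocks round up, and length is
    measured in Unicode code points via len().
    """
    return math.ceil(len(text) / CHARS_PER_IC)
```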

Responsible Use

Voice cloning is powerful and easy to misuse. iApp Technology requires that you:

Have explicit consent from the speaker before cloning their voice.
Disclose synthetic audio in contexts where authenticity matters (news, public statements, customer service identity).
Do not impersonate real people for fraud, harassment, or misinformation.

Misuse violates our Terms of Service and Thai law (PDPA, Computer Crime Act). We log all clone requests and cooperate with law enforcement on reported abuse.

Limitations During Alpha

Thai language only — English and Chinese cloning are on the roadmap.
Clone requests are processed serially per server. Expect queued latency under heavy concurrent load.
Maximum reference length: 15 s (longer clips are trimmed).
Output is raw PCM; wrap it in a WAV header on the client before playback.
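
Because clone requests can queue under load, a client may want a generous timeout and a few retries with increasing delays. The helper below is a generic sketch, not an iApp-provided utility. How the API signals overload (a timeout, a specific HTTP status) is not stated in this post, so the helper simply retries on any exception raised by the callable you pass in. Narrow that to the errors you actually observe.

```python
import time


def call_with_backoff(send, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, doubling the wait after each failure.

    `send` is any zero-argument callable that raises on failure, for
    example a function that posts to the clone endpoint with a timeout
    and then calls raise_for_status() on the response.
    """
    for attempt in range(attempts):
        try:
            return send()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1 s, 2 s, 4 s, ...
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use, leave it at its default.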

Get Started Today

Try the interactive demo — record and clone in your browser
Read the full API docs — complete technical reference
Get an API key — start building immediately

Feedback

This is alpha — your feedback shapes what GA looks like.

Discord: discord.gg/kYcpmdEcS2
Email: sale@iapp.co.th
Phone: 086-322-5858

Thai Voice Cloning is part of Kaitom Voice V3 and is available now to all iApp API users.
