
🗣️ Thai Text-to-Speech V2 (Kaitom Voice)

1 IC per 400 characters
v2.0 Active | 🎙️ Speech | 🆕 March 2025

Welcome to the Thai Text-to-Speech API V2, featuring the improved Kaitom voice (น้องไข่ต้ม เวอร์ชั่น 2, "Nong Kaitom Version 2"). This version offers more natural speech and better support for mixed Thai-English text through a POST-based API.

Getting Started

  1. Prerequisites

    • An API key from iApp Technology
    • Text input in Thai and/or English
    • Maximum text length: No specific limit
    • Supported output format: WAV
  2. Quick Start

    • Fast processing (less than 1 second)
    • Improved natural speech generation
    • Enhanced Thai-English mixed text support
  3. Key Features

    • Improved natural speech synthesis (V2 engine)
    • Superior mixed language support (Thai-English)
    • Language mode selection (TH or TH_MIX_EN)
    • Emoji support
    • Number, date, and currency value conversion
    • Fast processing time
  4. Security & Compliance

    • GDPR and PDPA compliant
    • No data retention after processing
How to get an API key?

Visit the API Key Management page to view your existing API key or request a new one.

V1 Available

Looking for the legacy GET-based API with the Kaitom V1 or Cee voice? See Text-to-Speech V1.

API Endpoint

Endpoint | Method | Description | Cost
/v3/store/speech/text-to-speech/kaitom (Legacy: /thai-tts-kaitom2/tts) | POST | Thai TTS with Kaitom voice V2 | 1 IC per 400 characters

Quick Example

Sample Request

curl --location 'https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom' \
--header 'apikey: YOUR_API_KEY' \
--form 'text="สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ"' \
--form 'language="TH"' \
--output output.wav

Sample Response

The response is an audio file in WAV format. The original page offers two audio previews of the output, one for Thai-only text and one for Thai mixed with English.
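Because the response body is raw audio rather than JSON, it can help to confirm that the returned bytes really are a WAV file before saving them. The following is a minimal sketch and not part of the official API: the helper names `looks_like_wav` and `wav_duration_seconds` are hypothetical, the first checks only the standard RIFF/WAVE header, and the second assumes PCM-encoded audio that Python's built-in `wave` module can read.

```python
import io
import wave


def looks_like_wav(data: bytes) -> bool:
    """Return True if data starts with a RIFF/WAVE container header."""
    # A WAV file starts with b"RIFF", a 4-byte size field, then b"WAVE".
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def wav_duration_seconds(data: bytes) -> float:
    """Return the duration of in-memory WAV audio (assumes PCM encoding)."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

A caller could run `looks_like_wav(response.content)` before writing `output.wav`, and treat a `False` result as a sign that the body is an error message rather than audio.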

API Reference

Text-to-Speech V2 Endpoint

  • Endpoint: POST https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom
  • Required Parameters:
    • apikey: Your API key (header)
    • text: Text to convert to speech (form data)
  • Optional Parameters:
    • language: Language mode (form data)
      • TH: Thai only (pure Thai text)
      • TH_MIX_EN: Thai mixed with English (default)
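The parameters above can be sketched as a small request builder that validates its inputs before anything is sent. This is an illustrative sketch using only the Python standard library: `build_request` and `synthesize` are hypothetical names, the 30-second timeout is an arbitrary choice, and the URL-encoded form body is an assumption that mirrors how the `requests` example on this page passes `data=`.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom"
LANGUAGE_MODES = ("TH", "TH_MIX_EN")


def build_request(text: str, api_key: str,
                  language: str = "TH_MIX_EN") -> urllib.request.Request:
    """Validate the inputs and build (but do not send) the POST request."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in LANGUAGE_MODES:
        raise ValueError(f"language must be one of {LANGUAGE_MODES}")
    # Assumption: the endpoint accepts URL-encoded form fields.
    body = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={"apikey": api_key},
        method="POST",
    )


def synthesize(text: str, api_key: str, language: str = "TH_MIX_EN") -> bytes:
    """Send the request and return the response body (expected to be WAV)."""
    request = build_request(text, api_key, language)
    # The timeout value is an assumption; urlopen raises HTTPError on 4xx/5xx.
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Keeping validation in `build_request` means a typo in the language mode fails locally, before a request is sent or any credits are spent.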

Code Examples

Python

import requests

url = "https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom"
headers = {"apikey": "YOUR_API_KEY"}
data = {
    "text": "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ",
    "language": "TH",
}

response = requests.post(url, headers=headers, data=data)
response.raise_for_status()  # stop on an HTTP error instead of saving it as audio

# The response body is the WAV audio itself.
with open("output.wav", "wb") as f:
    f.write(response.content)

JavaScript (Node.js)

const axios = require("axios")
const fs = require("fs")
const FormData = require("form-data")

// Build the multipart form body.
let formData = new FormData()
formData.append("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
formData.append("language", "TH")

let config = {
  method: "post",
  url: "https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom",
  headers: {
    apikey: "YOUR_API_KEY",
    ...formData.getHeaders(),
  },
  data: formData,
  responseType: "arraybuffer", // receive the WAV audio as raw bytes
}

axios(config)
  .then((response) => {
    fs.writeFileSync("output.wav", response.data)
  })
  .catch((error) => console.log(error))

PHP

<?php
$curl = curl_init();

curl_setopt_array($curl, array(
    CURLOPT_URL => 'https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => '',
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 0,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => 'POST',
    CURLOPT_POSTFIELDS => array(
        'text' => 'สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ',
        'language' => 'TH'
    ),
    CURLOPT_HTTPHEADER => array(
        'apikey: YOUR_API_KEY'
    ),
));

$response = curl_exec($curl);
curl_close($curl);

// The response body is the WAV audio itself.
file_put_contents("output.wav", $response);
?>

Swift

import Foundation

// Form fields to send as multipart/form-data.
let parameters = [
    "language": "TH",
    "text": "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ"
]

// Build the multipart body by hand, one part per field.
let boundary = "Boundary-\(UUID().uuidString)"
var body = Data()
for (name, value) in parameters {
    body += Data("--\(boundary)\r\n".utf8)
    body += Data("Content-Disposition: form-data; name=\"\(name)\"\r\n\r\n".utf8)
    body += Data("\(value)\r\n".utf8)
}
body += Data("--\(boundary)--\r\n".utf8)

var request = URLRequest(url: URL(string: "https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")!, timeoutInterval: Double.infinity)
request.addValue("YOUR_API_KEY", forHTTPHeaderField: "apikey")
request.addValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

request.httpMethod = "POST"
request.httpBody = body

let task = URLSession.shared.dataTask(with: request) { data, response, error in
    guard let data = data else {
        print(String(describing: error))
        return
    }
    // The response body is binary WAV audio, so save it to a file
    // rather than decoding it as UTF-8 text.
    do {
        try data.write(to: URL(fileURLWithPath: "output.wav"))
    } catch {
        print("Could not save audio: \(error)")
    }
}

task.resume()

Kotlin

import java.io.File
import okhttp3.MultipartBody
import okhttp3.OkHttpClient
import okhttp3.Request

val client = OkHttpClient()
val body = MultipartBody.Builder().setType(MultipartBody.FORM)
    .addFormDataPart("language", "TH")
    .addFormDataPart("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
    .build()
val request = Request.Builder()
    .url("https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")
    .post(body)
    .addHeader("apikey", "YOUR_API_KEY")
    .build()
// Save the WAV bytes returned by the API, then close the response.
client.newCall(request).execute().use { response ->
    response.body?.bytes()?.let { File("output.wav").writeBytes(it) }
}

Java

import java.nio.file.Files;
import java.nio.file.Paths;
import okhttp3.*;

// Run inside a method that declares `throws IOException`.
OkHttpClient client = new OkHttpClient().newBuilder()
    .build();
RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
    .addFormDataPart("language", "TH")
    .addFormDataPart("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
    .build();
Request request = new Request.Builder()
    .url("https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")
    .method("POST", body)
    .addHeader("apikey", "YOUR_API_KEY")
    .build();
// Save the WAV bytes returned by the API, then close the response.
try (Response response = client.newCall(request).execute()) {
    Files.write(Paths.get("output.wav"), response.body().bytes());
}

Dart

import 'dart:io';

import 'package:dio/dio.dart';

Future<void> main() async {
  var headers = {'apikey': 'YOUR_API_KEY'};
  var data = FormData.fromMap({
    'language': 'TH',
    'text': 'สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ',
  });

  var dio = Dio();
  var response = await dio.request(
    'https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom',
    options: Options(
      method: 'POST',
      headers: headers,
      // Receive the WAV audio as raw bytes instead of decoded text.
      responseType: ResponseType.bytes,
    ),
    data: data,
  );

  if (response.statusCode == 200) {
    await File('output.wav').writeAsBytes(response.data);
  } else {
    print(response.statusMessage);
  }
}

Features & Capabilities

Core Features

  • Improved natural speech generation (V2 engine)
  • Superior mixed Thai-English text support
  • Language mode selection
  • Emoji conversion
  • Number and date formatting
  • Fast processing

Language Modes

  • TH (Thai Only): Pure Thai text processing for optimal Thai pronunciation
  • TH_MIX_EN (Default): Mixed Thai-English support for bilingual content
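One way to pick between the two modes automatically is to check whether the input contains any Latin letters. The sketch below is a simple heuristic, not documented API behavior: `choose_language_mode` is a hypothetical helper, and it assumes that the presence of an ASCII letter is a good enough signal for English content.

```python
def choose_language_mode(text: str) -> str:
    """Return "TH_MIX_EN" if the text contains ASCII letters, else "TH"."""
    # Digits, punctuation, and emoji do not count as English content here.
    has_latin = any(ch.isascii() and ch.isalpha() for ch in text)
    return "TH_MIX_EN" if has_latin else "TH"
```

For example, a sentence such as "วันนี้มี meeting ตอนบ่าย" would be routed to `TH_MIX_EN`, while "ราคา 100 บาท" would stay in `TH` because digits are ignored.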

Limitations and Best Practices

Limitations

  • Thai and English language support only
  • Single voice option: Kaitom V2 (male voice)
  • Only speaker_id 0 is supported
  • Output format: WAV only

Best Practices

  • Use proper punctuation for natural pauses
  • Choose appropriate language mode:
    • Use TH for pure Thai text
    • Use TH_MIX_EN for mixed Thai-English content
  • Keep sentences natural and conversational
  • Test with small text segments first
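To test with small segments first, longer text can be split into pieces before it is sent. The following sketch is one possible approach under stated assumptions: `split_into_segments` is a hypothetical helper, the 400-character default simply mirrors the billing unit, and splitting on whitespace relies on Thai text using spaces between phrases. Runs of whitespace are collapsed to single spaces in the output.

```python
def split_into_segments(text: str, max_chars: int = 400) -> list[str]:
    """Greedily pack whitespace-separated phrases into short segments."""
    segments: list[str] = []
    current = ""
    for phrase in text.split():
        # Hard-split a single phrase that exceeds the limit on its own.
        while len(phrase) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(phrase[:max_chars])
            phrase = phrase[max_chars:]
        candidate = f"{current} {phrase}" if current else phrase
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The phrase does not fit, so close the current segment.
            segments.append(current)
            current = phrase
    if current:
        segments.append(current)
    return segments
```

Each returned segment can then be sent as its own request, which makes it easier to spot which part of a long text produces unexpected pronunciation.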

Accuracy & Performance

Overall Accuracy

  • Enhanced natural speech quality (V2 engine)
  • Improved pronunciation for both Thai and English
  • Better handling of mixed language content
  • Proper handling of numbers and special characters

Processing Speed

  • Less than 1 second per request
  • Consistent performance regardless of text length

What's New in V2

Improvements over V1

  • ✨ Enhanced speech naturalness
  • 🗣️ Better Thai-English mixed language support
  • 🎵 Improved pronunciation accuracy
  • ⚡ Same fast processing speed
  • 📊 Language mode selection for optimized output

Migration from V1

If you're using V1, here are the key changes:

Aspect | V1 (GET) | V2 (POST)
Method | GET | POST
Text Parameter | Query parameter | Form data
Language Mode | Not available | TH or TH_MIX_EN
Output Format | MP3 | WAV
Endpoint | /kaitom/v1 | /kaitom

Pricing

AI API Service Name | Endpoint | IC per Characters | On-Premise
Thai Text To Speech V2 (TTS) | iapp_text_to_speech_v2_kaitom | 1 IC / 400 characters | Contact
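The pricing rule can be turned into a rough cost estimate. This sketch rests on two assumptions that the table does not state: that a partial block of 400 characters is billed as a full IC, and that characters are counted as Unicode code points (what Python's `len` returns). `estimate_ic` is a hypothetical helper name.

```python
import math

CHARS_PER_IC = 400  # from the pricing table: 1 IC per 400 characters


def estimate_ic(text: str) -> int:
    """Estimate the IC cost of one request, rounding partial blocks up."""
    if not text:
        return 0
    # Assumption: any started block of 400 characters costs a full IC.
    return math.ceil(len(text) / CHARS_PER_IC)
```

Under these assumptions, a 1,000-character text would be estimated at 3 IC. Actual usage on the account should be treated as the authoritative figure.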
