🗣️ Thai Text-to-Speech V2 (Kaitom Voice)
🗣️ Thai Text-to-Speech System, Version 2 (Nong Kaitom Voice)
Welcome to Thai Text-to-Speech API V2, featuring the improved Kaitom voice (น้องไข่ต้ม เวอร์ชั่น 2). This version offers enhanced naturalness and better Thai-English mixed language support through a POST-based API.
Getting Started
Prerequisites
- An API key from iApp Technology
- Text input in Thai and/or English
- Maximum text length: No specific limit
- Supported output format: WAV
Quick Start
- Fast processing (less than 1 second)
- Improved natural speech generation
- Enhanced Thai-English mixed text support
Key Features
- Improved natural speech synthesis (V2 engine)
- Superior mixed language support (Thai-English)
- Language mode selection (TH or TH_MIX_EN)
- Emoji support
- Number, date, and currency value conversion
- Fast processing time
Security & Compliance
- GDPR and PDPA compliant
- No data retention after processing
How to get API Key?
Visit the API Key Management page to view your existing API key or request a new one.
V1 Available
Looking for the legacy GET-based API with Kaitom V1 or Cee voice? See Text-to-Speech V1
API Endpoint
| Endpoint | Method | Description | Cost |
|---|---|---|---|
| /v3/store/speech/text-to-speech/kaitom (Legacy: /thai-tts-kaitom2/tts) | POST | Thai TTS with Kaitom voice V2 | 1 IC per 400 characters |
Quick Example
Sample Request
curl --location 'https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom' \
--header 'apikey: YOUR_API_KEY' \
--form 'text="สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ"' \
--form 'language="TH"' \
--output output.wav
Sample Response
The response body is the audio file in WAV format.
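Because the response is a raw WAV file rather than JSON, it can help to sanity-check the bytes before saving or playing them. The following is a minimal sketch using only the Python standard library. The helper name `wav_duration_seconds` is hypothetical, and it assumes the API returns uncompressed PCM WAV, which is the only encoding the `wave` module is guaranteed to read.

```python
import io
import wave


def wav_duration_seconds(audio: bytes) -> float:
    """Return the duration of WAV data, raising ValueError if it is not WAV."""
    # A WAV file starts with "RIFF", then a 4-byte size, then "WAVE".
    if audio[:4] != b"RIFF" or audio[8:12] != b"WAVE":
        raise ValueError("response body is not a RIFF/WAVE file")
    with wave.open(io.BytesIO(audio), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Demo with a locally generated one-second silent clip (not real API output).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(2)        # 16-bit samples
    w.setframerate(16000)    # 16 kHz, chosen only for the demo
    w.writeframes(b"\x00\x00" * 16000)

print(wav_duration_seconds(buf.getvalue()))  # 1.0
```

A failed check usually means the body is an error message instead of audio, so logging the first bytes of the response is a reasonable next step.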
API Reference
Text-to-Speech V2 Endpoint
- Endpoint: POST https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom
- Required Parameters:
  - apikey: Your API key (header)
  - text: Text to convert to speech (form data)
- Optional Parameters:
  - language: Language mode (form data)
    - TH: Thai only (pure Thai text)
    - TH_MIX_EN: Thai mixed with English (default)
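The parameters above can be assembled and validated before anything is sent. This is a minimal sketch using only the Python standard library. The function name `build_tts_request` is hypothetical, and the sketch assumes the endpoint accepts a URL-encoded form body (as the `requests` example in this page implies) in addition to the multipart form used by the curl sample.

```python
import urllib.parse
import urllib.request

TTS_URL = "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom"
LANGUAGE_MODES = {"TH", "TH_MIX_EN"}


def build_tts_request(
    text: str, api_key: str, language: str = "TH_MIX_EN"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the V2 endpoint."""
    if not text:
        raise ValueError("text is required")
    if language not in LANGUAGE_MODES:
        raise ValueError(f"language must be one of {sorted(LANGUAGE_MODES)}")
    # Assumption: a URL-encoded form body is accepted by the endpoint.
    form = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(
        TTS_URL,
        data=form.encode("utf-8"),
        headers={"apikey": api_key},
        method="POST",
    )


req = build_tts_request("สวัสดีครับ", "YOUR_API_KEY")
print(req.get_method(), req.full_url)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(req) as resp:
#     audio = resp.read()
```

Keeping construction separate from sending makes the parameter rules easy to unit-test without spending IC on real calls.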
Code Examples
Python
import requests

url = "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom"
headers = {"apikey": "YOUR_API_KEY"}
data = {
    "text": "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ",
    "language": "TH",
}

response = requests.post(url, headers=headers, data=data)
response.raise_for_status()  # fail on an HTTP error instead of saving an error body

# The response body is the WAV audio itself
with open("output.wav", "wb") as f:
    f.write(response.content)
JavaScript (Node.js)
const axios = require("axios")
const fs = require("fs")
const FormData = require("form-data")

let formData = new FormData()
formData.append("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
formData.append("language", "TH")

let config = {
  method: "post",
  url: "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom",
  headers: {
    apikey: "YOUR_API_KEY",
    ...formData.getHeaders(),
  },
  data: formData,
  responseType: "arraybuffer",
}

axios(config)
  .then((response) => {
    fs.writeFileSync("output.wav", response.data)
  })
  .catch((error) => console.log(error))
PHP
<?php
$curl = curl_init();

curl_setopt_array($curl, array(
    CURLOPT_URL => 'https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => '',
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 0,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => 'POST',
    CURLOPT_POSTFIELDS => array(
        'text' => 'สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ',
        'language' => 'TH'
    ),
    CURLOPT_HTTPHEADER => array(
        'apikey: YOUR_API_KEY'
    ),
));

$response = curl_exec($curl);
curl_close($curl);

file_put_contents("output.wav", $response);
?>
Swift
import Foundation

let parameters = [
    [
        "key": "language",
        "value": "TH",
        "type": "text"
    ],
    [
        "key": "text",
        "value": "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ",
        "type": "text"
    ]] as [[String: Any]]

let boundary = "Boundary-\(UUID().uuidString)"
var body = Data()
var error: Error? = nil
for param in parameters {
    if param["disabled"] != nil { continue }
    let paramName = param["key"]!
    body += Data("--\(boundary)\r\n".utf8)
    body += Data("Content-Disposition:form-data; name=\"\(paramName)\"".utf8)
    if param["contentType"] != nil {
        body += Data("\r\nContent-Type: \(param["contentType"] as! String)".utf8)
    }
    let paramType = param["type"] as! String
    if paramType == "text" {
        let paramValue = param["value"] as! String
        body += Data("\r\n\r\n\(paramValue)\r\n".utf8)
    } else {
        let paramSrc = param["src"] as! String
        let fileURL = URL(fileURLWithPath: paramSrc)
        if let fileContent = try? Data(contentsOf: fileURL) {
            body += Data("; filename=\"\(paramSrc)\"\r\n".utf8)
            body += Data("Content-Type: \"content-type header\"\r\n".utf8)
            body += Data("\r\n".utf8)
            body += fileContent
            body += Data("\r\n".utf8)
        }
    }
}
body += Data("--\(boundary)--\r\n".utf8)
let postData = body

var request = URLRequest(url: URL(string: "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")!, timeoutInterval: Double.infinity)
request.addValue("YOUR_API_KEY", forHTTPHeaderField: "apikey")
request.addValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")
request.httpMethod = "POST"
request.httpBody = postData
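let task = URLSession.shared.dataTask(with: request) { data, response, error in
    guard let data = data else {
        print(String(describing: error))
        return
    }
    // The body is binary WAV audio, so save it instead of decoding it as UTF-8
    do {
        try data.write(to: URL(fileURLWithPath: "output.wav"))
    } catch {
        print(error)
    }
}
task.resume()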
Kotlin
import okhttp3.MultipartBody
import okhttp3.OkHttpClient
import okhttp3.Request
import java.io.File

val client = OkHttpClient()
val body = MultipartBody.Builder().setType(MultipartBody.FORM)
    .addFormDataPart("language", "TH")
    .addFormDataPart("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
    .build()
val request = Request.Builder()
    .url("https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")
    .post(body)
    .addHeader("apikey", "YOUR_API_KEY")
    .build()
// Save the returned WAV audio to disk and close the response afterwards
client.newCall(request).execute().use { response ->
    response.body?.bytes()?.let { File("output.wav").writeBytes(it) }
}
Java
import okhttp3.*;
import java.nio.file.Files;
import java.nio.file.Paths;

OkHttpClient client = new OkHttpClient().newBuilder()
    .build();
RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
    .addFormDataPart("language", "TH")
    .addFormDataPart("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
    .build();
Request request = new Request.Builder()
    .url("https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")
    .method("POST", body)
    .addHeader("apikey", "YOUR_API_KEY")
    .build();
// Save the returned WAV audio to disk; try-with-resources closes the response
try (Response response = client.newCall(request).execute()) {
    Files.write(Paths.get("output.wav"), response.body().bytes());
}
Dart
import 'dart:io';
import 'package:dio/dio.dart';

var headers = {
  'apikey': 'YOUR_API_KEY'
};
var data = FormData.fromMap({
  'language': 'TH',
  'text': 'สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ'
});
var dio = Dio();
var response = await dio.request<List<int>>(
  'https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom',
  options: Options(
    method: 'POST',
    headers: headers,
    responseType: ResponseType.bytes, // the body is binary WAV audio, not JSON
  ),
  data: data,
);
if (response.statusCode == 200) {
  await File('output.wav').writeAsBytes(response.data!);
}
else {
  print(response.statusMessage);
}
Features & Capabilities
Core Features
- Improved natural speech generation (V2 engine)
- Superior mixed Thai-English text support
- Language mode selection
- Emoji conversion
- Number and date formatting
- Fast processing
Language Modes
- TH (Thai Only): Pure Thai text processing for optimal Thai pronunciation
- TH_MIX_EN (Default): Mixed Thai-English support for bilingual content
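When the input comes from users, the mode can be chosen per request instead of being hard-coded. The sketch below is one simple heuristic, not part of the API: the helper name `pick_language_mode` is hypothetical, and it assumes that the presence of any Latin letter is a good enough signal for mixed content.

```python
import re

# Any ASCII Latin letter is treated as a sign of English content.
_LATIN_LETTER = re.compile(r"[A-Za-z]")


def pick_language_mode(text: str) -> str:
    """Return "TH_MIX_EN" if the text contains Latin letters, else "TH"."""
    return "TH_MIX_EN" if _LATIN_LETTER.search(text) else "TH"


print(pick_language_mode("สวัสดีครับ"))                # TH
print(pick_language_mode("วันนี้มี meeting ตอนบ่าย"))   # TH_MIX_EN
```

Digits and emoji do not trigger the mixed mode here, so a pure Thai sentence containing a number still maps to `TH`.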
Limitations and Best Practices
Limitations
- Thai and English language support only
- Single voice option: Kaitom V2 (male voice)
- Only speaker_id 0 is supported
- Output format: WAV only
Best Practices
- Use proper punctuation for natural pauses
- Choose appropriate language mode:
  - Use TH for pure Thai text
  - Use TH_MIX_EN for mixed Thai-English content
- Keep sentences natural and conversational
- Test with small text segments first
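One way to follow the advice about testing with small segments is to split long input at whitespace before sending it. The sketch below is illustrative only: `split_into_segments` is a hypothetical helper, and the 400-character default is an assumption borrowed from the billing unit, not a limit imposed by the API.

```python
def split_into_segments(text: str, max_chars: int = 400) -> list[str]:
    """Greedily pack whitespace-separated phrases into segments of at most max_chars."""
    segments: list[str] = []
    current = ""
    for phrase in text.split():
        if len(phrase) > max_chars:
            # Thai has no spaces inside a phrase, so a very long one cannot
            # be split safely here without a word segmenter.
            raise ValueError("a single phrase exceeds max_chars")
        candidate = f"{current} {phrase}" if current else phrase
        if len(candidate) <= max_chars:
            current = candidate
        else:
            segments.append(current)
            current = phrase
    if current:
        segments.append(current)
    return segments


for segment in split_into_segments("สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ", max_chars=12):
    print(segment)
```

Runs of whitespace are collapsed to a single space when phrases are rejoined, which should not affect the spoken result.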
Accuracy & Performance
Overall Accuracy
- Enhanced natural speech quality (V2 engine)
- Improved pronunciation for both Thai and English
- Better handling of mixed language content
- Proper handling of numbers and special characters
Processing Speed
- Less than 1 second per request
- Consistent performance regardless of text length
What's New in V2
Improvements over V1
- ✨ Enhanced speech naturalness
- 🗣️ Better Thai-English mixed language support
- 🎵 Improved pronunciation accuracy
- ⚡ Same fast processing speed
- 📊 Language mode selection for optimized output
Migration from V1
If you're using V1, here are the key changes:
| Aspect | V1 (GET) | V2 (POST) |
|---|---|---|
| Method | GET | POST |
| Text Parameter | Query parameter | Form data |
| Language Mode | Not available | TH or TH_MIX_EN |
| Output Format | MP3 | WAV |
| Endpoint | /kaitom/v1 | /kaitom |
Pricing
| AI API Service Name | Endpoint | IC Per Characters | On-Premise |
|---|---|---|---|
| Thai Text To Speech V2 (TTS) | iapp_text_to_speech_v2_kaitom | 1 IC/400 Characters | Contact |
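The pricing rule can be turned into a quick estimate before a request is sent. This is a rough sketch under two assumptions that the table does not confirm: characters are counted as Unicode code points, and a partial block of 400 characters is billed as a full IC. The function name `estimate_ic_cost` is hypothetical.

```python
import math

CHARS_PER_IC = 400  # from the pricing table: 1 IC per 400 characters


def estimate_ic_cost(text: str) -> int:
    """Estimate the IC cost of synthesizing `text`.

    Assumption: partial blocks are rounded up to a whole IC.
    """
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_IC)


print(estimate_ic_cost("สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ"))  # 1
print(estimate_ic_cost("ก" * 401))                          # 2
```

Actual billing may differ, so the account's usage report remains the authoritative figure.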
See Also
- Text-to-Speech V1 - Legacy GET-based API with Kaitom V1 and Cee voice
- Speech-to-Text - Convert speech to text