
🗣️ Thai Text-to-Speech V2 (Kaitom Voice)

🗣️ Thai Text-to-Speech System - Version 2 (Nong Kaitom Voice)

Version Status New Production

Welcome to Thai Text-to-Speech API V2, featuring the improved Kaitom voice (น้องไข่ต้ม เวอร์ชั่น 2). This version offers enhanced naturalness and better Thai-English mixed language support through a POST-based API.

iApp Text to Speech API V2


Getting Started

  1. Prerequisites

    • An API key from iApp Technology
    • Text input in Thai and/or English
    • Maximum text length: No specific limit
    • Supported output format: WAV
  2. Quick Start

    • Fast processing (less than 1 second)
    • Improved natural speech generation
    • Enhanced Thai-English mixed text support
  3. Key Features

    • Improved natural speech synthesis (V2 engine)
    • Superior mixed language support (Thai-English)
    • Language mode selection (TH or TH_MIX_EN)
    • Emoji support
    • Number, date, and currency value conversion
    • Fast processing time
  4. Security & Compliance

    • GDPR and PDPA compliant
    • No data retention after processing
How to get API Key?

Please visit API Key Management page to view your existing API key or request a new one.

V1 Available

Looking for the legacy GET-based API with Kaitom V1 or Cee voice? See Text-to-Speech V1

API Endpoint

Endpoint | Method | Description | Cost
/v3/store/speech/text-to-speech/kaitom (Legacy: /thai-tts-kaitom2/tts) | POST | Thai TTS with Kaitom voice V2 | 1 IC per 400 characters

Quick Example

Sample Request

curl --location 'https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom' \
--header 'apikey: YOUR_API_KEY' \
--form 'text="สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ"' \
--form 'language="TH"'

Sample Response

The response body is the generated audio file in WAV format.
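Because the response body is raw audio, it can help to confirm that what came back really is a WAV file before saving it. The following is a minimal sketch using only the Python standard library; `describe_wav` is a hypothetical helper name, and it assumes the service returns PCM-encoded WAV, which is the encoding the `wave` module reads.

```python
import io
import wave


def describe_wav(audio_bytes):
    """Return basic properties of WAV data, or raise ValueError if it is not WAV."""
    # A WAV file starts with "RIFF", four size bytes, then "WAVE".
    if audio_bytes[:4] != b"RIFF" or audio_bytes[8:12] != b"WAVE":
        raise ValueError("response is not a RIFF/WAVE file")
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }
```

A body that fails this check is likely an error message rather than audio, so it is worth printing it instead of writing it to `output.wav`.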

API Reference

Text-to-Speech V2 Endpoint

  • Endpoint: POST https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom
  • Required Parameters:
    • apikey: Your API key (header)
    • text: Text to convert to speech (form data)
  • Optional Parameters:
    • language: Language mode (form data)
      • TH: Thai only (pure Thai text)
      • TH_MIX_EN: Thai mixed with English (default)
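The parameters above can be wrapped in a small helper that validates the language mode before anything is sent. This is a sketch only: `build_tts_request` and `synthesize` are hypothetical names, and it assumes the endpoint accepts URL-encoded form fields, as the Python example in this document implies. It uses the standard library so that it has no extra dependencies.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom"
LANGUAGE_MODES = ("TH", "TH_MIX_EN")


def build_tts_request(text, api_key, language="TH_MIX_EN"):
    """Build (but do not send) a POST request for the V2 endpoint."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in LANGUAGE_MODES:
        raise ValueError(f"language must be one of {LANGUAGE_MODES}")
    # Form fields are sent URL-encoded in the request body (an assumption).
    body = urlencode({"text": text, "language": language}).encode("utf-8")
    return Request(API_URL, data=body, headers={"apikey": api_key}, method="POST")


def synthesize(text, api_key, language="TH_MIX_EN", timeout=30):
    """Send the request and return the raw audio bytes."""
    request = build_tts_request(text, api_key, language)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urlopen(request, timeout=timeout) as response:
        return response.read()
```

Separating request construction from sending keeps the validation testable without network access.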

Code Examples

Python

import requests

url = "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom"
headers = {"apikey": "YOUR_API_KEY"}
data = {
    "text": "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ",
    "language": "TH",
}

response = requests.post(url, headers=headers, data=data)
# Stop on an HTTP error instead of saving an error body as audio
response.raise_for_status()

# The response body is the WAV audio itself
with open("output.wav", "wb") as f:
    f.write(response.content)

JavaScript (Node.js)

const axios = require("axios")
const fs = require("fs")
const FormData = require("form-data")

let formData = new FormData()
formData.append("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
formData.append("language", "TH")

let config = {
  method: "post",
  url: "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom",
  headers: {
    apikey: "YOUR_API_KEY",
    ...formData.getHeaders(),
  },
  data: formData,
  // Receive the WAV audio as binary data rather than a string
  responseType: "arraybuffer",
}

axios(config)
  .then((response) => {
    fs.writeFileSync("output.wav", response.data)
  })
  .catch((error) => console.log(error))

PHP

<?php
$curl = curl_init();

curl_setopt_array($curl, array(
    CURLOPT_URL => 'https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => '',
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 0,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => 'POST',
    CURLOPT_POSTFIELDS => array(
        'text' => 'สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ',
        'language' => 'TH'
    ),
    CURLOPT_HTTPHEADER => array(
        'apikey: YOUR_API_KEY'
    ),
));

$response = curl_exec($curl);
curl_close($curl);

// The response body is the WAV audio itself
file_put_contents("output.wav", $response);
?>

Swift

import Foundation

// Form fields sent as multipart/form-data
let fields = [
    ("language", "TH"),
    ("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
]

let boundary = "Boundary-\(UUID().uuidString)"
var body = Data()
for (name, value) in fields {
    body += Data("--\(boundary)\r\n".utf8)
    body += Data("Content-Disposition: form-data; name=\"\(name)\"\r\n\r\n".utf8)
    body += Data("\(value)\r\n".utf8)
}
body += Data("--\(boundary)--\r\n".utf8)

var request = URLRequest(url: URL(string: "https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")!, timeoutInterval: Double.infinity)
request.addValue("YOUR_API_KEY", forHTTPHeaderField: "apikey")
request.addValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

request.httpMethod = "POST"
request.httpBody = body

let task = URLSession.shared.dataTask(with: request) { data, response, error in
    guard let data = data else {
        print(String(describing: error))
        return
    }
    // The response is binary WAV audio, so save it rather than decoding it as UTF-8
    do {
        try data.write(to: URL(fileURLWithPath: "output.wav"))
    } catch let writeError {
        print("Could not save audio: \(writeError)")
    }
}

task.resume()

Kotlin

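import java.io.File
import okhttp3.MultipartBody
import okhttp3.OkHttpClient
import okhttp3.Request

val client = OkHttpClient()
val body = MultipartBody.Builder().setType(MultipartBody.FORM)
    .addFormDataPart("language", "TH")
    .addFormDataPart("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
    .build()
val request = Request.Builder()
    .url("https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")
    .post(body)
    .addHeader("apikey", "YOUR_API_KEY")
    .build()
// use {} closes the response once the WAV bytes have been saved
client.newCall(request).execute().use { response ->
    if (!response.isSuccessful) error("Request failed: HTTP ${response.code}")
    response.body?.bytes()?.let { File("output.wav").writeBytes(it) }
}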

Java

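import java.nio.file.Files;
import java.nio.file.Paths;
import okhttp3.MultipartBody;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

OkHttpClient client = new OkHttpClient().newBuilder()
    .build();
RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
    .addFormDataPart("language", "TH")
    .addFormDataPart("text", "สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ")
    .build();
Request request = new Request.Builder()
    .url("https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom")
    .method("POST", body)
    .addHeader("apikey", "YOUR_API_KEY")
    .build();
// try-with-resources closes the response once the WAV bytes have been saved
try (Response response = client.newCall(request).execute()) {
    Files.write(Paths.get("output.wav"), response.body().bytes());
}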

Dart

import 'dart:io';

import 'package:dio/dio.dart';

Future<void> main() async {
  var headers = {
    'apikey': 'YOUR_API_KEY'
  };
  var data = FormData.fromMap({
    'language': 'TH',
    'text': 'สวัสดีครับ น้องไข่ต้ม มาแล้วฮะ'
  });

  var dio = Dio();
  var response = await dio.request<List<int>>(
    'https://uat-api.iapp.co.th/v3/store/speech/text-to-speech/kaitom',
    options: Options(
      method: 'POST',
      headers: headers,
      // Receive the WAV audio as raw bytes instead of decoded text
      responseType: ResponseType.bytes,
    ),
    data: data,
  );

  if (response.statusCode == 200) {
    await File('output.wav').writeAsBytes(response.data!);
  } else {
    print(response.statusMessage);
  }
}

Features & Capabilities

Core Features

  • Improved natural speech generation (V2 engine)
  • Superior mixed Thai-English text support
  • Language mode selection
  • Emoji conversion
  • Number and date formatting
  • Fast processing

Language Modes

  • TH (Thai Only): Pure Thai text processing for optimal Thai pronunciation
  • TH_MIX_EN (Default): Mixed Thai-English support for bilingual content
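When the input text is not known in advance, the language mode can be picked programmatically. The sketch below is a simple client-side heuristic and is not how the service itself decides anything: `choose_language_mode` is a hypothetical helper that selects TH_MIX_EN whenever the text contains a Latin letter and TH otherwise.

```python
import re

# Any ASCII Latin letter is treated as a sign of English content.
_LATIN_LETTER = re.compile(r"[A-Za-z]")


def choose_language_mode(text):
    """Return "TH_MIX_EN" if the text contains Latin letters, else "TH"."""
    return "TH_MIX_EN" if _LATIN_LETTER.search(text) else "TH"
```

Digits, punctuation, and emoji do not trigger the mixed mode here, so a pure Thai sentence with a number in it still maps to TH.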

Limitations and Best Practices

Limitations

  • Thai and English language support only
  • Single voice option: Kaitom V2 (male voice)
  • Only speaker_id 0 is supported
  • Output format: WAV only

Best Practices

  • Use proper punctuation for natural pauses
  • Choose appropriate language mode:
    • Use TH for pure Thai text
    • Use TH_MIX_EN for mixed Thai-English content
  • Keep sentences natural and conversational
  • Test with small text segments first
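One way to test with small segments is to split longer text on whitespace before sending it. The helper below is a sketch under stated assumptions: `split_text` is a hypothetical name, and the default of 400 characters is chosen only because it matches the billing unit listed in this document, not because the API enforces it. A piece with no whitespace that exceeds the limit is cut at the limit, which may fall mid-word.

```python
def split_text(text, max_chars=400):
    """Split text on whitespace into chunks of at most max_chars characters."""
    chunks = []
    current = ""
    for piece in text.split():
        # A single piece longer than the limit is cut into fixed-size parts.
        while len(piece) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:max_chars])
            piece = piece[max_chars:]
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, starting with the first one to check the voice and language mode before processing the rest.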

Accuracy & Performance

Overall Accuracy

  • Enhanced natural speech quality (V2 engine)
  • Improved pronunciation for both Thai and English
  • Better handling of mixed language content
  • Proper handling of numbers and special characters

Processing Speed

  • Less than 1 second per request
  • Consistent performance regardless of text length

What's New in V2

Improvements over V1

  • ✨ Enhanced speech naturalness
  • 🗣️ Better Thai-English mixed language support
  • 🎵 Improved pronunciation accuracy
  • ⚡ Same fast processing speed
  • 📊 Language mode selection for optimized output

Migration from V1

If you're using V1, here are the key changes:

Aspect | V1 (GET) | V2 (POST)
Method | GET | POST
Text Parameter | Query parameter | Form data
Language Mode | Not available | TH or TH_MIX_EN
Output Format | MP3 | WAV
Endpoint | /kaitom/v1 | /kaitom

Pricing

AI API Service Name | Endpoint | IC per Characters | On-Premise
Thai Text To Speech V2 (TTS) | iapp_text_to_speech_v2_kaitom | 1 IC/400 Characters | Contact
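The rate of 1 IC per 400 characters can be turned into a rough cost estimate. This is a sketch with two loudly stated assumptions that the pricing table does not confirm: that usage is rounded up to the next full 400-character block per request, and that characters are counted the way Python's `len` counts them (one per Unicode code point, so Thai vowel and tone marks count individually). `estimate_ic` is a hypothetical helper name.

```python
import math

CHARS_PER_IC = 400  # from the pricing table: 1 IC per 400 characters


def estimate_ic(text):
    """Estimate the IC cost of one request, assuming partial blocks round up."""
    return math.ceil(len(text) / CHARS_PER_IC)
```

Under these assumptions, a 401-character request is estimated at 2 IC, so actual billing should be checked against your account usage.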
