🗣️ (Alpha) iApp Text-to-Speech (TTS) 🆕

🗣️ AI-powered text-to-speech synthesis API

Welcome to the iApp TTSv3 API, a text-to-speech synthesis service that converts Thai text into natural-sounding speech. The API uses an AI model with GPU acceleration to generate WAV audio from text input.

Try Demo

Demo key is limited to 10 requests per day per IP

Getting Started

Prerequisites

  • Text input in Thai language
  • Maximum tokens: 1400
  • Output format: WAV

Quick Start

  • Fast processing with GPU acceleration
  • Natural speech generation
  • High-quality speech output

Key Features

  • Natural speech synthesis using state-of-the-art AI
  • Advanced voice quality tuning via parameters
  • High-speed response times
  • Simple REST API interface

API Usage

Endpoints

  • POST /v3/audio/tts - Generate speech from text; the response body is the audio, which you can save as a file
  • GET /v3/audio/health - Health check

API Request Examples

Using cURL:

# Health check
curl https://api.iapp.co.th/v3/audio/health

# Generate speech and save to file
curl -X POST https://api.iapp.co.th/v3/audio/tts \
-H "Content-Type: application/json" \
-d '{"text":"Hello, this is a test.","temperature":0.2,"top_p":0.95}' \
--output test.wav

Using Python:

import requests

# Text-to-speech request
response = requests.post(
    "https://api.iapp.co.th/v3/audio/tts",
    json={
        "text": "สวัสดีครับ",
        "temperature": 0.2,
        "top_p": 0.95,
        "max_new_tokens": 1400,
    },
    timeout=60,
)

# Raise on an HTTP error instead of saving an error body as audio
response.raise_for_status()

# Save the audio response to a file
with open("output.wav", "wb") as f:
    f.write(response.content)

Request Parameters

Parameter       | Type    | Description                                   | Default
text            | string  | Text to convert to speech                     | Required
temperature     | float   | Generation temperature (higher = more random) | 0.2
top_p           | float   | Top-p sampling parameter                      | 0.95
max_new_tokens  | integer | Maximum number of tokens to generate          | 1400
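
The defaults in the table above can be applied on the client side before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_tts_payload and the DEFAULTS dictionary are hypothetical, and the validation rules (non-empty text, no unknown keys) are assumptions rather than documented server behavior.

```python
# Defaults copied from the Request Parameters table.
DEFAULTS = {"temperature": 0.2, "top_p": 0.95, "max_new_tokens": 1400}


def build_tts_payload(text, **overrides):
    """Return a JSON-ready request body (hypothetical helper).

    `text` is required; any other parameter falls back to its
    documented default unless overridden.
    """
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"text": text, **DEFAULTS, **overrides}
```

The returned dictionary can be passed as the json argument of requests.post, for example build_tts_payload("สวัสดีครับ", temperature=0.5).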

Best Practices

  • Use proper punctuation for better speech synthesis
  • Keep sentences natural and conversational
  • For long text, consider breaking it into smaller segments
  • Adjust temperature and top_p parameters to control voice style:
    • Lower temperature (0.1-0.5): More consistent, stable voice
    • Higher temperature (0.6-1.0): More expressive but less predictable
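
The advice above on breaking long text into smaller segments could be sketched as follows. This is an illustration under stated assumptions, not an official client: split_text and concat_wav are hypothetical helpers, the 200-character limit is an arbitrary placeholder (the API's limit is expressed in tokens, not characters), and concat_wav assumes every response is uncompressed PCM WAV with the same channel count, sample width, and sample rate.

```python
import io
import wave


def split_text(text, max_chars=200):
    """Pack whitespace-separated pieces into segments of at most max_chars.

    Thai text normally puts spaces between phrases and sentences, so
    splitting on whitespace is used as a rough segment boundary.
    A single piece longer than max_chars is kept whole.
    """
    segments, current = [], ""
    for piece in text.split():
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                segments.append(current)
            current = piece
    if current:
        segments.append(current)
    return segments


def concat_wav(chunks):
    """Join PCM WAV byte strings of identical format into one WAV file."""
    out = io.BytesIO()
    writer = None
    first_format = None
    for data in chunks:
        with wave.open(io.BytesIO(data), "rb") as reader:
            # (channels, sample width, frame rate) must match across chunks
            fmt = tuple(reader.getparams()[:3])
            if writer is None:
                first_format = fmt
                writer = wave.open(out, "wb")
                writer.setparams(reader.getparams())
            elif fmt != first_format:
                raise ValueError("WAV chunks have different audio formats")
            writer.writeframes(reader.readframes(reader.getnframes()))
    if writer is None:
        raise ValueError("no audio chunks given")
    writer.close()  # finalizes the header; the BytesIO buffer stays open
    return out.getvalue()
```

In use, each segment returned by split_text would be sent to POST /v3/audio/tts in its own request, and the response bodies passed in order to concat_wav to produce one output file.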