DeepSeek V4 NEW
Powered by DeepSeek AI — Flash & Pro tiers on a single endpoint
deepseek-v4-flash — Input: 0.01 IC / 1K tokens · Output: 0.02 IC / 1K tokens (~10 / 20 THB/1M)
deepseek-v4-pro — Input: 0.20 IC / 1K tokens · Output: 0.40 IC / 1K tokens (~200 / 400 THB/1M)
Welcome to iApp DeepSeek V4 API. A single OpenAI-compatible endpoint serves two model tiers: Flash for fast, everyday tasks at the same price as V3.2, and Pro for the hardest reasoning, coding, and agentic workloads. The tier is chosen per-request via the model field, so you can mix and match in production.
Choosing a Tier
| Tier | Model name | Best for | Price (input / output per 1K tokens) |
|---|---|---|---|
| Flash | deepseek-v4-flash | Chat, RAG, classification, drafting, everyday Thai/English Q&A | 0.01 / 0.02 IC |
| Pro | deepseek-v4-pro | Hard reasoning, multi-step agents, IMO/IOI-level math & coding | 0.20 / 0.40 IC |
Pro costs ~20× Flash. Use Flash by default and only escalate to Pro for the requests where you actually need the deeper thinking.
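Because both tiers sit behind the same endpoint, escalating is just a different string in the model field. A minimal routing sketch in Python; the needs_deep_reasoning heuristic is hypothetical and only illustrates the default-to-Flash pattern:
import requests

URL = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"
HEADERS = {"apikey": "YOUR_API_KEY", "Content-Type": "application/json"}

def needs_deep_reasoning(prompt: str) -> bool:
    # Hypothetical heuristic: escalate only requests that look like hard math or code work.
    keywords = ("prove", "refactor", "step by step")
    return any(k in prompt.lower() for k in keywords)

def chat(prompt: str) -> str:
    model = "deepseek-v4-pro" if needs_deep_reasoning(prompt) else "deepseek-v4-flash"
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    response = requests.post(URL, headers=HEADERS, json=payload)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]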
Try Demo
Login or create a free account to use this AI service demo and explore our powerful APIs.
Get 100 Free Credits (IC) when you sign up!
Offer ends December 31, 2025
Overview
DeepSeek V4 is the next generation of DeepSeek's open-source LLM family, building on the sparse-attention and reinforcement-learning advances introduced in V3.2 and adding a stronger reasoning tier. The two model tiers live behind one endpoint:
- Flash is tuned for latency and cost, suitable as a drop-in upgrade for V3.2 chat.
- Pro is tuned for accuracy on hard problems — multi-step math, agentic tool-use, large-scale code refactors.
Key Features
- Two tiers, one endpoint: Pick the tier per request via the model field. No code change to switch.
- OpenAI compatible: Same request/response shape — drop-in replacement for OpenAI clients (see the sketch after this list).
- Streaming support: Real-time token streaming for responsive UX.
- Long context: Extended context window, suitable for long documents.
- Thai-first: First-class Thai support alongside English and Chinese.
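Because the request/response shape matches OpenAI's Chat Completions API, existing OpenAI client libraries can usually be pointed at this endpoint. A sketch with the official openai Python SDK; the base_url and the extra apikey header are assumptions about how the gateway routes and authenticates requests, so verify them against your account before relying on this:
from openai import OpenAI

# Assumptions: the SDK appends /chat/completions to base_url, landing on the
# iApp endpoint, and credentials travel in the apikey header as in the cURL
# examples below. The SDK's own api_key argument is required but unused here.
client = OpenAI(
    base_url="https://api.iapp.co.th/v3/llm/deepseek-v4",
    api_key="unused",
    default_headers={"apikey": "YOUR_API_KEY"},
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",  # or "deepseek-v4-pro"
    messages=[{"role": "user", "content": "Hello! What can you help me with?"}],
)
print(response.choices[0].message.content)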
Getting Started
- Prerequisites
  - An API key from iApp Technology
  - Sufficient IC balance (Pro requests cost ~20× Flash — make sure your top-up reflects that)
- Quick Start
  - Single REST endpoint, OpenAI-compatible format
  - Both streaming and non-streaming supported
- Rate Limits (a backoff sketch follows below)
  - 5 requests per second
  - 200 requests per minute
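At 5 requests per second and 200 per minute, bursty workloads will occasionally be throttled. A minimal retry sketch; the assumption that throttled requests return HTTP 429 is ours, so adjust to the status code you actually observe:
import time
import requests

URL = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"
HEADERS = {"apikey": "YOUR_API_KEY", "Content-Type": "application/json"}

def post_with_backoff(payload: dict, max_retries: int = 5) -> dict:
    delay = 1.0
    for _ in range(max_retries):
        response = requests.post(URL, headers=HEADERS, json=payload)
        # Assumption: rate-limited requests come back as HTTP 429.
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        time.sleep(delay)
        delay *= 2  # back off: 1s, 2s, 4s, ...
    raise RuntimeError("Still rate-limited after retries")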
Visit API Key Management to view your existing API key or request a new one.
Only deepseek-v4-flash and deepseek-v4-pro are accepted on this endpoint. Sending any other value in the model field returns HTTP 400, which protects you from accidentally calling a more expensive model by typo.
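In practice that means a misspelled model name fails fast, which is worth surfacing as its own error rather than a generic failure:
import requests

URL = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"
HEADERS = {"apikey": "YOUR_API_KEY", "Content-Type": "application/json"}

response = requests.post(URL, headers=HEADERS, json={
    "model": "deepseek-v4-flsh",  # typo: rejected, no completion is generated
    "messages": [{"role": "user", "content": "Hello"}],
})
if response.status_code == 400:
    print("Rejected model name:", response.text)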
API Endpoints
| Endpoint | Method | Description | Cost |
|---|---|---|---|
| /v3/llm/deepseek-v4/chat/completions | POST | Chat completions for both V4 tiers (streaming & non-streaming) | See pricing per model |
Available Models
| Model | Description |
|---|---|
| deepseek-v4-flash | Fast, low-cost everyday tier (recommended default) |
| deepseek-v4-pro | Premium reasoning tier for the hardest problems |
Code Examples
cURL — Flash (non-streaming)
curl -X POST 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions' \
-H 'apikey: YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
"model": "deepseek-v4-flash",
"messages": [
{"role": "user", "content": "สวัสดีครับ คุณช่วยอะไรได้บ้าง?"}
],
"max_tokens": 4096,
"temperature": 0.7,
"top_p": 0.9
}'
cURL — Pro (streaming)
curl -X POST 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions' \
-H 'apikey: YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
"model": "deepseek-v4-pro",
"messages": [
{"role": "user", "content": "Prove that the sum of two odd numbers is even."}
],
"max_tokens": 4096,
"temperature": 0.0,
"top_p": 0.9,
"stream": true
}'
Python
import requests
url = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"
payload = {
"model": "deepseek-v4-flash", # use "deepseek-v4-pro" for hard reasoning
"messages": [
{"role": "user", "content": "สวัสดีครับ คุณช่วยอะไรได้บ้าง?"}
],
"max_tokens": 4096,
"temperature": 0.7,
"top_p": 0.9,
}
headers = {
"apikey": "YOUR_API_KEY",
"Content-Type": "application/json",
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
Python — Streaming
import json
import requests
url = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"
payload = {
"model": "deepseek-v4-pro",
"messages": [
{"role": "user", "content": "อธิบายเกี่ยวกับ AI ให้หน่อย"}
],
"max_tokens": 4096,
"temperature": 0.7,
"top_p": 0.9,
"stream": True,
}
headers = {
"apikey": "YOUR_API_KEY",
"Content-Type": "application/json",
}
response = requests.post(url, headers=headers, json=payload, stream=True)
for line in response.iter_lines():
if not line:
continue
line = line.decode("utf-8")
if line.startswith("data: "):
data = line[6:]
if data == "[DONE]":
break
chunk = json.loads(data)
content = chunk["choices"][0]["delta"].get("content", "")
print(content, end="", flush=True)
JavaScript / Node.js
const axios = require('axios');
const url = 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions';
const payload = {
model: 'deepseek-v4-flash', // or 'deepseek-v4-pro'
messages: [
{ role: 'user', content: 'สวัสดีครับ คุณช่วยอะไรได้บ้าง?' }
],
max_tokens: 4096,
temperature: 0.7,
top_p: 0.9,
};
const config = {
headers: {
apikey: 'YOUR_API_KEY',
'Content-Type': 'application/json',
},
};
axios.post(url, payload, config)
.then(response => {
console.log(response.data.choices[0].message.content);
})
.catch(error => {
console.error(error.response?.data || error.message);
});
PHP
<?php
$curl = curl_init();
$payload = json_encode([
'model' => 'deepseek-v4-flash',
'messages' => [
['role' => 'user', 'content' => 'สวัสดีครับ คุณช่วยอะไรได้บ้าง?']
],
'max_tokens' => 4096,
'temperature' => 0.7,
'top_p' => 0.9,
]);
curl_setopt_array($curl, [
CURLOPT_URL => 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions',
CURLOPT_RETURNTRANSFER => true,
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => $payload,
CURLOPT_HTTPHEADER => [
'apikey: YOUR_API_KEY',
'Content-Type: application/json',
],
]);
$response = curl_exec($curl);
curl_close($curl);
$result = json_decode($response, true);
echo $result['choices'][0]['message']['content'];
API Reference
Headers
| Parameter | Type | Required | Description |
|---|---|---|---|
| apikey | String | Yes | Your API key |
| Content-Type | String | Yes | application/json |
Request Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | String | Yes | deepseek-v4-flash or deepseek-v4-pro (other values are rejected) |
| messages | Array | Yes | Array of message objects with role and content |
| max_tokens | Integer | No | Maximum tokens to generate (default: 4096) |
| stream | Boolean | No | Enable streaming response (default: false) |
| temperature | Float | No | Sampling temperature 0–2 (default: 0.7) |
| top_p | Float | No | Nucleus sampling (default: 0.9) |
Message Object
| Field | Type | Description |
|---|---|---|
| role | String | system, user, or assistant |
| content | String | The message content |
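A typical messages array combines all three roles: a system instruction, the running conversation history, and the newest user turn. The content strings below are only illustrative:
messages = [
    {"role": "system", "content": "You are a helpful assistant. Answer in Thai when the user writes Thai."},
    {"role": "user", "content": "Which tier should I use for a customer-service chatbot?"},
    {"role": "assistant", "content": "Flash is the recommended default for everyday chat."},
    {"role": "user", "content": "And for multi-step math?"},  # the model sees the turns above as context
]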
Recommended Temperature Settings
| Use Case | Temperature |
|---|---|
| Coding / Math | 0.0 |
| Data Analysis | 1.0 |
| General Conversation | 1.3 |
| Translation | 1.3 |
| Creative Writing | 1.5 |
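If one client serves several workloads, the table above can live in code as a lookup, falling back to the API default of 0.7 for anything unlisted:
# Recommended temperatures from the table above; 0.7 is the API default.
TEMPERATURE_BY_USE_CASE = {
    "coding": 0.0,
    "math": 0.0,
    "data_analysis": 1.0,
    "general_conversation": 1.3,
    "translation": 1.3,
    "creative_writing": 1.5,
}

def temperature_for(use_case: str) -> float:
    return TEMPERATURE_BY_USE_CASE.get(use_case, 0.7)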
Pricing
| Tier | Model | Endpoint | IC Cost |
|---|---|---|---|
| Flash | deepseek-v4-flash | /v3/llm/deepseek-v4/chat/completions | Input: 0.01 IC / 1K tokens · Output: 0.02 IC / 1K tokens (~10 / 20 THB/1M) |
| Pro | deepseek-v4-pro | /v3/llm/deepseek-v4/chat/completions | Input: 0.20 IC / 1K tokens · Output: 0.40 IC / 1K tokens (~200 / 400 THB/1M) |
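To sanity-check spend before choosing a tier, you can estimate the IC cost of a request from its token counts using the per-1K rates above:
# IC per 1K tokens, from the pricing table above.
PRICING = {
    "deepseek-v4-flash": {"input": 0.01, "output": 0.02},
    "deepseek-v4-pro": {"input": 0.20, "output": 0.40},
}

def estimate_cost_ic(model: str, input_tokens: int, output_tokens: int) -> float:
    rates = PRICING[model]
    return input_tokens / 1000 * rates["input"] + output_tokens / 1000 * rates["output"]

# Example: a 2,000-token prompt with a 500-token answer.
print(estimate_cost_ic("deepseek-v4-flash", 2000, 500))  # 0.03 IC
print(estimate_cost_ic("deepseek-v4-pro", 2000, 500))    # 0.6 IC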
For on-premise deployment of DeepSeek V4, please contact us.
Use Cases
- Flash: chatbots, RAG over Thai documents, classification, structured extraction, customer service
- Pro: multi-step agents, code refactors, mathematical reasoning, hard policy questions, evaluation grading
Support
For support and questions:
- Discord: Join our community
- Email: support@iapp.co.th
- Documentation: Full API Docs