
DeepSeek V4

Powered by DeepSeek AI — Flash & Pro tiers on a single endpoint
Token-Based Pricing — Two Tiers

deepseek-v4-flash — Input: 0.01 IC / 1K tokens · Output: 0.02 IC / 1K tokens (~10 / 20 THB/1M)

deepseek-v4-pro — Input: 0.20 IC / 1K tokens · Output: 0.40 IC / 1K tokens (~200 / 400 THB/1M)

Endpoint: POST /v3/llm/deepseek-v4/chat/completions

Welcome to the iApp DeepSeek V4 API. A single OpenAI-compatible endpoint serves two model tiers: Flash for fast, everyday tasks at the same price as V3.2, and Pro for the hardest reasoning, coding, and agentic workloads. The tier is chosen per request via the model field, so you can mix and match tiers in production.

Choosing a Tier

| Tier | Model name | Best for | Price (input / output per 1K tokens) |
|------|------------|----------|--------------------------------------|
| Flash | deepseek-v4-flash | Chat, RAG, classification, drafting, everyday Thai/English Q&A | 0.01 / 0.02 IC |
| Pro | deepseek-v4-pro | Hard reasoning, multi-step agents, IMO/IOI-level math & coding | 0.20 / 0.40 IC |

Pro costs ~20× Flash. Use Flash by default and escalate to Pro only for the requests that actually need the deeper reasoning.
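
Because both tiers share one endpoint and one request shape, escalation can be a per-request routing decision. The sketch below is illustrative: the needs_deep_reasoning heuristic is our own placeholder, not part of the API; only the endpoint, the apikey header, and the model names come from this page.

import requests

API_URL = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"

def needs_deep_reasoning(prompt: str) -> bool:
    """Crude escalation rule of our own; replace with your real routing logic."""
    keywords = ("prove", "refactor", "step by step", "optimize")
    return any(k in prompt.lower() for k in keywords)

def chat(prompt: str, api_key: str) -> dict:
    # Route each request to the cheapest tier that can handle it.
    model = "deepseek-v4-pro" if needs_deep_reasoning(prompt) else "deepseek-v4-flash"
    resp = requests.post(
        API_URL,
        headers={"apikey": api_key, "Content-Type": "application/json"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
    )
    return resp.json()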

Overview

DeepSeek V4 is the next generation of DeepSeek's open-source LLM family, building on the sparse-attention and reinforcement-learning advances introduced in V3.2 and adding a stronger reasoning tier. The two model tiers live behind one endpoint:

  • Flash is tuned for latency and cost, suitable as a drop-in upgrade for V3.2 chat.
  • Pro is tuned for accuracy on hard problems — multi-step math, agentic tool-use, large-scale code refactors.

Key Features

  • Two tiers, one endpoint: Pick the tier per-request via the model field. No code change to switch.
  • OpenAI compatible: Same request/response shape — drop-in replacement for OpenAI clients.
  • Streaming support: Real-time token streaming for responsive UX.
  • Long context: Extended context window, suitable for long documents.
  • Thai-first: First-class Thai support alongside English and Chinese.
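
Because the request/response shape matches OpenAI's, the official openai Python SDK may also work by overriding base_url and injecting the apikey header. This page documents raw HTTP only, so treat the following as an unverified sketch.

from openai import OpenAI

# Sketch only: whether this endpoint accepts the OpenAI SDK configured
# this way is an assumption; the page documents raw HTTP requests.
client = OpenAI(
    api_key="unused",  # the SDK requires a value; auth goes via the header below
    base_url="https://api.iapp.co.th/v3/llm/deepseek-v4",  # SDK appends /chat/completions
    default_headers={"apikey": "YOUR_API_KEY"},
)

resp = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Hello! What can you help me with?"}],
)
print(resp.choices[0].message.content)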

Getting Started

  1. Prerequisites

    • An API key from iApp Technology
    • Sufficient IC balance (Pro requests cost ~20× Flash — make sure your top-up reflects that)
  2. Quick Start

    • Single REST endpoint, OpenAI-compatible format
    • Both streaming and non-streaming supported
  3. Rate Limits

    • 5 requests per second
    • 200 requests per minute
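
To stay under these caps, a client-side throttle is a cheap safeguard. The sliding-window limiter below is our own minimal sketch, not an official SDK; only the 5 requests/second and 200 requests/minute figures come from this page.

import time
from collections import deque

class RateLimiter:
    """Client-side sliding-window throttle for the documented caps
    (5 requests/second, 200 requests/minute). Our own sketch, not an SDK."""

    def __init__(self, per_second: int = 5, per_minute: int = 200):
        self.per_second = per_second
        self.per_minute = per_minute
        self.sent = deque()  # timestamps of recently sent requests

    def wait(self) -> None:
        while True:
            now = time.monotonic()
            # Drop timestamps older than one minute.
            while self.sent and now - self.sent[0] > 60:
                self.sent.popleft()
            in_last_second = sum(1 for t in self.sent if now - t < 1)
            if in_last_second < self.per_second and len(self.sent) < self.per_minute:
                self.sent.append(now)
                return
            time.sleep(0.05)

limiter = RateLimiter()
# Call limiter.wait() before each requests.post(...) to the endpoint.
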
How to get an API Key?

Visit API Key Management to view your existing API key or request a new one.

Only deepseek-v4-flash and deepseek-v4-pro are accepted on this endpoint.

Sending any other value in the model field returns HTTP 400, which protects you from accidentally calling a more expensive model through a typo.
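
You can mirror that allow-list client-side and fail fast before spending a request. The safe_model helper below is our own illustration; the two model IDs and the HTTP 400 behaviour come from this page.

import requests

ALLOWED_MODELS = {"deepseek-v4-flash", "deepseek-v4-pro"}

def safe_model(name: str) -> str:
    """Fail fast locally instead of waiting for the server's HTTP 400."""
    if name not in ALLOWED_MODELS:
        raise ValueError(f"Unknown model {name!r}; use one of {sorted(ALLOWED_MODELS)}")
    return name

resp = requests.post(
    "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions",
    headers={"apikey": "YOUR_API_KEY", "Content-Type": "application/json"},
    json={"model": safe_model("deepseek-v4-flash"),
          "messages": [{"role": "user", "content": "ping"}]},
)
if resp.status_code == 400:
    print("Rejected model name:", resp.text)  # the endpoint's typo guard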

API Endpoints

| Endpoint | Method | Description | Cost |
|----------|--------|-------------|------|
| /v3/llm/deepseek-v4/chat/completions | POST | Chat completions for both V4 tiers (streaming & non-streaming) | See pricing per model |

Available Models

| Model | Description |
|-------|-------------|
| deepseek-v4-flash | Fast, low-cost everyday tier (recommended default) |
| deepseek-v4-pro | Premium reasoning tier for the hardest problems |

Code Examples

cURL — Flash (non-streaming)

curl -X POST 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions' \
  -H 'apikey: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "สวัสดีครับ คุณช่วยอะไรได้บ้าง?"}
    ],
    "max_tokens": 4096,
    "temperature": 0.7,
    "top_p": 0.9
  }'

cURL — Pro (streaming)

curl -X POST 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions' \
  -H 'apikey: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [
      {"role": "user", "content": "Prove that the sum of two odd numbers is even."}
    ],
    "max_tokens": 4096,
    "temperature": 0.0,
    "top_p": 0.9,
    "stream": true
  }'

Python

import requests

url = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"

payload = {
    "model": "deepseek-v4-flash",  # use "deepseek-v4-pro" for hard reasoning
    "messages": [
        {"role": "user", "content": "สวัสดีครับ คุณช่วยอะไรได้บ้าง?"}
    ],
    "max_tokens": 4096,
    "temperature": 0.7,
    "top_p": 0.9,
}

headers = {
    "apikey": "YOUR_API_KEY",
    "Content-Type": "application/json",
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())

Python — Streaming

import json
import requests

url = "https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions"

payload = {
    "model": "deepseek-v4-pro",
    "messages": [
        {"role": "user", "content": "อธิบายเกี่ยวกับ AI ให้หน่อย"}
    ],
    "max_tokens": 4096,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": True,
}

headers = {
    "apikey": "YOUR_API_KEY",
    "Content-Type": "application/json",
}

response = requests.post(url, headers=headers, json=payload, stream=True)

# The stream is server-sent events: each line is "data: <json chunk>",
# and the stream ends with a "data: [DONE]" sentinel.
for line in response.iter_lines():
    if not line:
        continue  # skip blank separator lines between SSE events
    line = line.decode("utf-8")
    if line.startswith("data: "):
        data = line[6:]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        content = chunk["choices"][0]["delta"].get("content", "")
        print(content, end="", flush=True)

JavaScript / Node.js

const axios = require('axios');

const url = 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions';

const payload = {
  model: 'deepseek-v4-flash', // or 'deepseek-v4-pro'
  messages: [
    { role: 'user', content: 'สวัสดีครับ คุณช่วยอะไรได้บ้าง?' }
  ],
  max_tokens: 4096,
  temperature: 0.7,
  top_p: 0.9,
};

const config = {
  headers: {
    apikey: 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
};

axios.post(url, payload, config)
  .then(response => {
    console.log(response.data.choices[0].message.content);
  })
  .catch(error => {
    console.error(error.response?.data || error.message);
  });

PHP

<?php

$curl = curl_init();

$payload = json_encode([
    'model' => 'deepseek-v4-flash',
    'messages' => [
        ['role' => 'user', 'content' => 'สวัสดีครับ คุณช่วยอะไรได้บ้าง?']
    ],
    'max_tokens' => 4096,
    'temperature' => 0.7,
    'top_p' => 0.9,
]);

curl_setopt_array($curl, [
    CURLOPT_URL => 'https://api.iapp.co.th/v3/llm/deepseek-v4/chat/completions',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $payload,
    CURLOPT_HTTPHEADER => [
        'apikey: YOUR_API_KEY',
        'Content-Type: application/json',
    ],
]);

$response = curl_exec($curl);
curl_close($curl);

$result = json_decode($response, true);
echo $result['choices'][0]['message']['content'];

API Reference

Headers

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| apikey | String | Yes | Your API key |
| Content-Type | String | Yes | application/json |

Request Body Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | String | Yes | deepseek-v4-flash or deepseek-v4-pro (other values are rejected) |
| messages | Array | Yes | Array of message objects with role and content |
| max_tokens | Integer | No | Maximum tokens to generate (default: 4096) |
| stream | Boolean | No | Enable streaming response (default: false) |
| temperature | Float | No | Sampling temperature, 0–2 (default: 0.7) |
| top_p | Float | No | Nucleus sampling (default: 0.9) |

Message Object

| Field | Type | Description |
|-------|------|-------------|
| role | String | system, user, or assistant |
| content | String | The message content |
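
For illustration, a typical multi-turn messages array combines the three roles. The conversation content below is our own example; only the role names and the field structure come from the table above.

# Illustrative multi-turn conversation: prior assistant turns are
# included so the model sees the full dialogue history.
messages = [
    {"role": "system", "content": "You are a concise bilingual Thai/English assistant."},
    {"role": "user", "content": "Summarise this contract clause."},
    {"role": "assistant", "content": "It limits liability to direct damages."},
    {"role": "user", "content": "And what does it exclude?"},
]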

Recommended Temperature Settings

| Use Case | Temperature |
|----------|-------------|
| Coding / Math | 0.0 |
| Data Analysis | 1.0 |
| General Conversation | 1.3 |
| Translation | 1.3 |
| Creative Writing | 1.5 |

Pricing

| Tier | Model | Endpoint | IC Cost |
|------|-------|----------|---------|
| Flash | deepseek-v4-flash | /v3/llm/deepseek-v4/chat/completions | Input: 0.01 IC / 1K tokens (~10 THB/1M) · Output: 0.02 IC / 1K tokens (~20 THB/1M) |
| Pro | deepseek-v4-pro | /v3/llm/deepseek-v4/chat/completions | Input: 0.20 IC / 1K tokens (~200 THB/1M) · Output: 0.40 IC / 1K tokens (~400 THB/1M) |
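
Per-request cost is simply tokens × the per-1K rate: for example, 2,000 input plus 500 output tokens on Pro costs 2 × 0.20 + 0.5 × 0.40 = 0.60 IC. The helper below is a quick sanity-check sketch using the published rates, not billing code.

# Published per-1K-token rates from the pricing table above.
RATES = {
    "deepseek-v4-flash": {"input": 0.01, "output": 0.02},
    "deepseek-v4-pro": {"input": 0.20, "output": 0.40},
}

def estimate_cost_ic(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in IC; actual billing is done server-side."""
    r = RATES[model]
    return input_tokens / 1000 * r["input"] + output_tokens / 1000 * r["output"]

print(estimate_cost_ic("deepseek-v4-pro", 2000, 500))    # 0.6 IC
print(estimate_cost_ic("deepseek-v4-flash", 2000, 500))  # 0.03 IC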

For on-premise deployment of DeepSeek V4, please contact us.

Use Cases

  • Flash: chatbots, RAG over Thai documents, classification, structured extraction, customer service
  • Pro: multi-step agents, code refactors, mathematical reasoning, hard policy questions, evaluation grading

Support

For support and questions: