DeepSeek Models
Explore the DeepSeek language and embedding models available through our OpenAI Assistants API-compatible service.
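As a quick orientation, here is a minimal sketch of calling one of these models through the OpenAI Python SDK pointed at the compatible endpoint. The base URL, API key placeholder, and model identifier are assumptions for illustration; substitute the values from your dashboard and the model pages below.

```python
from openai import OpenAI

# Hypothetical base URL and model slug; replace with the values from your account.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# Create an assistant backed by a DeepSeek model, then run a short thread.
assistant = client.beta.assistants.create(
    model="deepseek/deepseek-chat-v3.1",  # assumed model identifier
    instructions="You are a concise technical assistant.",
)
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Summarize the difference between thinking and non-thinking modes.",
)
run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)

if run.status == "completed":
    for message in client.beta.threads.messages.list(thread_id=thread.id):
        print(message.role, message.content[0].text.value)
```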
DeepSeek: DeepSeek V3.2 Exp
- Context Length: 163,840 tokens
- Architecture: text->text
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism designed to improve training and inference efficiency in long-context scenarios while maintaining output quality. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.
The model was trained under conditions aligned with V3.1-Terminus to enable direct comparison. Benchmarking shows performance roughly on par with V3.1 across reasoning, coding, and agentic tool-use tasks, with minor tradeoffs and gains depending on the domain. This release focuses on validating architectural optimizations for extended context lengths rather than advancing raw task accuracy, making it primarily a research-oriented model for exploring efficient transformer designs.
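The snippet below is a minimal sketch of toggling reasoning on a request to this model. The base URL, the model slug, and the exact shape of the reasoning field are assumptions; check the docs for the parameter your plan exposes.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2-exp",  # assumed model identifier
    messages=[{"role": "user", "content": "Walk through 17 * 24 step by step."}],
    # Assumed provider-specific field for the reasoning enabled boolean;
    # extra_body passes it through to the API unchanged by the SDK.
    extra_body={"reasoning": {"enabled": True}},
)
print(response.choices[0].message.content)
```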
DeepSeek: DeepSeek V3.1 Terminus
- Context Length: 163,840 tokens
- Architecture: text->text
- Max Output: 163,840 tokens
DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, and further optimizing performance in coding and search agents. It is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.
The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
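To illustrate the structured tool calling mentioned above, here is a sketch using the standard OpenAI function-calling format. The search_docs tool, base URL, and model slug are hypothetical.

```python
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

# Hypothetical tool definition in the standard OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search internal documentation for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="deepseek/deepseek-v3.1-terminus",  # assumed model identifier
    messages=[{"role": "user", "content": "Find the section on FP8 microscaling."}],
    tools=tools,
)

# If the model chose to call the tool, its arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```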
DeepSeek: DeepSeek V3.1 (free)
- Context Length: 163,800 tokens
- Architecture: text->text
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.
The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
It succeeds the DeepSeek V3-0324 model and performs well on a variety of tasks.
DeepSeek: DeepSeek V3.1
- Context Length: 131,072 tokens
- Architecture: text->text
- Max Output: 32,768 tokens
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.
The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
It succeeds the DeepSeek V3-0324 model and performs well on a variety of tasks.
DeepSeek: DeepSeek R1 0528 Qwen3 8B (free)
- Context Length: 131,072 tokens
- Architecture: text->text
DeepSeek-R1-0528 is an incremental upgrade to DeepSeek R1 that uses additional compute and improved post-training techniques, bringing its reasoning and inference close to flagship models such as OpenAI o3 and Gemini 2.5 Pro.
It now tops math, programming, and logic leaderboards, showing a clear step up in depth of reasoning.
The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought capability into an 8B-parameter model, outperforming the standard Qwen3 8B by roughly 10 percentage points and matching the 235B "thinking" model on AIME 2024.
DeepSeek: DeepSeek R1 0528 Qwen3 8B
- Context Length: 32,768 tokens
- Architecture: text->text
- Max Output: 32,768 tokens
DeepSeek-R1-0528 is an incremental upgrade to DeepSeek R1 that uses additional compute and improved post-training techniques, bringing its reasoning and inference close to flagship models such as OpenAI o3 and Gemini 2.5 Pro.
It now tops math, programming, and logic leaderboards, showing a clear step up in depth of reasoning.
The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought capability into an 8B-parameter model, outperforming the standard Qwen3 8B by roughly 10 percentage points and matching the 235B "thinking" model on AIME 2024.
DeepSeek: R1 0528 (free)
- Context Length: 163,840 tokens
- Architecture: text->text
The May 28th update to the original DeepSeek R1. Performance is on par with OpenAI o1, but the model is open-sourced and ships with fully open reasoning tokens. It is 671B parameters in size, with 37B active per inference pass.
Fully open-source model.
DeepSeek: R1 0528
- Context Length: 163,840 tokens
- Architecture: text->text
- Max Output: 163,840 tokens
The May 28th update to the original DeepSeek R1. Performance is on par with OpenAI o1, but the model is open-sourced and ships with fully open reasoning tokens. It is 671B parameters in size, with 37B active per inference pass.
Fully open-source model.
DeepSeek: DeepSeek Prover V2
- Context Length: 163,840 tokens
- Architecture: text->text
DeepSeek Prover V2 is a 671B-parameter model, speculated to be geared towards logic and mathematics, and likely an upgrade from DeepSeek-Prover-V1.5. Not much is known about the model yet, as DeepSeek released it on Hugging Face without an announcement or description.
DeepSeek: DeepSeek V3 0324 (free)
- Context Length: 163,840 tokens
- Architecture: text->text
DeepSeek V3 0324 is a 685B-parameter mixture-of-experts model and the latest iteration of the flagship chat model family from the DeepSeek team.
It succeeds the original DeepSeek V3 model and performs well on a variety of tasks.
DeepSeek: DeepSeek V3 0324
- Context Length: 163,840 tokens
- Architecture: text->text
- Max Output: 163,840 tokens
DeepSeek V3 0324 is a 685B-parameter mixture-of-experts model and the latest iteration of the flagship chat model family from the DeepSeek team.
It succeeds the original DeepSeek V3 model and performs well on a variety of tasks.
DeepSeek: R1 Distill Qwen 32B
- Context Length: 131,072 tokens
- Architecture: text->text
- Max Output: 16,384 tokens
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on Qwen 2.5 32B, using outputs from DeepSeek R1. It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
Other benchmark results include:
- AIME 2024 pass@1: 72.6
- MATH-500 pass@1: 94.3
- CodeForces Rating: 1691
The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.
DeepSeek: R1 Distill Qwen 14B
- Context Length: 32,768 tokens
- Architecture: text->text
- Max Output: 16,384 tokens
DeepSeek R1 Distill Qwen 14B is a distilled large language model based on Qwen 2.5 14B, using outputs from DeepSeek R1. It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
Other benchmark results include:
- AIME 2024 pass@1: 69.7
- MATH-500 pass@1: 93.9
- CodeForces Rating: 1481
The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.
DeepSeek: R1 Distill Llama 70B (free)
- Context Length: 8,192 tokens
- Architecture: text->text
- Max Output: 4,096 tokens
DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including:
- AIME 2024 pass@1: 70.0
- MATH-500 pass@1: 94.5
- CodeForces Rating: 1633
The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.
DeepSeek: R1 Distill Llama 70B
- Context Length: 131,072 tokens
- Architecture: text->text
- Max Output: 131,072 tokens
DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including:
- AIME 2024 pass@1: 70.0
- MATH-500 pass@1: 94.5
- CodeForces Rating: 1633
The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.
DeepSeek: R1 (free)
- Context Length: 163,840 tokens
- Architecture: text->text
DeepSeek R1 delivers performance on par with OpenAI o1, but is open-sourced and ships with fully open reasoning tokens. It is 671B parameters in size, with 37B active per inference pass.
Fully open-source model & technical report.
MIT licensed: Distill & commercialize freely!
DeepSeek: R1
- Context Length: 163,840 tokens
- Architecture: text->text
- Max Output: 163,840 tokens
DeepSeek R1 delivers performance on par with OpenAI o1, but is open-sourced and ships with fully open reasoning tokens. It is 671B parameters in size, with 37B active per inference pass.
Fully open-source model & technical report.
MIT licensed: Distill & commercialize freely!
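Because R1 returns its reasoning tokens, a response can be split into the visible answer and the reasoning trace. The sketch below assumes the trace is exposed in a reasoning_content field, as in DeepSeek's own API; the field name on this service may differ, so treat it as an assumption.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",  # assumed model identifier
    messages=[{"role": "user", "content": "How many primes lie between 10 and 30?"}],
)

message = response.choices[0].message
# Field name is an assumption; some providers expose the trace under a different key.
print("Reasoning:", getattr(message, "reasoning_content", None))
print("Answer:", message.content)
```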
DeepSeek: DeepSeek V3
- Context Length: 163,840 tokens
- Architecture: text->text
- Max Output: 163,840 tokens
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations reveal that the model outperforms other open-source models and rivals leading closed-source models.
For more details, visit the DeepSeek-V3 repo or see the launch announcement.
Ready to build with DeepSeek?
Start using these powerful models in your applications with our flexible pricing plans.