Models by MoonshotAI for RAG Use Cases

MoonshotAI: Kimi K2 0905

Context Length: 262,144 tokens
Architecture: text->text
Max Output: 262,144 tokens

Pricing:

Prompt: $0.00000039 per token

Completion: $0.0000019 per token
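To make the prices above easier to reason about, here is a minimal sketch of the cost arithmetic for a single request. It assumes the listed figures are USD per token, and the function name `request_cost` and the example token counts are illustrative, not part of any provider API.

```python
from decimal import Decimal

# Prices copied from the listing above; assumed to be USD per token.
PROMPT_PRICE = Decimal("0.00000039")
COMPLETION_PRICE = Decimal("0.0000019")


def request_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE


if __name__ == "__main__":
    # Hypothetical RAG request: a large retrieved context and a short answer.
    cost = request_cost(prompt_tokens=100_000, completion_tokens=2_000)
    print(f"Estimated cost: ${cost:.4f}")
```

`Decimal` is used instead of `float` so that very small per-token prices do not pick up binary rounding noise when summed over many requests.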

Kimi K2 0905 is the September update of Kimi K2 0711. It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It supports long-context inference up to 256K tokens (262,144), extended from the previous 128K.

This update improves agentic coding with higher accuracy and better generalization across scaffolds, and enhances frontend coding with more aesthetic and functional outputs for web, 3D, and related tasks. Kimi K2 is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. It excels across coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) benchmarks. The model is trained with a novel stack incorporating the MuonClip optimizer for stable large-scale MoE training.
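For RAG use, the 262,144-token context length listed above has to hold the question, the retrieved passages, and room for the answer. The sketch below shows one simple way to budget for that, under several assumptions: prompt and output share the same window, chunks arrive already ranked best-first, and token counts are approximated at roughly four characters per token. The helpers `estimate_tokens` and `pack_chunks` are hypothetical; a real pipeline would count tokens with the model's own tokenizer.

```python
from typing import List, Tuple

CONTEXT_LENGTH = 262_144  # tokens, taken from the listing above


def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_chunks(
    chunks: List[str],
    question: str,
    reserved_output: int = 4_096,
    context_length: int = CONTEXT_LENGTH,
) -> Tuple[List[str], int]:
    """Greedily keep ranked chunks until the prompt budget is exhausted.

    Returns the selected chunks and the estimated tokens they use.
    """
    budget = context_length - reserved_output - estimate_tokens(question)
    selected: List[str] = []
    used = 0
    for chunk in chunks:  # assumed to be sorted best-first by the retriever
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that no longer fits
        selected.append(chunk)
        used += cost
    return selected, used
```

Stopping at the first chunk that does not fit keeps the ranking order intact; skipping it and trying smaller chunks would use the window more fully at the cost of including lower-ranked passages.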

MoonshotAI: Kimi K2 0711 (free)

Context Length: 32,768 tokens
Architecture: text->text

Pricing: Free

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. Kimi K2 excels across a broad range of benchmarks, particularly in coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) tasks. The model supports long-context inference up to 128K tokens, although this free listing offers a 32,768-token context. It is trained with a novel stack that includes the MuonClip optimizer for stable large-scale MoE training.

MoonshotAI: Kimi K2 0711

Context Length: 63,000 tokens
Architecture: text->text
Max Output: 63,000 tokens

Pricing:

Prompt: $0.00000014 per token

Completion: $0.00000249 per token

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. Kimi K2 excels across a broad range of benchmarks, particularly in coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) tasks. The model supports long-context inference up to 128K tokens, although this listing offers a 63,000-token context. It is trained with a novel stack that includes the MuonClip optimizer for stable large-scale MoE training.

MoonshotAI: Kimi Dev 72B (free)

Context Length: 131,072 tokens
Architecture: text->text

Pricing: Free

Kimi-Dev-72B is an open-source large language model fine-tuned for software engineering and issue resolution tasks. Based on Qwen2.5-72B, it is optimized using large-scale reinforcement learning that applies code patches in real repositories and validates them via full test suite execution, rewarding only correct, robust completions. The model achieves 60.4% on SWE-bench Verified, setting a new state of the art among open-source models for software bug fixing and code reasoning.
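The reward scheme described above, where a patch earns credit only if the full test suite passes, can be illustrated with a binary outcome reward. This is a minimal sketch of that idea, not Moonshot AI's actual training code: the function name `outcome_reward`, the 0/1 reward values, and the treatment of an empty suite are all assumptions.

```python
from typing import Sequence


def outcome_reward(test_results: Sequence[bool]) -> float:
    """Return 1.0 only when every test in the suite passed, else 0.0.

    `test_results` holds one pass/fail flag per test, collected after the
    candidate patch has been applied to the repository (not shown here).
    """
    if not test_results:
        # Assumption: an empty suite gives no evidence of correctness.
        return 0.0
    return 1.0 if all(test_results) else 0.0


if __name__ == "__main__":
    print(outcome_reward([True, True, True]))   # every test passed
    print(outcome_reward([True, False, True]))  # one failing test
```

An all-or-nothing signal like this gives no partial credit for patches that fix some tests and break others, which matches the "rewarding only correct, robust completions" wording in the description.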

MoonshotAI: Kimi Dev 72B

Context Length: 131,072 tokens
Architecture: text->text
Max Output: 131,072 tokens

Pricing:

Prompt: $0.00000029 per token

Completion: $0.00000115 per token

Kimi-Dev-72B is an open-source large language model fine-tuned for software engineering and issue resolution tasks. Based on Qwen2.5-72B, it is optimized using large-scale reinforcement learning that applies code patches in real repositories and validates them via full test suite execution, rewarding only correct, robust completions. The model achieves 60.4% on SWE-bench Verified, setting a new state of the art among open-source models for software bug fixing and code reasoning.
