xAI Models
Explore the xAI language and embedding models available through our OpenAI Assistants API-compatible service.
xAI: Grok 4 Fast
- Context Length: 2,000,000 tokens
- Architecture: text+image->text
- Max Output: 30,000 tokens
Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M-token context window. It comes in two flavors: non-reasoning and reasoning. Read more about the model in xAI's news post. Reasoning can be enabled using the reasoning enabled parameter in the API (see the example below); learn more in our docs.
Prompts and completions on Grok 4 Fast Free may be used by xAI or OpenRouter to improve future models.
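As a rough illustration, here is a minimal sketch of enabling the reasoning flavor on Grok 4 Fast. It assumes an OpenAI-compatible chat completions endpoint and a standard response shape; the API_URL and MODEL_ID shown are placeholders, not confirmed values, so consult the docs for your account's actual settings.

```python
import os
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
MODEL_ID = "x-ai/grok-4-fast"                             # placeholder model ID

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": "Summarize the key risks in this contract."}],
        # Opt into the reasoning flavor; omit this field for the non-reasoning flavor.
        "reasoning": {"enabled": True},
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```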
xAI: Grok Code Fast 1
- Context Length: 256,000 tokens
- Architecture: text->text
- Max Output: 10,000 tokens
Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. With reasoning traces visible in the response, developers can steer Grok Code toward high-quality workflows.
xAI: Grok 4
- Context Length: 256,000 tokens
- Architecture: text+image->text
Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not exposed, cannot be disabled, and its effort cannot be specified. Pricing increases once the total tokens in a given request exceed 128k. See more details in the xAI docs.
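To illustrate the mixed image and text input, here is a minimal sketch using OpenAI-style content parts. The endpoint URL and model ID are placeholders, and the request assumes an OpenAI-compatible chat completions API.

```python
import os
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
MODEL_ID = "x-ai/grok-4"                                  # placeholder model ID

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": MODEL_ID,
        "messages": [{
            "role": "user",
            # Mixed text + image input using OpenAI-style content parts.
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }],
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```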
xAI: Grok 3 Mini
- Context Length: 131,072 tokens
- Architecture: text->text
A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible.
xAI: Grok 3
- Context Length: 131,072 tokens
- Architecture: text->text
Grok 3 is xAI's flagship model, excelling at enterprise use cases like data extraction, coding, and text summarization. It possesses deep domain knowledge in finance, healthcare, law, and science.
xAI: Grok 3 Mini Beta
- Context Length: 131,072 tokens
- Architecture: text->text
Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that generate answers immediately, Grok 3 Mini thinks before responding. It’s ideal for reasoning-heavy tasks that don’t demand extensive domain knowledge, and shines in math-specific and quantitative use cases, such as solving challenging puzzles or math problems.
Transparent "thinking" traces are accessible. Reasoning defaults to low effort; boost it by setting reasoning: { effort: "high" }.
Note: There are two xAI endpoints for this model. By default, requests are routed to the base endpoint. To use the fast endpoint, add provider: { sort: "throughput" } to sort by throughput instead.
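For reference, a minimal sketch of boosting reasoning effort as described above. It assumes an OpenAI-compatible chat completions request; the endpoint URL and model ID are placeholders.

```python
import os
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
MODEL_ID = "x-ai/grok-3-mini-beta"                        # placeholder model ID

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": "If 3x + 7 = 25, what is x? Show your steps."}],
        # Reasoning defaults to low effort; request deeper thinking for harder problems.
        "reasoning": {"effort": "high"},
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```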
xAI: Grok 3 Beta
- Context Length: 131,072 tokens
- Architecture: text->text
Grok 3 is xAI's flagship model, excelling at enterprise use cases like data extraction, coding, and text summarization. It possesses deep domain knowledge in finance, healthcare, law, and science.
It performs strongly on structured tasks and benchmarks like GPQA, LCB, and MMLU-Pro, outperforming Grok 3 Mini even at high reasoning effort.
Note: There are two xAI endpoints for this model. By default, requests are routed to the base endpoint. To use the fast endpoint, add provider: { sort: "throughput" } to sort by throughput instead.
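Here is a minimal sketch of opting into the fast endpoint via throughput sorting, again with a placeholder endpoint URL and model ID and assuming an OpenAI-compatible chat completions request.

```python
import os
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
MODEL_ID = "x-ai/grok-3-beta"                             # placeholder model ID

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": "Extract the parties, dates, and amounts from this invoice text."}],
        # Prefer the higher-throughput ("fast") endpoint over the default base endpoint.
        "provider": {"sort": "throughput"},
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```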
Ready to build with xAI?
Start using these powerful models in your applications with our flexible pricing plans.