Anthropic Models

Explore the Anthropic language and embedding models available through our OpenAI Assistants API-compatible service. Prompt, completion, and cache prices below are listed in USD per token.


Anthropic: Claude Haiku 4.5

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
64,000 tokens

Pricing:

Prompt: $0.000001
Completion: $0.000005
Input cache read: $0.0000001
Input cache write: $0.00000125

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance across reasoning, coding, and computer-use tasks, Haiku 4.5 brings frontier-level capability to real-time and high-volume applications.

It introduces extended thinking to the Haiku line, enabling controllable reasoning depth, summarized or interleaved thought output, and tool-assisted workflows with full support for coding, bash, web search, and computer-use tools. Scoring above 73% on SWE-bench Verified, Haiku 4.5 ranks among the world’s best coding models while maintaining exceptional responsiveness for sub-agents, parallelized execution, and scaled deployment.
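As a rough illustration of how the per-token prices in this listing translate into a per-request cost, here is a minimal sketch. The prices are copied from the Claude Haiku 4.5 entry; the function name `estimate_cost`, the token counts, and the assumption that cached tokens are billed separately from uncached prompt tokens are illustrative and not taken from the service's billing documentation.

```python
from decimal import Decimal

# Per-token USD prices copied from the Claude Haiku 4.5 listing.
HAIKU_4_5_PRICES = {
    "prompt": Decimal("0.000001"),
    "completion": Decimal("0.000005"),
    "cache_read": Decimal("0.0000001"),
    "cache_write": Decimal("0.00000125"),
}


def estimate_cost(prices, prompt_tokens=0, completion_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Estimate the USD cost of one request as a Decimal.

    Assumption: each token category is billed independently at its
    listed rate. Actual billing rules may differ.
    """
    return (
        prices["prompt"] * prompt_tokens
        + prices["completion"] * completion_tokens
        + prices["cache_read"] * cache_read_tokens
        + prices["cache_write"] * cache_write_tokens
    )


# Hypothetical request: 10k uncached prompt tokens, 50k tokens read
# from cache, and 2k completion tokens.
cost = estimate_cost(
    HAIKU_4_5_PRICES,
    prompt_tokens=10_000,
    completion_tokens=2_000,
    cache_read_tokens=50_000,
)
print(f"Estimated cost: ${cost}")
```

Decimal is used instead of float so that very small per-token prices do not accumulate rounding error. The same function works for any other entry in this list by swapping in that model's price table.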

Anthropic: Claude Sonnet 4.5

Context Length:
1,000,000 tokens
Architecture:
text+image->text
Max Output:
64,000 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000015

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with improvements across system design, code security, and specification adherence. The model is designed for extended autonomous operation, maintaining task continuity across sessions and providing fact-based progress tracking.

Sonnet 4.5 also introduces stronger agentic capabilities, including improved tool orchestration, speculative parallel execution, and more efficient context and memory management. With enhanced context tracking and awareness of token usage across tool calls, it is particularly well-suited for multi-context and long-running workflows. Use cases span software engineering, cybersecurity, financial analysis, research agents, and other domains requiring sustained reasoning and tool use.
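For long-running workflows it can help to check a request against the listed limits before sending it. The sketch below uses the Context Length and Max Output figures from the Claude Sonnet 4.5 entry. It assumes that prompt and output tokens share the same context window, which this listing does not state explicitly; `ModelLimits` and `clamp_max_output` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    context_length: int  # total tokens the model can attend to
    max_output: int      # largest completion the model will produce


# Figures copied from the Claude Sonnet 4.5 listing.
SONNET_4_5 = ModelLimits(context_length=1_000_000, max_output=64_000)


def clamp_max_output(limits, prompt_tokens, requested_output):
    """Return the largest output budget that respects both limits.

    Assumption: prompt and output tokens count against one shared
    context window. Raises ValueError if the prompt alone fills it.
    """
    if prompt_tokens >= limits.context_length:
        raise ValueError("prompt does not fit in the context window")
    remaining = limits.context_length - prompt_tokens
    return min(requested_output, limits.max_output, remaining)


# A 100k-token prompt asking for 100k output tokens is capped by Max Output.
print(clamp_max_output(SONNET_4_5, 100_000, 100_000))
# A 990k-token prompt leaves only part of the window for the reply.
print(clamp_max_output(SONNET_4_5, 990_000, 64_000))
```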

Anthropic: Claude Opus 4.1

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
32,000 tokens

Pricing:

Prompt: $0.000015
Completion: $0.000075
Image: $0.024
Input cache read: $0.0000015
Input cache write: $0.00001875

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains in multi-file code refactoring, debugging precision, and detail-oriented reasoning. The model supports extended thinking up to 64K tokens and is optimized for tasks involving research, data analysis, and tool-assisted reasoning.

Anthropic: Claude Opus 4

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
32,000 tokens

Pricing:

Prompt: $0.000015
Completion: $0.000075
Image: $0.024
Input cache read: $0.0000015
Input cache write: $0.00001875

Claude Opus 4 was benchmarked as the world’s best coding model at the time of its release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in software engineering, achieving leading results on SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended, agentic workflows, handling thousands of task steps continuously for hours without degradation.

Read more at the blog post here

Anthropic: Claude Sonnet 4

Context Length:
1,000,000 tokens
Architecture:
text+image->text
Max Output:
64,000 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000015
Image: $0.0048
Input cache read: $0.0000003
Input cache write: $0.00000375

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions. Sonnet 4 is optimized for practical everyday use, providing advanced reasoning capabilities while maintaining efficiency and responsiveness in diverse internal and external scenarios.

Read more at the blog post here

Anthropic: Claude 3.7 Sonnet

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
64,000 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000015
Image: $0.0048
Input cache read: $0.0000003
Input cache write: $0.00000375

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes.

Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks.

Read more at the blog post here

Anthropic: Claude 3.7 Sonnet (thinking)

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
64,000 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000015
Image: $0.0048
Input cache read: $0.0000003
Input cache write: $0.00000375

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes.

Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks.

Read more at the blog post here

Anthropic: Claude 3.5 Haiku

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
8,192 tokens

Pricing:

Prompt: $0.0000008
Completion: $0.000004
Web search: $0.01
Input cache read: $0.00000008
Input cache write: $0.000001

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic tasks such as chat interactions and immediate coding suggestions.

This makes it highly suitable for environments that demand both speed and precision, such as software development, customer service bots, and data management systems.

This model is currently pointing to Claude 3.5 Haiku (2024-10-22).
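Because an undated entry points to a dated snapshot and may be repointed later, applications that need reproducible behavior may prefer to resolve the alias once and pin the dated version. The sketch below shows the idea. The identifier strings and the `ALIASES` table are hypothetical placeholders, not the service's actual model IDs.

```python
# Hypothetical alias table: undated name -> dated snapshot it currently points to.
# The identifier strings are placeholders for illustration only.
ALIASES = {
    "claude-3.5-haiku": "claude-3.5-haiku-20241022",
}


def resolve_model(model_id):
    """Return the pinned snapshot for an alias, or the ID unchanged if it is already pinned."""
    return ALIASES.get(model_id, model_id)


pinned = resolve_model("claude-3.5-haiku")
print(pinned)
```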

Anthropic: Claude 3.5 Haiku (2024-10-22)

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
8,192 tokens

Pricing:

Prompt: $0.0000008
Completion: $0.000004
Input cache read: $0.00000008
Input cache write: $0.000001

Claude 3.5 Haiku features enhancements across all skill sets including coding, tool use, and reasoning. As the fastest model in the Anthropic lineup, it offers rapid response times suitable for applications that require high interactivity and low latency, such as user-facing chatbots and on-the-fly code completions. It also excels in specialized tasks like data extraction and real-time content moderation, making it a versatile tool for a broad range of industries.

It does not support image inputs.

See the launch announcement and benchmark results here

Anthropic: Claude 3.5 Sonnet

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
8,192 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000015
Image: $0.0048

New Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at:

  • Coding: Scores ~49% on SWE-Bench Verified, higher than the previous best score, without any elaborate prompt scaffolding
  • Data science: Augments human data science expertise; navigates unstructured data while using multiple tools for insights
  • Visual processing: Excels at interpreting charts, graphs, and images, accurately transcribing text to derive insights beyond the text alone
  • Agentic tasks: Exceptional tool use makes it well suited to agentic tasks (i.e. complex, multi-step problem-solving tasks that require engaging with other systems)

#multimodal

Anthropic: Claude 3.5 Sonnet (2024-06-20)

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
8,192 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000015
Image: $0.0048
Input cache read: $0.0000003
Input cache write: $0.00000375

Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at:

  • Coding: Autonomously writes, edits, and runs code with reasoning and troubleshooting
  • Data science: Augments human data science expertise; navigates unstructured data while using multiple tools for insights
  • Visual processing: Excels at interpreting charts, graphs, and images, accurately transcribing text to derive insights beyond the text alone
  • Agentic tasks: Exceptional tool use makes it well suited to agentic tasks (i.e. complex, multi-step problem-solving tasks that require engaging with other systems)

For the latest version (2024-10-23), check out Claude 3.5 Sonnet.

#multimodal

Anthropic: Claude 3 Haiku

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.00000025
Completion: $0.00000125
Image: $0.0004
Input cache read: $0.00000003
Input cache write: $0.0000003

Claude 3 Haiku is Anthropic's fastest and most compact model, built for near-instant responsiveness and quick, accurate, targeted performance.

See the launch announcement and benchmark results here

#multimodal

Anthropic: Claude 3 Opus

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.000015
Completion: $0.000075
Image: $0.024
Input cache read: $0.0000015
Input cache write: $0.00001875

Claude 3 Opus is Anthropic's most powerful model for highly complex tasks. It boasts top-level performance, intelligence, fluency, and understanding.

See the launch announcement and benchmark results here

#multimodal
