OpenAI Models

Explore the OpenAI language and embedding models available through our OpenAI Assistants API-compatible service.

OpenAI logo

OpenAI: GPT-5 Image Mini

Context Length:
400,000 tokens
Architecture:
text+image->text+image
Max Output:
128,000 tokens

Pricing:

Prompt: $0.0000025
Completion: $0.000002
Image: $0.0000025
Web search: $0.01
Input cache read: $0.00000025

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by GPT-5 Mini, with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text rendering, and detailed image editing with reduced latency and cost. It excels at high-quality visual creation while maintaining strong text understanding, making it ideal for applications that require both efficient image generation and text processing at scale.

OpenAI: GPT-5 Image

Context Length:
400,000 tokens
Architecture:
text+image->text+image
Max Output:
128,000 tokens

Pricing:

Prompt: $0.00001
Completion: $0.00001
Image: $0.00001
Web search: $0.01
Input cache read: $0.00000125

GPT-5 Image combines OpenAI's most advanced language model with state-of-the-art image generation capabilities. It offers major improvements in reasoning, code quality, and user experience while incorporating GPT Image 1's superior instruction following, text rendering, and detailed image editing.

OpenAI: o3 Deep Research

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.00001
Completion: $0.00004
Image: $0.00765
Web search: $0.01
Input cache read: $0.0000025

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks.

Note: This model always uses the 'web_search' tool which adds additional cost.

OpenAI: o4 Mini Deep Research

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.000002
Completion: $0.000008
Image: $0.00153
Web search: $0.01
Input cache read: $0.0000005

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks.

Note: This model always uses the 'web_search' tool which adds additional cost.

OpenAI: GPT-5 Pro

Context Length:
400,000 tokens
Architecture:
text+image->text
Max Output:
128,000 tokens

Pricing:

Prompt: $0.000015
Completion: $0.00012

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reductions in hallucination, sycophancy, and better performance in coding, writing, and health-related tasks.

OpenAI: GPT-5 Codex

Context Length:
400,000 tokens
Architecture:
text+image->text
Max Output:
128,000 tokens

Pricing:

Prompt: $0.00000125
Completion: $0.00001
Input cache read: $0.000000125

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the reasoning.effort parameter. Read the docs here

Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

OpenAI: GPT-4o Audio

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.0000025
Completion: $0.00001
Audio: $0.00004

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input audio tokens.

OpenAI: GPT-5 Chat

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.00000125
Completion: $0.00001
Input cache read: $0.000000125

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

OpenAI: GPT-5

Context Length:
400,000 tokens
Architecture:
text+image->text
Max Output:
128,000 tokens

Pricing:

Prompt: $0.00000125
Completion: $0.00001
Web search: $0.01
Input cache read: $0.000000125

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reductions in hallucination, sycophancy, and better performance in coding, writing, and health-related tasks.

OpenAI: GPT-5 Mini

Context Length:
400,000 tokens
Architecture:
text+image->text
Max Output:
128,000 tokens

Pricing:

Prompt: $0.00000025
Completion: $0.000002
Web search: $0.01
Input cache read: $0.000000025

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost. GPT-5 Mini is the successor to OpenAI's o4-mini model.

OpenAI: GPT-5 Nano

Context Length:
400,000 tokens
Architecture:
text+image->text
Max Output:
128,000 tokens

Pricing:

Prompt: $0.00000005
Completion: $0.0000004
Web search: $0.01
Input cache read: $0.000000005

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger counterparts, it retains key instruction-following and safety features. It is the successor to GPT-4.1-nano and offers a lightweight option for cost-sensitive or real-time applications.

OpenAI: gpt-oss-120b

Context Length:
131,072 tokens
Architecture:
text->text
Max Output:
131,072 tokens

Pricing:

Prompt: $0.00000004
Completion: $0.0000004

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.

OpenAI: gpt-oss-20b (free)

Context Length:
131,072 tokens
Architecture:
text->text
Max Output:
131,072 tokens

Pricing:

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.

OpenAI: gpt-oss-20b

Context Length:
131,072 tokens
Architecture:
text->text

Pricing:

Prompt: $0.00000003
Completion: $0.00000014

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.

OpenAI: o3 Pro

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.00002
Completion: $0.00008
Image: $0.0153
Web search: $0.01

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers.

Note that BYOK is required for this model. Set up here: https://openrouter.ai/settings/integrations

OpenAI: Codex Mini

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.0000015
Completion: $0.000006
Input cache read: $0.000000375

codex-mini-latest is a fine-tuned version of o4-mini specifically for use in Codex CLI. For direct use in the API, we recommend starting with gpt-4.1.

OpenAI: o4 Mini High

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.0000011
Completion: $0.0000044
Image: $0.0008415
Web search: $0.01
Input cache read: $0.000000275

OpenAI o4-mini-high is the same model as o4-mini with reasoning_effort set to high.

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains.

Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute.

OpenAI: o3

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.000002
Completion: $0.000008
Image: $0.00153
Web search: $0.01
Input cache read: $0.0000005

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images.

OpenAI: o4 Mini

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.0000011
Completion: $0.0000044
Image: $0.0008415
Web search: $0.01
Input cache read: $0.000000275

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains.

Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute.

OpenAI: GPT-4.1

Context Length:
1,047,576 tokens
Architecture:
text+image->text
Max Output:
32,768 tokens

Pricing:

Prompt: $0.000002
Completion: $0.000008
Web search: $0.01
Input cache read: $0.0000005

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

OpenAI: GPT-4.1 Mini

Context Length:
1,047,576 tokens
Architecture:
text+image->text
Max Output:
32,768 tokens

Pricing:

Prompt: $0.0000004
Completion: $0.0000016
Web search: $0.01
Input cache read: $0.0000001

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

OpenAI: GPT-4.1 Nano

Context Length:
1,047,576 tokens
Architecture:
text+image->text
Max Output:
32,768 tokens

Pricing:

Prompt: $0.0000001
Completion: $0.0000004
Web search: $0.01
Input cache read: $0.000000025

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

OpenAI: o1-pro

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.00015
Completion: $0.0006
Image: $0.21675

The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers.

OpenAI: GPT-4o-mini Search Preview

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.00000015
Completion: $0.0000006
Request: $0.0275
Image: $0.000217

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

OpenAI: GPT-4o Search Preview

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.0000025
Completion: $0.00001
Request: $0.035
Image: $0.003613

GPT-4o Search Previewis a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

OpenAI: o3 Mini High

Context Length:
200,000 tokens
Architecture:
text->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.0000011
Completion: $0.0000044
Input cache read: $0.00000055

OpenAI o3-mini-high is the same model as o3-mini with reasoning_effort set to high.

o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities.

The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

OpenAI: o3 Mini

Context Length:
200,000 tokens
Architecture:
text->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.0000011
Completion: $0.0000044
Input cache read: $0.00000055

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding.

This model supports the reasoning_effort parameter, which can be set to "high", "medium", or "low" to control the thinking time of the model. The default is "medium". OpenRouter also offers the model slug openai/o3-mini-high to default the parameter to "high".

The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities.

The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

OpenAI: o1

Context Length:
200,000 tokens
Architecture:
text+image->text
Max Output:
100,000 tokens

Pricing:

Prompt: $0.000015
Completion: $0.00006
Image: $0.021675
Input cache read: $0.0000075

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought.

The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology. Learn more in the launch announcement.

OpenAI: GPT-4o (2024-11-20)

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.0000025
Completion: $0.00001
Image: $0.003613
Input cache read: $0.00000125

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded files, providing deeper insights & more thorough responses.

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

OpenAI: o1-mini

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
65,536 tokens

Pricing:

Prompt: $0.0000011
Completion: $0.0000044
Input cache read: $0.00000055

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding.

The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology. Learn more in the launch announcement.

Note: This model is currently experimental and not suitable for production use-cases, and may be heavily rate-limited.

OpenAI: o1-mini (2024-09-12)

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
65,536 tokens

Pricing:

Prompt: $0.0000011
Completion: $0.0000044
Input cache read: $0.00000055

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding.

The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology. Learn more in the launch announcement.

Note: This model is currently experimental and not suitable for production use-cases, and may be heavily rate-limited.

OpenAI: ChatGPT-4o

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.000005
Completion: $0.000015
Image: $0.007225

OpenAI ChatGPT 4o is continually updated by OpenAI to point to the current version of GPT-4o used by ChatGPT. It therefore differs slightly from the API version of GPT-4o in that it has additional RLHF. It is intended for research and evaluation.

OpenAI notes that this model is not suited for production use-cases as it may be removed or redirected to another model in the future.

OpenAI: GPT-4o (2024-08-06)

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.0000025
Completion: $0.00001
Image: $0.003613
Input cache read: $0.00000125

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the respone_format. Read more here.

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot"

OpenAI: GPT-4o-mini

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.00000015
Completion: $0.0000006
Image: $0.000217
Input cache read: $0.000000075

GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs.

As their most advanced small model, it is many multiples more affordable than other recent frontier models, and more than 60% cheaper than GPT-3.5 Turbo. It maintains SOTA intelligence, while being significantly more cost-effective.

GPT-4o mini achieves an 82% score on MMLU and presently ranks higher than GPT-4 on chat preferences common leaderboards.

Check out the launch announcement to learn more.

#multimodal

OpenAI: GPT-4o-mini (2024-07-18)

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.00000015
Completion: $0.0000006
Image: $0.007225
Input cache read: $0.000000075

GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs.

As their most advanced small model, it is many multiples more affordable than other recent frontier models, and more than 60% cheaper than GPT-3.5 Turbo. It maintains SOTA intelligence, while being significantly more cost-effective.

GPT-4o mini achieves an 82% score on MMLU and presently ranks higher than GPT-4 on chat preferences common leaderboards.

Check out the launch announcement to learn more.

#multimodal

OpenAI: GPT-4o

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
16,384 tokens

Pricing:

Prompt: $0.0000025
Completion: $0.00001
Image: $0.003613
Input cache read: $0.00000125

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot"

#multimodal

OpenAI: GPT-4o (extended)

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
64,000 tokens

Pricing:

Prompt: $0.000006
Completion: $0.000018
Image: $0.007225

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot"

#multimodal

OpenAI: GPT-4o (2024-05-13)

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.000005
Completion: $0.000015
Image: $0.007225

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot"

#multimodal

OpenAI: GPT-4 Turbo

Context Length:
128,000 tokens
Architecture:
text+image->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.00001
Completion: $0.00003
Image: $0.01445

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.

Training data: up to December 2023.

OpenAI: GPT-3.5 Turbo (older v0613)

Context Length:
4,095 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.000001
Completion: $0.000002

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.

Training data up to Sep 2021.

OpenAI: GPT-4 Turbo Preview

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.00001
Completion: $0.00003

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023.

Note: heavily rate limited by OpenAI while in preview.

OpenAI: GPT-4 Turbo (older v1106)

Context Length:
128,000 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.00001
Completion: $0.00003

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.

Training data: up to April 2023.

OpenAI: GPT-3.5 Turbo Instruct

Context Length:
4,095 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.0000015
Completion: $0.000002

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.

OpenAI: GPT-3.5 Turbo 16k

Context Length:
16,385 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.000003
Completion: $0.000004

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up to Sep 2021.

OpenAI: GPT-3.5 Turbo

Context Length:
16,385 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.0000005
Completion: $0.0000015

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.

Training data up to Sep 2021.

OpenAI: GPT-4

Context Length:
8,191 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.00003
Completion: $0.00006

OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning capabilities. Training data: up to Sep 2021.

OpenAI: GPT-4 (older v0314)

Context Length:
8,191 tokens
Architecture:
text->text
Max Output:
4,096 tokens

Pricing:

Prompt: $0.00003
Completion: $0.00006

GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data: up to Sep 2021.

Ready to build with OpenAI?

Start using these powerful models in your applications with our flexible pricing plans.