Our frontier models are designed to spend more time thinking before producing a response, making them ideal for complex, multi-step problems.
Our most capable model for professional work
Input:
$2.50 / 1M tokens
Cached input:
$0.25 / 1M tokens
Output:
$15.00 / 1M tokens
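As a rough illustration of how the per-1M-token rates above turn into a per-request estimate, here is a minimal sketch. The `Rates` class and `estimate_cost` function are hypothetical names, not part of any SDK. The sketch assumes cached input tokens are counted separately from uncached input tokens, so each token is billed at exactly one rate.

```python
from dataclasses import dataclass

TOKENS_PER_PRICING_UNIT = 1_000_000  # rates above are quoted per 1M tokens


@dataclass(frozen=True)
class Rates:
    """Hypothetical container for one model's listed prices (USD per 1M tokens)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


# Rates copied from the listing above.
LISTED_RATES = Rates(input_per_m=2.50, cached_input_per_m=0.25, output_per_m=15.00)


def estimate_cost(rates: Rates, uncached_input_tokens: int,
                  cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a request from its token counts.

    Assumption: cached and uncached input tokens are disjoint counts.
    """
    total = (uncached_input_tokens * rates.input_per_m
             + cached_input_tokens * rates.cached_input_per_m
             + output_tokens * rates.output_per_m)
    return total / TOKENS_PER_PRICING_UNIT


if __name__ == "__main__":
    # 10,000 uncached input, 40,000 cached input, 2,000 output tokens.
    cost = estimate_cost(LISTED_RATES, 10_000, 40_000, 2_000)
    print(f"Estimated cost: ${cost:.4f}")
```

In this example the 2,000 output tokens cost more than the 10,000 uncached input tokens, since the output rate is six times the input rate.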
A faster, cheaper version of GPT-5 for well-defined tasks
Input:
$0.250 / 1M tokens
Cached input:
$0.025 / 1M tokens
Output:
$2.000 / 1M tokens
Customize our models to get even higher performance for your specific use cases.
Input:
$3.00 / 1M tokens
Cached input:
$0.75 / 1M tokens
Output:
$12.00 / 1M tokens
Training:
$25.00 / 1M tokens
$0.80 / 1M tokens
$0.20 / 1M tokens
$3.20 / 1M tokens
$5.00 / 1M tokens
$0.05 / 1M tokens
$1.50 / 1M tokens
$4.00 / 1M tokens
$1.00 / 1M tokens
$16.00 / 1M tokens
$100.00 / training hour
Build low-latency, multimodal experiences including speech-to-speech.
Text
gpt-realtime-1.5
$4.00 / 1M input tokens
$0.40 / 1M cached input tokens
$16.00 / 1M output tokens
gpt-realtime
gpt-realtime-mini
$0.60 / 1M input tokens
$0.06 / 1M cached input tokens
$2.40 / 1M output tokens
Richly detailed, dynamic video generation and remixing with our latest generative model.
Precise, high-fidelity image generation and editing with our latest multimodal model.
GPT-image-1.5
$5.00 / 1M input tokens
$1.25 / 1M cached input tokens
GPT-image-1
GPT-image-1-mini
$2.00 / 1M input tokens
$0.20 / 1M cached input tokens
Prompts are billed similarly to other GPT models. Image outputs cost approximately $0.01 (low), $0.04 (medium), and $0.17 (high) for square images.
For detailed token usage by image quality and size, see the docs.
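The approximate per-image figures above can be combined into a quick batch estimate. The sketch below is illustrative only: `estimate_image_output_cost` is a hypothetical helper, it covers square images only, and it leaves out the prompt tokens, which are billed separately.

```python
# Approximate USD cost per square output image, by quality, from the text above.
APPROX_PRICE_PER_SQUARE_IMAGE = {"low": 0.01, "medium": 0.04, "high": 0.17}


def estimate_image_output_cost(counts_by_quality: dict[str, int]) -> float:
    """Approximate output cost for a batch of square images.

    counts_by_quality maps a quality level ("low", "medium", "high")
    to the number of images generated at that level.
    """
    unknown = set(counts_by_quality) - set(APPROX_PRICE_PER_SQUARE_IMAGE)
    if unknown:
        raise ValueError(f"Unknown quality level(s): {sorted(unknown)}")
    return sum(APPROX_PRICE_PER_SQUARE_IMAGE[quality] * count
               for quality, count in counts_by_quality.items())


if __name__ == "__main__":
    batch = {"low": 100, "high": 10}
    print(f"Approximate output cost: ${estimate_image_output_cost(batch):.2f}")
```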
Our newest API combining the simplicity of Chat Completions with the built-in tool use of Assistants.
Responses API is not priced separately. Tokens are billed at the chosen language model’s input and output rates.
Build text-based conversational experiences.
Chat Completions API is not priced separately. Tokens are billed at the chosen language model's input and output rates.
Build assistant-like experiences with our tools.
Assistants API is not priced separately. Tokens are billed at the chosen language model's input and output rates.
Extend model capabilities with built-in tools in the API Platform.
$0.10 / GB of vector storage per day (first GB free)
$2.50 / 1k tool calls
There are two components that contribute to the cost of using the web search tool: (1) Tool calls and (2) Search content tokens.
Note 1: For gpt-4o-mini and gpt-4.1-mini with the web search non-preview tool, search content tokens are charged as a fixed block of 8,000 input tokens per call.
The billing dashboard will report gpt-4.1 and gpt-4.1-mini search line items as ‘web search tool calls | gpt-4o’ and ‘web search tool calls | gpt-4o-mini’, respectively.
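To make the two web search cost components concrete, here is a minimal sketch for the fixed-block case described above (8,000 input tokens per call). `web_search_cost` is a hypothetical helper. The per-call price and the model's input rate are passed in as parameters, and the values in the usage example are placeholders, not actual prices.

```python
FIXED_SEARCH_CONTENT_TOKENS = 8_000  # per call, for the fixed-block case noted above


def web_search_cost(calls: int, price_per_1k_calls: float,
                    input_price_per_m_tokens: float) -> float:
    """Estimate web search cost in USD as tool calls plus search content tokens.

    price_per_1k_calls and input_price_per_m_tokens must be supplied by the
    caller from the current price list; no defaults are assumed here.
    """
    tool_call_cost = calls * price_per_1k_calls / 1_000
    content_tokens = calls * FIXED_SEARCH_CONTENT_TOKENS
    content_cost = content_tokens * input_price_per_m_tokens / 1_000_000
    return tool_call_cost + content_cost


if __name__ == "__main__":
    # Placeholder prices for illustration only.
    total = web_search_cost(calls=500, price_per_1k_calls=10.00,
                            input_price_per_m_tokens=0.50)
    print(f"Estimated web search cost: ${total:.2f}")
```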
GB refers to binary gigabytes of storage (also known as gibibytes), where 1 GB is 2^30 bytes.
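Because storage is measured in binary gigabytes, a byte count has to be divided by 2^30 before the daily rate applies. The sketch below uses the vector storage rate listed above ($0.10 per GB per day, first GB free). `storage_cost` is a hypothetical helper, and it assumes the free gigabyte is subtracted from the stored amount on each billed day, which the listing does not spell out.

```python
BYTES_PER_GB = 2 ** 30        # binary gigabyte (gibibyte), as defined above
PRICE_PER_GB_DAY = 0.10       # USD, vector storage rate listed above
FREE_GB = 1.0                 # first GB is free


def storage_cost(bytes_stored: int, days: int) -> float:
    """Estimate vector storage cost in USD for a constant amount of stored data.

    Assumption: the free GB is deducted from the stored amount every day.
    """
    gb_stored = bytes_stored / BYTES_PER_GB
    billable_gb = max(0.0, gb_stored - FREE_GB)
    return billable_gb * PRICE_PER_GB_DAY * days


if __name__ == "__main__":
    # 5 GB held for 30 days: 4 billable GB at $0.10 per GB per day.
    print(f"Estimated storage cost: ${storage_cost(5 * BYTES_PER_GB, 30):.2f}")
```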
Build, deploy, and optimize production-grade agents with Agent Builder, ChatKit, and Evals.
Billing begins on November 1, 2025; no charges will apply before then.
Free tier (per account, per month)
1 GB
Price beyond free tier
$0.10 / GB-day
Agent Builder: design and iterate with zero cost until you hit Run.
Self-hosted ChatKit: host a custom ChatKit backend and pay only normal model-token charges.
Enterprise controls: SSO, RBAC, and audit logs are included at no additional fee.
We recommend that developers use our large and mini GPT models for everyday tasks. Our large GPT models generally perform better on a wide range of tasks, while our mini GPT models are fast and inexpensive for simpler tasks.
Our large and mini reasoning models are ideal for complex, multi-step tasks and STEM use cases that require deep thinking about tough problems. Choose the mini reasoning model if you're looking for a faster, less expensive option.
We recommend experimenting with all of these models in the Playground to find which ones provide the best price-performance trade-off for your usage.