What are the basic pricing models for GPT-5.4?

OpenAI’s GPT-5.4, released on March 5, 2026, is priced primarily per token, with rates that vary by model variant and context window usage. Users are charged for the input tokens sent to the model and the output tokens it generates. The standard GPT-5.4 model and the premium GPT-5.4 Pro variant have separate price tiers, reflecting their differing capabilities and computational requirements. OpenAI also offers discounted “cached input” pricing for repeated context and applies a surcharge to very long context windows.

For the standard GPT-5.4 model, API pricing is $2.50 per 1 million input tokens and $15.00 per 1 million output tokens for sessions under 272K context tokens. Cached input for the standard model is priced at $1.25 per 1 million tokens, a 50% discount for repeated context. The GPT-5.4 Pro API, designed for higher-stakes enterprise tasks and deep-horizon reasoning, costs substantially more: $30.00 per 1 million input tokens and $180.00 per 1 million output tokens, which is 12 times the standard rate on both sides. This price difference reflects the specialized hardware and increased computational resources required for the Pro variant’s enhanced capabilities. A critical aspect of the pricing structure is the long-context surcharge: once a session exceeds 272K tokens, input costs double and output costs rise to 1.5x for the full session, which makes analyzing very large datasets more expensive.
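To make the arithmetic concrete, here is a minimal Python sketch of a cost estimator built from the rates quoted above. The names `Rates`, `STANDARD`, `PRO`, and `estimate_cost` are hypothetical and not part of any OpenAI SDK. The sketch also makes assumptions the text does not confirm: that the 272K threshold is measured on input tokens, that the surcharge multipliers apply equally to cached input, and that they apply to the Pro variant as well as the standard model.

```python
from dataclasses import dataclass
from typing import Optional

MILLION = 1_000_000
LONG_CONTEXT_THRESHOLD = 272_000  # tokens, as quoted in the text


@dataclass(frozen=True)
class Rates:
    """USD per 1 million tokens (hypothetical container, not an SDK type)."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: Optional[float] = None  # None: no cached rate quoted


# Rates quoted in the text above.
STANDARD = Rates(input_per_m=2.50, output_per_m=15.00, cached_input_per_m=1.25)
PRO = Rates(input_per_m=30.00, output_per_m=180.00)


def estimate_cost(rates: Rates, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one session.

    Assumptions (not confirmed by the text): the threshold is checked
    against input tokens, and the long-context multipliers also apply
    to cached input and to the Pro variant.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    if cached_tokens and rates.cached_input_per_m is None:
        raise ValueError("no cached-input rate is known for this variant")

    long_context = input_tokens > LONG_CONTEXT_THRESHOLD
    input_mult = 2.0 if long_context else 1.0   # input costs double
    output_mult = 1.5 if long_context else 1.0  # output costs rise to 1.5x

    fresh_tokens = input_tokens - cached_tokens
    cost = fresh_tokens * rates.input_per_m * input_mult / MILLION
    if cached_tokens:
        cost += cached_tokens * rates.cached_input_per_m * input_mult / MILLION
    cost += output_tokens * rates.output_per_m * output_mult / MILLION
    return cost
```

Under these assumptions, `estimate_cost(STANDARD, 100_000, 10_000)` comes to about $0.40, while a 300,000-token input with the same output comes to about $1.73 because both surcharge multipliers apply. The same 100K/10K session on `PRO` comes to about $4.80.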

Beyond the raw token costs, access to GPT-5.4 is also differentiated across OpenAI’s offerings. In ChatGPT, users with paid subscriptions can access “GPT-5.4 Thinking,” which provides an upfront plan of its reasoning and is improved for deeper web research and longer reasoning tasks. “GPT-5.4 Pro” is available for higher-tier plans like Pro, Business, Enterprise, and Edu within ChatGPT. For developers, the API provides direct access to gpt-5.4 and gpt-5.4-pro with specified technical parameters and explicit pricing for each. This tiered access ensures that both individual users seeking advanced conversational AI and developers building complex applications can leverage GPT-5.4, often integrating it with other AI tools or vector databases like Milvus for efficient similarity search and retrieval-augmented generation.
