Model Matrix
Our LLM price comparison chart provides a clear view of today's leading large language models in one place. Each model listed includes its context window, input and output pricing, as well as a short description of its intended strengths. This LLM pricing reference helps developers and teams evaluate trade-offs across providers, balancing speed, accuracy, and context length against operational cost.
Input and output tokens are priced separately for a reason: they represent fundamentally different costs to providers and to you. The costs that sneak up on you usually come from input tokens, especially when you're doing extensive document processing, handling multiple file uploads, or building RAG systems that ingest large knowledge bases. Every PDF, contract, or customer record you upload consumes input tokens, and those costs compound quickly across your API usage.
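As a rough illustration, here is a minimal sketch in Python of how that arithmetic plays out for a RAG-style call. The per-million-token prices are placeholders, not any provider's actual rates:

```python
# Hypothetical per-million-token prices -- placeholders, not real provider rates.
INPUT_PRICE_PER_M = 3.00    # $ per 1M input tokens
OUTPUT_PRICE_PER_M = 15.00  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single API call, pricing input and output tokens separately."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A RAG-style call: a large ingested context, a short generated answer.
cost = request_cost(input_tokens=50_000, output_tokens=500)
print(f"${cost:.4f} per call")  # input side dominates: $0.1500 vs $0.0075
```

Even though the output rate is five times the input rate here, the input side of this call costs twenty times more, simply because so many more input tokens flow through it.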
Smart cost optimization starts with understanding your token costs holistically: input tokens for document processing and context window management, plus output tokens for generation. The most expensive LLM isn't necessarily the one with the highest per-output-token rate; it's the one that misaligns with your actual usage patterns. We help businesses analyze their specific workflows to identify where token costs accumulate and architect solutions that minimize unnecessary API spend while getting full value from both input and output tokens.
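To make that concrete, the sketch below compares two invented models (illustrative prices, not entries from the chart above) on the same input-heavy workload. The model with the higher per-output-token rate still comes out cheaper:

```python
# Two invented models -- illustrative prices only, not real listings.
MODELS = {
    "model_a": {"input_per_m": 2.50, "output_per_m": 40.00},   # highest output rate
    "model_b": {"input_per_m": 10.00, "output_per_m": 20.00},  # lower output rate
}

def daily_cost(model: dict, input_tokens: int, output_tokens: int) -> float:
    """Daily spend for a given model and daily token volumes."""
    return (input_tokens / 1_000_000) * model["input_per_m"] \
         + (output_tokens / 1_000_000) * model["output_per_m"]

# An input-heavy workload: 1M tokens of documents in, 50K tokens of summaries out.
for name, model in MODELS.items():
    print(name, f"${daily_cost(model, 1_000_000, 50_000):.2f}/day")
# model_a $4.50/day   <- the pricier-per-output-token model wins here
# model_b $11.00/day
```

The takeaway: rank models against your own input/output mix, not against a single headline rate.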