Examples
Here’s what it looks like in practice — natural language questions, instant pricing answers:

Below are detailed examples of the three most common workflows.
Cost comparison
Section titled “Cost comparison”Prompt:
“I’m choosing between Claude Sonnet and GPT-5.2 for a summarization pipeline. Compare their pricing and context windows.”
Tool called: compare_models
Output:
Top 5 most cost-effective models matching your criteria:
1. gpt-5.2 Provider: openai | Mode: chat Input: $2.00/1M | Output: $8.00/1M Context: 1,047K in / 64K out Capabilities: function_calling, vision, parallel_function_calling
2. claude-sonnet-4-5 Provider: anthropic | Mode: chat Input: $3.00/1M | Output: $15.00/1M Context: 200K in / 8K out Capabilities: function_calling, vision, parallel_function_callingTakeaway: GPT-5.2 is cheaper per token and has a much larger context window. Claude Sonnet 4.5 costs ~50% more on input and nearly 2x on output, but may be preferred for quality on specific tasks. For a high-volume summarization pipeline, the pricing difference adds up.
Discovery
Section titled “Discovery”Prompt:
“What’s the biggest context window I can get for under $1/1M input tokens?”
Tool called: compare_models
Output:
Top 5 most cost-effective models matching your criteria:
1. gemini/gemini-2.5-flash Provider: gemini | Mode: chat Input: $0.15/1M | Output: $0.60/1M Context: 1,048K in / 65K out Capabilities: function_calling, vision, parallel_function_calling
2. gemini/gemini-2.0-flash Provider: gemini | Mode: chat Input: $0.10/1M | Output: $0.40/1M Context: 1,048K in / 8K out Capabilities: function_calling, vision, parallel_function_calling
3. deepseek/deepseek-chat Provider: deepseek | Mode: chat Input: $0.14/1M | Output: $0.28/1M Context: 64K in / 8K out Capabilities: function_calling, parallel_function_calling
4. gemini/gemini-2.0-flash-lite Provider: gemini | Mode: chat Input: $0.04/1M | Output: $0.15/1M Context: 1,048K in / 8K out Capabilities: function_calling, vision, parallel_function_calling
5. mistral/mistral-small-latest Provider: mistral | Mode: chat Input: $0.10/1M | Output: $0.30/1M Context: 32K in / 32K out Capabilities: function_calling, parallel_function_callingTakeaway: Gemini models dominate the budget-friendly, large-context space — offering 1M+ token windows at fractions of a cent. If you need a massive context window on a tight budget, Gemini 2.5 Flash gives you the best balance of capability and cost.
Batch planning
Section titled “Batch planning”Prompt:
“How much would it cost to run Claude Opus vs Sonnet for a batch of 10K requests, averaging 2K input and 500 output tokens each?”
Tools called: calculate_estimate (twice — once per model)
Claude Opus 4.6:
Model: claude-opus-4-6Provider: anthropic
Token counts: Input: 20,000,000 tokens Output: 5,000,000 tokens
Estimated cost: Input: $300.00 Output: $150.00 Total: $450.00
Rate per 1M tokens: Input: $15.00 Output: $30.00Claude Sonnet 4.6:
Model: claude-sonnet-4-6Provider: anthropic
Token counts: Input: 20,000,000 tokens Output: 5,000,000 tokens
Estimated cost: Input: $60.00 Output: $75.00 Total: $135.00
Rate per 1M tokens: Input: $3.00 Output: $15.00Takeaway: Sonnet is 3.3x cheaper for this workload ($135 vs $450). Unless the task specifically requires Opus-level reasoning, Sonnet saves $315 per 10K-request batch — making it the clear choice for high-volume pipelines.