Posts

Notes on AI token economics.

Why Premium Models Waste Money on Extraction Tasks

17 Aug, 2025

How choosing the GPT-5 flagship over specialized extraction solutions wastes $387K annually, plus smart model routing strategies that cut AI costs by 60-80% without sacrificing quality.

Why $30K Model Costs Hide $200K Infrastructure Problems

15 Aug, 2025

Why companies obsess over AI model costs while ignoring 95% of their document processing expenses. Build resilient extraction pipelines that prioritize reliability over price.

Why Inference Costs More Than Training (And Always Will)

13 Aug, 2025

Training gets all the glory in AI, but inference does all the work. Understanding the economics and physics behind AI deployment.

How Bad Error Handling Turns $10 Failures into $1000 Bills

12 Aug, 2025

How naive retry logic with GPT-5 and Claude APIs can multiply costs by 5x, plus exponential backoff strategies that prevent $12K monthly overcharges.

Why Max Tokens Defaults Are Draining Your Budget

10 Aug, 2025

Discover how unconfigured max tokens settings are silently multiplying your AI API costs by 2-5x and learn strategic frameworks to optimize token usage without sacrificing quality.

Stop Paying 40x More for Redundant AI Instructions

20 Jun, 2025

Why static system prompts running 50,000 times daily waste $47K monthly, and how dynamic context injection cuts Claude API costs by 87.5%.

Why Context Windows Drain AI Budgets 10x Faster

15 May, 2025

How sending full conversation history with every API call can multiply your AI costs by 10x, plus a sliding window solution that cuts costs by 87.5%.
