Posts
Notes on AI token economics.
-
AI Document Processing: Why $30K Model Costs Hide $200K Infrastructure Problems
Why companies obsess over AI model costs while ignoring 95% of their document processing expenses, and how to build resilient extraction pipelines that prioritize reliability over price.
-
Why Inference Costs More Than Training (And Always Will)
Training gets all the glory in AI, but inference does all the work: a look at the economics and physics behind AI deployment.
-
AI Retry Logic: How Bad Error Handling Turns $10 Failures into $1000 Bills
How naive retry logic with GPT-5 and Claude APIs can multiply costs by 5x, plus exponential backoff strategies that prevent $12K in monthly cost overruns.
-
Why Max Tokens Defaults Are Draining Your Budget
How unconfigured max tokens settings silently multiply your AI API costs by 2-5x, and strategic frameworks for optimizing token usage without sacrificing quality.
-
System Prompt Optimization: Stop Paying 40x More for Redundant AI Instructions
Why a static system prompt sent 50,000 times daily wastes $47K monthly, and how dynamic context injection cuts Claude API costs by 87.5%.
-
Conversation History Costs: Why Context Windows Drain AI Budgets 10x Faster
How sending the full conversation history with every API call can multiply your AI costs by 10x, plus a sliding window solution that cuts costs by 87.5%.