October 25, 2025
My AI development journey continued past the 60M-token days. Learn why VS Code extensions were costly and how an agent-based platform finally made my workflow predictable.

In Part 1, I told you how I found my champion model, Claude 4 Sonnet, which finally stopped my codebase from imploding every time I attempted a refactor. I had my star player.
But a star player is useless without a good team and a smart playbook. My process was still a brute-force, high-cost mess. My monthly bill for Cursor was creeping towards the $200 mark, and I knew this wasn't sustainable. I was paying a premium for a firehose of tokens when what I really needed was a surgeon's scalpel.
This sent me down a new rabbit hole: the tools around the model.
My first stop was my trusty VS Code. I started experimenting with a slew of extensions that promised a Cursor-like experience: Continue, Roo Code, Cline, you name it. They were all interesting, but they shared a fatal flaw: what felt like very primitive caching.
Every interaction seemed to start from scratch, forcing the model to re-read and re-understand vast amounts of context. The result? I was spending even more money on API calls than before, even when testing cheaper, high-context models like Kimi K2. It was like trying to save money on gas by buying a cheaper car that got 5 miles per gallon.
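To make that cost problem concrete, here's what proper prompt caching looks like: the large, unchanging part of the context (project overview, system instructions) gets marked as cacheable, so follow-up requests don't pay full input-token price for it again. This is a minimal sketch using the Anthropic Python SDK's cache_control feature; the model ID and context string are placeholders, and it's my assumption that the extensions I tried weren't doing something like this effectively, which is why every turn felt like re-buying the whole context.

```python
# Minimal sketch: prompt caching with the Anthropic Python SDK.
# The big, stable context block is marked as cacheable so repeat
# requests reuse it instead of paying full input-token price again.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LARGE_PROJECT_CONTEXT = "..."  # placeholder: architecture notes, key files, etc.

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # example model ID; yours may differ
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are my coding assistant.\n\n" + LARGE_PROJECT_CONTEXT,
            # Subsequent calls with the same prefix hit the cache at a reduced rate.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Refactor the payment module."}],
)
print(response.content[0].text)
```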
I was missing a piece of the puzzle.
The real turning point came when I stopped just using the tools and started researching how they're measured. This led me to SWE-bench, a benchmark that tests the ability of AI models to resolve real-world GitHub issues. Suddenly, a new world opened up. I discovered a category of tools designed not just to pipe prompts to an AI, but to create a structured framework that gets more out of each interaction.
I was on the hunt. I tested TRAE, an agentic framework that I absolutely loved for its logical approach. But I hit a frustrating wall: it wasn't available for purchase in my country, and relying on a VPN for my core development tool felt like building a house on shaky ground.
I moved on, testing others like OpenHands and Moatless Tools. They were powerful, but something was still missing.
Then I found zencoder.ai. And the pricing model alone made me stop and stare. It didn't count tokens. It counted requests.
My first thought was, "There has to be a catch." But I signed up for their initial $50/month plan, which gave me 500 requests per day. For a 6-8 hour workday, this was mostly enough. Some days I'd hit the limit, but the predictability was a breath of fresh air.
The real magic, however, was why this pricing model worked. Zencoder isn't just a chat window. It's a platform built around a team of specialized AI Agents.
Think of it like this: instead of having one brilliant but overworked programmer doing everything, you have a team. There's an agent that just writes new code. Another that only runs and analyzes unit tests. A "reviewer" agent that checks for errors. A "Q&A" agent that can explain a piece of code. And the best part? You can create your own custom agents.
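If you want to picture what "a team of specialists" means in code, here's a toy sketch of the concept. To be clear, this is not Zencoder's implementation and the names are mine; it just shows the core idea that each agent gets a narrow mandate and only the tools it needs, instead of one generalist doing everything.

```python
# Toy illustration of the "team of specialists" idea -- NOT Zencoder's actual
# implementation. Each agent gets a narrow system prompt and a restricted
# tool set, and a simple router hands tasks to the right specialist.
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    system_prompt: str
    tools: list[str] = field(default_factory=list)


coder = Agent(
    name="coder",
    system_prompt="Write new code that satisfies the ticket. Do not review or test.",
    tools=["read_file", "write_file"],
)
tester = Agent(
    name="unit-tester",
    system_prompt="Run the unit tests and summarize failures. Never edit source files.",
    tools=["run_tests", "read_file"],
)
reviewer = Agent(
    name="reviewer",
    system_prompt="Review the diff for bugs, style issues, and missed edge cases.",
    tools=["read_file", "read_diff"],
)


def route(task_type: str) -> Agent:
    """Hand a task to the matching specialist instead of one overworked generalist."""
    return {"implement": coder, "test": tester, "review": reviewer}[task_type]
```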
Plus, it had the one feature I knew was critical for advanced work: MCP server support. It came with a huge library of them pre-configured and the ability to add my own.
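For readers who haven't met MCP yet: an MCP server exposes tools (file access, databases, browsers, and so on) to the model over a standard protocol, and a client connects to it and asks what it can do. The sketch below uses the official `mcp` Python SDK to connect to a server over stdio and list its tools; the filesystem server shown is the public example package, not one of Zencoder's bundled servers, and the project path is a placeholder.

```python
# Minimal sketch: connect to an MCP server over stdio with the official
# "mcp" Python SDK and list the tools it exposes. The filesystem server
# here is the public example package; the path is a placeholder.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-filesystem", "/path/to/project"],
)


async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)


asyncio.run(main())
```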
After a recent platform update, I upgraded to their $120/month plan. It's more expensive, yes, but it's still significantly cheaper than the $200 I was spending on Cursor for a far less efficient workflow.
Today, I get 1900 requests per day and access to the latest models, including Claude 4.5 Sonnet, Claude 4.5 Haiku, and even GPT-5 Codex. It's more than enough for my current needs.
I've finally found my stack. I have the model, and now I have the smart framework to manage it. The journey is far from over, though.
In my next posts, I’ll dive into the nitty-gritty of how I'm optimizing my work with these agents and share the specific prompts and workflows I'm using to build real projects without writing a single line of traditional code myself.