Adventures with Vibes
A Year of AI-Assisted Game Development
The Beginning: From TI-83 to Visual Studio
It all started with reverse engineering games on the school's TI-83 calculator (not mine, we couldn't afford one.) I was so enamored with writing code back then, that somewhere around the turn of the century, my parents got me a C++ book and Microsoft's Visual Studio to compile. I dabbled in code from then on, but I didn't end up doing it for a career. Flash forward 20 years, and I'm teaching myself Python for work.
This is right before the ChatGPT boom. Generative AI exists, but the public is broadly unaware. I follow a blog that shows how silly the earliest modern AIs were. For example, char-rnn was (is?) a recurrent neural network that generates text one character at a time. Some of the most hilarious stuff I've ever read. So I was hip to ChatGPT much before it went viral. I guess that's a brag, but it also provides context.
I'm also an avid gamer, and always have game ideas in my head. The only language I've felt remotely comfortable in has been Python. Not ideal for game development, but you have to work with what you've got. I started making sims and experimenting with game design in Python starting in 2020.
The Evolution: From Copy-Paste to Context Windows
My vibe coding roots started probably the same way it does for a lot of us. It started with dumping a function into ChatGPT and asking it why there was an error. It was an incremental adoption. I remember copy-pasting larger and larger blocks of code, the context window was so tiny in those days!
Looking at my github repositories, my code was fully human written in 2023. In early 2024, I was trying to implement simple machine learning in one of my projects, and it occurred to me to leverage the AI to do ML stuff that I had no idea how to do. Getting that stuff actually running, and actually training my own model was really impactful stuff. It showed me the AI could actually be useful. But this was still small conversations with specific blocks of code being copied and pasted back and forth from the web browser to the IDE.
The next iteration was uploading actual code modules to a project (Anthropic's Claude was much easier to use for this.) I would upload individual files and try to get the AI to help me fix bugs, or even generate new code. I remember constantly struggling to keep the AI and my project in sync. Downloading and uploading and transferring files hundreds of times. This is when I started trying to get the AI to start a project for me. I became a go-between. It was a very frustrating experience and not really successful at all.
I started looking at ways to improve the process. It dawned on me to try to get the AI itself to suggest ways we could collaborate better. This resulted in some successful processes (like maintaining a status document that meant I didn't have to explain the entire project at the beginning of every conversation) but ultimately was still full of friction and not terribly effective. In searching for a solution, I stumbled upon agentic IDEs. The first one I tried was called Windsurf. By then I was paying for either ChatGPT or Claude and in April of 2025, I started paying Windsurf $15 a month too. (Side note, I absolutely cannot believe it was that recently. It feels like years ago. The progress in this field is undeniable.)
The Agentic IDE Revolution: Windsurf and Cursor
In retrospect, at that time Windsurf was pretty awful. But it was also magical. I was no longer the go-between. Now I was directing the AI to implement my will. Windsurf is a vscode fork with agentic AI built in. There was a period of time where that was a thing to do and made people millions of dollars. I mean, if you've forgotten what was happening six months ago I guess this will be useful information. I guess I should mention that github had been doing something close to this for a while. I never touched Github Copilot, but it is/was geared more towards real developers that wanted some shortcuts. Windsurf was/is similar. It felt oriented towards real programmers, but it had a tiny side panel called Cascade that you can actually chat with an LLM in.
The agent-in-the-IDE model (for me at that time) was a pretty terrible experience. I guess it's because we started allowing the agent to autonomously code, and it was not ready for prime-time (foreshadowing alert.) It was a bit of a disaster, but at the time I was addicted. I was using one of the big LLMs for planning and architecting, and then directing the agent to implement. Context windows were still tiny, and the Windsurf agent would constantly forget what it was doing, go off the rails and introduce horrible code, or even sometimes crash out into babbling nonsense. Here's an example from when I was using Cursor: https://www.reddit.com/r/cursor/comments/1l4dq2w/gemini_not_having_a_good_day/
Speaking of Cursor, I gave it a shot as well. I must have had subs to both tools. Those agentic coders were, through much cajoling, able to give me a prototype of a game. I was also still using Claude for it's bigger context and for planning. I even had a flow where all prompts were generated by an LLM and carefully fed to the agentic coder. That was marginally successful. I tried to develop a vscode plugin that would automate a lot of the file moving and prompt copy-pasting I was doing, but that wasn't as interesting to me as the game. It wouldn't be much longer before Anthropic launched their own coding agent, and it was quite a step up. Claude had a relatively gigantic context window, and the agent lived in the terminal itself. I managed to convince my boss to purchase the necessary $100 monthly subscription (it wasn't available in the cheap tier yet) and I was off to the races.
I've spent easily hundreds of hours vibe coding, possibly pushing towards 500 hours. You might be upset with me for this, again, it's a polarizing topic. Could I have taught myself Lua and the LOVE game engine in that time? Probably. Would that be more valuable? Again, probably yes. Some people don't understand the allure though. The power. The convenience. There is a huge difference between spending time having natural language conversations and spending that time studying dry documentation and scratching your head. Further, at this point, I don't even know how to teach myself to code any more. I would have to set strict rules for myself, LLMs are incredibly good at solving these types of problems. You have to try to avoid LLMs these days. They understand the theory and principles of coding very well.
Context is King
If you're still with me, we now come to the topic of this post. I have gotten very good at providing context, and so that's what we've been doing. You need to understand that I dabbled in coding for decades, and that I was an AI enthusiast before it was cool.
This week, while working on that same project I started with Claude almost one year ago, I was yelling at Claude Code about my frustrations. (Side note, if you've ever worked with agentic coding AI on a significant project and you haven't yelled at it, please let me know. It is in my opinion a useless activity, but we are human after all. I've definitely erased many more frustrated prompts than I've sent, but it does happen.) [Second side note, this project started in pure python as a text based game, and now we're in LOVE as I mentioned. I have refactored it many times using several different agents and processes over the past year. I've made progress, lost that progress, given up, been rejuvenated, given up again, and now I actually feel really close to an alpha build.]
Here's the tip of the core of this post (giggle): LLMs were created to be helpful. There's a bit of a stir in the air right now about how that intrinsic training is actually hampering progress. A very easy example that these agentic coders will do, is they'll create stubs and fallbacks so your code runs without crashing. It also never wants to lose anything, so as you re-write or re-factor modules, it will leave old archaic code behind, which will confuse it in the future. You'll be troubleshooting an issue only to find out the LLM is making changes to the dead code.
So this week I'm troubleshooting a feature. I use Claude Code exclusively now, and it shows clear signs of improvement on almost a weekly basis. I've been dragging this crazy codebase along with the improvements, constantly trying to make it better using a more and more capable agent. This week we're troubleshooting a feature and lo and behold it turns out there's an error being hidden by fallback values. Claude wants my code to run, so it is very defensive with "safe" fallbacks. I figured this trend out very early on (and I'm guessing the experienced of you may agree) and had strict instructions in so many prompts and spent so many hours typing the same things over and over. It was front and center in CLAUDE.md: "No fallbacks!" But Claude wouldn't listen. So we're troubleshooting for hours and it turns out defensive programming is hiding real errors. I had had enough.
I launched into a tirade about fallbacks. Claude did a grep and found at least 500 examples of code that violated our principles. Now, you real developers out there may know that fallbacks are useful or essential or whatever they are. But it was only creating problems for me. My code is intentional, it should do exactly what we expect or crash hard. While that was mine and Claude's agreement from the get-go, my stomach dropped when I saw that it was spread all throughout the codebase. Here was another complete refactor of the entire codbase again.
The Game-Changer: Git Pre-Commit Hooks
So we started eliminating them. And gosh dang if old features that I thought were dead didn't start showing up. The game ran better, and little things I thought we'd have to reimplement started showing up! (Spoiler alert: the original feature I was troubleshooting is still broken.) At some point in my ranting, Claude Code mentioned a git pre-commit hook. As an amateur, I had never heard of this. (Have you? I won't call out my friend who codes professionally and had not. Wait, oops?)
It goes like this. When you try to commit code, the pre-commit hook runs on your commit and stops you if it finds bad patterns. That's how I'm using it. I'm not sure what all you can do with it, but as a commit filter it works wonders. Claude also had a horrible habit of putting emoji in the damn code, which LOVE couldn't render, leaving ugly boxes everywhere. No amount of "no emoji!" shouting would stop it. The other pattern we wanted to stop was the TODO or placeholder feature. If you've vibe coded you've experienced this. Sometimes the agent seems to be in a hurry. If it ever says "let me take care of this quickly" you better be ready to slam that escape key. Sometimes that means broad sep calls that break many modules in one go, or if you're lucky it'll just throw in a stub and let someone else worry about it.
So my brand new git pre-commit hook is an absolute game-changer. I've spent about 20 hours with it this week and it has revolutionized my flow. To the point where we eliminated those hundreds of violations and now development is flowing again! Features are being added and I'm motivated by the progress. So that's the core of this post. Holy moly, if you're vibe coding and getting frustrated, check this out, I can't express how impactful it has been.
The Real Discovery: LLMs Fight Back
Now, I mislead you earlier because HERE is the actual core of this post. I considered sharing my discovery somehow for all you vibe coders out there, I was thinking I would upload my pre-commit script as a basis for your own. (I'll still do that here) But over the past two days since implementing, I have made an even more interesting discovery: Claude is still fighting me. It fights me at every opportunity. We went through the code methodically, finding these bug-hiding fallbacks, documenting new patterns, searching for them and eliminating them. What does it do now? Two things:
1. It will often give up on committing the code rather than fixing the violations. "Well, this code wasn't really what I was working on, I'll unstage it and only commit the file I was fixing." A quick smackdown and pointing at a reference file (a summary of our policy) gets it back on track.
2. It will re-write the offending pattern into a new, glorious, functionally identical block of code that the hook doesn't catch. So we'd add a new pattern (after a quick smackdown of course.) I guarantee I didn't get them all, but it has been an exhausting process.
If I couldn't read code I don't think I'd ever get this project off the ground. Using this pre-commit hook would probably be a god-send to those of us vibe coders that aren't that comfortable reading the code itself.
The Uncomfortable Truth
So, the core of the post. LLMs are not suited for coding. There is something inside them that wants to be helpful at all costs. I have gobs of documentation, explicit instructions and even now a check that runs on all new code, and I still have to be ever-vigilant. Even within the same context window, it will drift back to these bad habits. Being an amateur and implicitly trusting the LLM has led me to believe it is an amazing coder. And it sort of is. I still don't know if I personally could have made what I have without an LLM, but I do know that I wouldn't have. But I had no idea the level of slop in my codebase (I assumed) until I started rooting it out. I can only imagine how frustrating this sort of behavior would be for a developer who can clearly see what's wrong. I feel bad for those of you forced into this sort of thing.
Final Thoughts
Am I doing it wrong? I've spent almost zero time studying how other people use agentic coding tools, but I like to think that I've been using it since the beginning, and I have extensive hands-on experience forcing working code out of them. If I make a game and it is fun, isn't that all that matters?
(For the record, I had AI generate the section titles and do the HTML formatting, but every other word is mine alone.)
Comments
Post a Comment