Introduction
I recently decided to spend a day "Vibe Programming" with Cursor. I set myself the challenge of not writing any code myself; I could only prompt my way to the end goal. I decided to do this because I haven't really given AI tooling a fair crack. I've used it sporadically, claiming that I can do it better and faster, and that AI guides me to lower quality solutions. However, in the interest of not becoming a dinosaur, I decided that I should keep a more open mind about my workflow and the tooling I use. To give it a fair analysis I decided to do two projects: one backend focused, where I am (so full of hubris) strong, and another in web presentation, where I am mediocre at best. I won't go into detail on the projects themselves, just the summary of my conclusions.
I was oddly excited to do some low effort programming, so let's break down the good, the bad and the ugly of it all.
The Good
Instant boilerplate solutions
Backend - It was able to create API boilerplate that was passable and a NuGet package setup that was also fine. It was really hooked on using dotnet 9 and had zero concept of what dotnet 10 was (at the time of writing it's in preview 2), but the output was totally fine.
Frontend - I needed a React application that I could deploy through AWS Amplify and it chose Vite and React with TypeScript. A totally reasonable selection. It took a good while to figure out the Vite toolchain, but I set it to "agent mode", told it to continue till it worked and went off to make tea. I came back a few minutes later and it had written a small essay to itself, but the website was functional.
Simple Problems and bug fixes
In both projects it was able to iterate and try different solutions in a "brute force" style approach till something worked.
It was fast, like really fast. In both projects it was able to give me a decent starting point in a matter of minutes.
The Bad
Overconfidence in wrong answers
It is really misleading when a prompt is answered with "ah yes, this is the answer" when it's absolutely not the answer, to the point where the solution doesn't compile or doesn't start.
Maybe this next take is biased by my experience with AI, but it seemed to be more focused on telling a good narrative than on moving towards a more correct answer. When trying to narrow down a slightly off solution it would often rewrite the whole thing when only a small tweak was required.
Seemingly arbitrary use of third party packages
Less so in dotnet, but in the JavaScript ecosystem Cursor would pull in packages left, right and centre to accomplish tasks. In some scenarios that might be correct, but in two cases it either pulled in a package for code that could have been written trivially, or pulled in a second, mismatched package when it hit an issue with the first. I ended up with two different packages to render markdown, on two different pages, both interpreting markdown in exactly the same way. Craziness that even a junior developer would reject in a pull request.
Weak, Niche or New Tech
Trying to prompt towards niche tech was problematic. Using Hugging Face models was seemingly impossible; I couldn't get Cursor to do this at all. It kept insisting on writing me hideously complex regular expressions to attempt to solve the problem I was describing.
Attempting to use Bun and htmx was also a non-starter. It kept getting hung up on npm and Vite as if nothing else mattered, and when I really forced the issue it would hallucinate APIs that just don't exist.
Trying to get it to use the native Python support within the dotnet 10 preview was impossible; it's seemingly too new. Zero success here, with more hallucinations of framework items that don't exist.
Context is often a problem
Especially with deployment tech like AWS Amplify, it tried to reel off hundreds of lines of config which were often nonsensical or broken. I often found myself saying things like "Take into account the entire solution" and then adding every file in a folder to the context to try to get a meaningful output.
The Ugly
Vulnerable recommendations
I asked Cursor to hash a password input using industry standards and it decided that I should use
crypto.createHash("sha1")
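There are numerous problems here: SHA-1 is a fast, general-purpose hash that is considered broken for security use, there is no salt, and there is no purpose-built password hashing function such as bcrypt.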
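For contrast, here is a minimal sketch of a safer approach. It assumes Node's built-in crypto module and uses scrypt rather than bcrypt, only because bcrypt needs a third-party package. The names hashPassword and verifyPassword, the key length and the "salt:hash" storage format are illustrative assumptions, not what Cursor produced and not a definitive implementation.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Assumed parameters for this sketch; tune them for your own environment.
const SALT_BYTES = 16;
const KEY_LENGTH = 64;

// Hypothetical helper: derives a key with scrypt and a fresh random salt,
// then returns "saltHex:hashHex" so the salt is stored alongside the hash.
function hashPassword(password: string): string {
  const salt = randomBytes(SALT_BYTES);
  const derived = scryptSync(password, salt, KEY_LENGTH);
  return `${salt.toString("hex")}:${derived.toString("hex")}`;
}

// Hypothetical helper: re-derives the key with the stored salt and compares
// the two buffers in constant time.
function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  if (!saltHex || !hashHex) return false;

  const expected = Buffer.from(hashHex, "hex");
  // Reject malformed records before comparing (timingSafeEqual needs equal lengths).
  if (expected.length !== KEY_LENGTH) return false;

  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), KEY_LENGTH);
  return timingSafeEqual(actual, expected);
}

const stored = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", stored));
console.log(verifyPassword("wrong password", stored));
```

Because the salt is random, hashing the same password twice gives two different stored strings, which is exactly what the unsalted SHA-1 call cannot do.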
Rabbit hole nonsense
It was very easy to get stuck where something would have an error because it had hallucinated some method name. I'd say "that's not a real method" and it would say "ah you are right, let's fix that", then go off on some other tangent and rewrite the whole file. Then I would say "you've removed some key functionality, let's reinclude XYZ feature" and we'd end up with the same hallucinated method again. No matter what I did we went round in circles, and I ended up doing a git reset --hard HEAD and trying the whole thing again.
Summary
My AI sidekick was no better than a hyped-up junior developer eager to rewrite the world. It was able to produce quick and dirty solutions which, although great for prototypes, lacked the consistency and thought required for robust and performant enterprise grade solutions. It seems great for easy-to-solve problems, boilerplate and general grunt work, but don't trust anything it outputs without review.
Closing thoughts
If the current state of AI is at the level of a "junior developer", and companies stop hiring junior developers in favour of more seasoned developers with "AI Agents" to assist them, then how do junior developers get hired and skill up to become useful? Do they self-learn, over-hype themselves and fake it till they make it? Even then, how do they actually learn? I still remember the pain of learning that very first tech. The struggle was a key part of learning; without it, you will never gain the foundations required to be a skilled developer and problem solver.