We use it daily in our org. What you’re talking about is not happening. That being said, we have a fairly decent monorepo structure and a bunch of guides/skills to ensure it doesn’t do it that often. There's also the whole plan + implement split into separate phases.
If it was July 2025, I would have agreed with you. But not anymore.
I used to experience those issues a lot. I haven't in a while. Between having good documentation in my projects, well-defined skills for normal things, simple-to-use testing tools, and giving it clear requirements, things go pretty smoothly.
I'd say it still really depends on what you're doing. Are you working in a poorly documented language that few people use, solving problems few people have solved? Or are you adding yet another normal-ish kind of feature in a super common language and libraries? One will have a lot more pain than the other, especially if you're not supplying your own docs and testing tools.
There's also just a difference in what to include in the context. I had three different projects which were tightly coupled. AI agents had a hard time keeping things straight as APIs changed between them, constantly misnaming them, getting parameters wrong, and whatnot. Combining them and having one agent work all three repos with a shared set of documentation stopped the mistakes when it needed to make changes across multiple projects.
Yes, all the time. Yes, those go to production. AI has improved significantly over the past 2 years. I highly recommend you give it another try.
I don't see the behaviour you describe. Maybe your impression comes from online articles, or you're using a local llama model or ChatGPT from 2 years ago. In fact, Claude regularly finds and resolves duplicated code. Let me give you a counter-example: for adding dependencies we run an internal whitelist for AI agents. New dependencies go through this system, because we had similar concerns. In the half year or so that we've run the service, I have never seen any agent used in our organisation or at a client hallucinate a dependency.
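To make the whitelist idea concrete, here is a minimal sketch of what such a gate could look like. Everything in it is an assumption for illustration: the package names, the `ALLOWED_PACKAGES` set, and the `check_dependencies` helper are hypothetical, since the comment doesn't describe how the internal service actually works.

```python
import re

# Hypothetical allowlist; the real service's contents and format are not
# described in the comment above.
ALLOWED_PACKAGES = {"requests", "numpy", "pydantic"}


def normalize(name: str) -> str:
    """Normalize a package name PEP 503 style: lowercase, and collapse
    runs of '-', '_' and '.' into a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def check_dependencies(requested, allowed=ALLOWED_PACKAGES):
    """Split requested package names into (approved, rejected) lists.

    A name is approved only if its normalized form is on the allowlist,
    so a made-up package is rejected before anything gets installed.
    """
    allowed_norm = {normalize(n) for n in allowed}
    approved, rejected = [], []
    for name in requested:
        if normalize(name) in allowed_norm:
            approved.append(name)
        else:
            rejected.append(name)
    return approved, rejected
```

Anything that lands in the rejected list would go to a human for review rather than being installed automatically.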
So where does your responsibility for this code end? Do you just push to the repo, merge, and that's it, or do you also deploy, monitor and maintain the production systems? Who handles outages on Saturday night, is it you or someone else?
FWIW I mainly use Opus 4.6 on the $100/mo Max plan, and rarely run into these issues. They certainly occur with lower-tier models, with increased frequency the cheaper the model is. As someone using it for a significant portion of my professional and personal work, I don’t really understand why this continues to be a widespread issue. Thoroughly vetting Plan Mode output (e.g. spotting a step like `npm install random-auth-package`) also seems like an easy resolution to this issue, and it's something most devs should be doing anyways IMO.
For me it's throwaway scripts and tools. Or tools in general. But only simple tools that it can somewhat one-shot. If I ever need to tweak it, I one-shot another tool. If it works, it's fine. No need to know how it works.
If I'm feeling brave, I let it write functions with very clear and well defined input/output, like a well established algorithm. I know it can one-shot those, or they can be easily tested.
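As an illustration of that "clear input/output, easily tested" category, here is a sketch using binary search as a stand-in. The choice of algorithm and the `check_against_reference` helper are my own assumptions, not something from the comment. The point is that a slow but obvious reference (`target in items`) makes the generated function cheap to verify.

```python
import random


def binary_search(items, target):
    """Return an index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def check_against_reference(trials=200, seed=0):
    """Compare binary_search with a plain membership test on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = sorted(rng.sample(range(100), rng.randint(0, 20)))
        target = rng.randrange(100)
        idx = binary_search(items, target)
        if target in items:
            assert items[idx] == target
        else:
            assert idx == -1
    return True
```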
But when doing something that I know will be further developed and maintained, I mainly end up writing it by hand. I used to have the LLM write that kind of code as well, but I found it to be slower in the long run.
Definitely a lot of one-shot scripts for a given environment. I've started using a run/ directory for shell scripts that will do things like spin up a set of containers defined in a compose file, build and test certain sub-projects, initialize a database, etc.
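For a rough idea of what one of those run/ helpers might look like, here is a sketch. Note the swap: the comment describes shell scripts, while this is Python. The file name `compose.yaml`, the function names, and the `dry_run` flag are all hypothetical. The only external command assumed is `docker compose -f <file> up -d`.

```python
import subprocess
from pathlib import Path


def compose_up_command(compose_file, services=()):
    """Build the argv for starting containers from a compose file, detached."""
    return ["docker", "compose", "-f", str(compose_file), "up", "-d", *services]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        # check=True raises CalledProcessError if docker exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `run(compose_up_command(Path("compose.yaml")), dry_run=False)` would actually start the containers. The default only prints the command, which keeps the helper safe to try out.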
For the most part, many of them work the first time and just continue to do so to aid a project. I've done similar in terms of scaffolding a test/demo environment around a component that I'm directly focused on... sometimes similar for documentation site(s) for gh pages, etc.
I had automation set up for anything I needed for work, and gen AI made me feel like I had to babysit a dumb junior developer, so I lost interest
Management uses it to make mock websites, then doesn't listen when we point out flaws, so nothing new there
Some in digital marketing are using it for data collection/analysis, but it reaches wrong conclusions 50% of the time (their words), so they are slowly dropping it and only using it for menial tasks and simple automations
In design we had a trial period, but it has the same issue as coding: either it makes something a senior designer could have made in 2 minutes, or it introduces errors that take a long time to fix, only to do it again on the next prompt
we are a senior dev team, although relatively small, and to me it seems like it only really works as a substitute for junior devs... but the point of junior devs is to grow someone into a senior with the knowledge you need in the company, so i don't really get the use case overall
removing social media and focusing on the good parts of the internet is the best approach (and was even before the AI trend)
you don't need social media, everybody has excuses as to why they're there, but none of them are real
self hosting saved most of the internet for me, from Jellyfin for movies and TV shows to Piped for YouTube
Degoogling and removing big tech from your life also helps a lot, changing from Gmail to ProtonMail was a small change from the outside but it made how I interact with account creation and handling of my data so much more enjoyable
so I don't personally feel this way, but I don't engage with any part of the internet terrorized by AI (and human) slop
one of the few apps not FOSS on my degoogled phone, thought it was time to fix that