Management has decreed that won't happen so it won't.

senko · 2026-01-06T07:33:10 1767684790

What an uncharitable and nasty comment for something they clearly addressed in theirs:

> It is more accurate and consistent than our humans.

So, errors can clearly happen, but they happen less often than they used to.

> It will draft a reply or an email

"draft" clearly implies a human will double-check.

ptx · 2026-01-06T11:28:26 1767698906

> "draft" clearly implies a human will double-check.

The wording does imply this, but since the whole point was to free the human from reading all the details and relevant context about the case, how would this double-checking actually happen in reality?

senko · 2026-01-06T15:39:15 1767713955

> the whole point was to free the human from reading all the details and relevant context about the case

That's your assumption.

My read of that comment is that it's much easier to verify and approve (or modify) the message than it is to write it from scratch. The second sentence does confirm a person then modifies it in half the cases, so there is some manual work remaining.

It doesn't need to be all or nothing.

phantasmish · 2026-01-06T14:09:02 1767708542

The “double checking” is a step to make sure there’s someone low-level to blame. Everyone knows the “double-checking” in most of these systems will be cursory at best, for most double-checkers. It’s a miserable job to do much of, and with AI, it’s a lot of what a person would be doing. It’ll be half-assed. People will go batshit crazy otherwise.

On the off chance it’s not for that reason, productivity requirements will be increased until you must half-ass it.

pwagland · 2026-01-08T10:32:16 1767868336

The real question is how do you enforce that the human is reviewing and double-checking?

When the AI gets "good enough" and the review becomes largely rubber-stamping (and 50% is pretty close to that), you run the risk that a good percentage of the drafts are approved without real checks.

This is why nuclear operators and security scanning operators have regular "awareness checks". Is something like this also being done, and if so what is the failure rate of these checks?

JTbane · 2026-01-06T14:49:56 1767710996

I think it's a good comment, given that the best agents seem to hallucinate something like 10% of the time on a simple task and more than 70% of the time on complex ones.

tonyedgecombe · 2026-01-06T11:02:37 1767697357

>So, errors can clearly happen, but they happen less often than they used to.

If you take the comment at face value. I'm sorry, but I've been around this industry long enough to be sceptical of self-serving statements like these.

>"draft" clearly implies a human will double-check.

I'm even more sceptical of that working in practice.