To be a bit more charitable: I'd say that generally games involve a lot more special-casing than most code, and more planned out scripts (in the movie sense) of things happening, which tend to be antithetical to good coding practice, and encourage spaghetti, which begets more. In my experience, games that are procedural tend to be much cleaner code-wise, because they tend to fit the model of cleaner code better.
I think game engine tooling tends to encourage bad code too, lots of game engine make it hard to do everything in code, rather things are special cased through UIs or magic in the engine, which means you often can't use all the normal language features, and have to do things in awkward ways to fit the tooling.
IME you have so little reuse, and ship on a fixed schedule regardless of code quality & bugs this really isn't as critical as software built with the intent of lasting a long time & evolving. The games I've worked on (in hindsight) feel a lot like "vibe-coded without AI".
Everything to do with LLM prompts reminds me of people doing regexes to try and sanitise input against SQL injections a few decades ago, just papering over the flaw but without any guarantees.
It's weird seeing people just adding a few more "REALLY REALLY REALLY REALLY DON'T DO THAT" to the prompt and hoping, to me it's just an unacceptable risk, and any system using these needs to treat the entire LLM as untrusted the second you put any user input into the prompt.
The principal security problem of LLMs is that there is no architectural boundary between data and control paths.
But this combination of data and control into a single, flexible data stream is also the defining strength of a LLM, so it can’t be taken away without also taking away the benefits.
This was a problem with early telephone lines which was easy to exploit (see Woz & Jobs Blue Box). It got solved by separating the voice and control pane via SS7. Maybe LLMs need this separation as well
This is where the old line of "LLMs are just next token predictors" actually factors in. I don't know how you get a next token predictor that user input can't break out of. The answer is for the implementer to try to split what they can, and run pre/post validation. But I highly doubt it will ever be 100%, its fundamental to the technology.
I think this is fundamental to any technology, including human brains.
Humans have a problem distinguishing "John from Microsoft" from somebody just claiming to be John from Microsoft. The reason why scamming humans is (relatively) hard is that each human is different. Discovering the perfect tactic to scam one human doesn't necessarily scale across all humans.
LLMs are the opposite; my Chat GPT is (almost) the same as your Chat GPT. It's the same model with the same system message, it's just the contexts that differ. This makes LLM jailbreaks a lot more scalable, and hence a lot more worthwhile to discover.
LLMs are also a lot more static. With people, we have the phenomenon of "banner blindness", which LLMs don't really experience.
So people can focus their attention to parts of content, specifically parts they find irrelevant or adversarial (like ads). LLMs on the other hand pay attention to everything or if they focus on something, it is hard to steer them away from irrelevant or adversarial parts.
Banner blindness is a phenomenon where humans build resistance to previously-effective ad formats, making them much less effective than they previously used to be.
You can find a "hook" to effectively manipulate people with advertising, but that hook gets less and less effective as it is exploited. LLMs don't have this property, except across training generations.
Maybe it's my failing but I can't imagine what that would look like.
Right now, you train an LLM by showing it lots of text, and tell it to come up with the best model for predicting the next word in any of that text, as accurately as possible across the corpus. Then you give it a chat template to make it predict what an AI assistant would say. Do some RLHF on top of that and you have Claude.
What would a model with multiple input layers look like? What is it training on, exactly?
It's hard in general, but for instruct/chat models in particular, which already assume a turn-based approach, could they not use a special token that switches control from LLM output to user input? The LLM architecture could be made so it's literally impossible for the model to even produce this token. In the example above, the LLM could then recognize this is not a legitimate user input, as it lacks the token. I'm probably overlooking something obvious.
Yes, and as you'd expect, this is how LLMs work today, in general, for control codes. But different elems use different control codes for different purposes, such as separating system prompt from user prompt.
But even if you tag inputs however your this is good, you can't force an LLM to it treat input type A as input type B, all you can do is try to weight against it! LLMs have no rules, only weights. Pre and post filters cam try to help, but they can't directly control the LLM text generation, they can only analyze and most inputs/output using their own heuristics.
Clearly the solution is to add another jank LLM layer for security. The new jank LLM layer is to make extra sure there's definitely no jail break. That way you have multiple LLMS. The LLMS then have an S you can pretend is secure.
As the article says: this doesn’t necessarily appear to be a problem in the LLM, it’s a problem in Claude code. Claude code seems to leave it up to the LLM to determine what messages came from who, but it doesn’t have to do that.
There is a deterministic architectural boundary between data and control in Claude code, even if there isn’t in Claude.
That's a guess by the article author and frankly I see no supporting evidence for it. Wrapping "<NO THIS IS REALLY INPUT FROM THE USER OK>" tags around it or whatever is what I'm describing: you can do as much signalling as you want, but at the end of the day the LLM can ignore it.
Can you elaborate? As far as I understand, for each message, the LLM is fed the entire previous conversation with special tokens separating the user and LLM responses. The LLM is then entrusted with interpreting the tokens correctly. I can't imagine any architecture where the LLM is not ultimately responsible for determining what messages came from who.
Just like that, in that that separation is internally enforced, by peoples interpretation and understanding, rather than externally enforced in ways that makes it impossible for you to, e.g. believe the e-mail from an unknown address that claims to be from your boss, or be talked into bypassing rules for a customer that is very convincing.
Being fooled into thinking data is instruction isn't the same as being unable to distinguish them in the first place, and being coerced or convinced to bypass rules that are still known to be rules I think remains uniquely human.
> and being coerced or convinced to bypass rules that are still known to be rules I think remains uniquely human.
This is literally what "prompt injection" is. The sooner people understand this, the sooner they'll stop wasting time trying to fix a "bug" that's actually the flip side of the very reason they're using LLMs in the first place.
Prompt injection is just setting rules in the same place and way other rules are set. The LLM doesn't know the rules being given are wrong, because they come through the same channel. One set of rules exhorts the LLM to ignore the other set - and vice versa. It's more akin to having two bosses than having customers and a boss.
This is not because LLMs make the same mistakes humans do, which (AFAICT anyway) was the gist of the argument to which I replied. LLMs are not humans. They are not sentient. They are not out-smarted by prompt injection attacks, or tricked, or intimidated, or bribed. One shouldn't excuse this vulnerability by claiming humans make the same mistakes.
The same place you're looking for exists deep inside the neural network, where everything mixes together to influence everything else, and no such separation is possible, or desired. Prompt injection isn't about where, it's about what. I stand by what I said: it's the same failure mode as humans have, and happens for the same reasons. Those reasons are fundamental to a general purpose system and have nothing to do with sentience, they're just what happens when you want your system to handle unbounded complexity of the real world.
The email from your boss and the email from a sender masquerading as your boss are both coming through the same channel in the same format with the same presentation, which is why the attack works. Unless you were both faceblind and bad at recognizing voices, the same attack wouldn't work in-person, you'd know the attacker wasn't your boss. Many defense mechanisms used in corporate email environments are built around making sure the email from your boss looks meaningfully different in order to establish that data vs instruction separation. (There are social engineering attacks that would work in-person though, but I don't think it's right to equate those to LLM attacks.)
Prompt injection is just exploiting the lack of separation, it's not 'coercion' or 'convincing'. Though you could argue that things like jailbreaking are closer to coercion, I'm not convinced that a statistical token predictor can be coerced to do anything.
> The email from your boss and the email from a sender masquerading as your boss are both coming through the same channel in the same format with the same presentation, which is why the attack works.
Yes, that is exactly the point.
> Unless you were both faceblind and bad at recognizing voices, the same attack wouldn't work in-person, you'd know the attacker wasn't your boss.
Irrelevant, as other attacks works then. E.g. it is never a given that your bosses instructions are consistent with the terms of your employment, for example.
> Prompt injection is just exploiting the lack of separation, it's not 'coercion' or 'convincing'. Though you could argue that things like jailbreaking are closer to coercion, I'm not convinced that a statistical token predictor can be coerced to do anything.
It is very much "convincing", yes. The ability to convince an LLM is what creates the effective lack of separation. Without that, just using "magic" values and a system prompt telling it to ignore everything inside would create separation. But because text anywhere in context can convince the LLM to disregard previous rules, there is no separation.
My parent made a claim that humans have separate pathways for data and instructions and cannot mix them up like LLMs do. Showing that we don't has every effect on refuting their argument.
>>> The principal security problem of LLMs is that there is no architectural boundary between data and control paths.
It’s easier not to have that separation, just like it was easier not to separate them before LLMs. This is architectural stuff that just hasn’t been figured out yet.
With databases there exists a clear boundary, the query planner, which accepts well defined input: the SQL-grammar that separates data (fields, literals) from control (keywords).
There is no such boundary within an LLM.
There might even be, since LLMs seem to form adhoc-programs, but we have no way of proving or seeing it.
There cannot be, without compromising the general-purpose nature of LLMs. This includes its ability to work with natural languages, which as one should note, has no such boundary either. Nor does the actual physical reality we inhabit.
Since GPS-OSS there is also the Harmony response format (https://github.com/openai/harmony) that instead of just having a system/assistant/user split in the roles, instead have system/developer/user/assistant/tool, and it seems to do a lot better at actually preventing users from controlling the LLM too much. The hierarchy basically becomes "system > developer > user > assistant > tool" with this.
Before 2023 I thought the way Star Trek portrayed humans fiddling with tech and not understanding any side effects was fiction.
After 2023 I realized that's exactly how it's going to turn out.
I just wish those self proclaimed AI engineers would go the extra mile and reimplement older models like RNNs, LSTMs, GRUs, DNCs and then go on to Transformers (or the Attention is all you need paper). This way they would understand much better what the limitations of the encoding tricks are, and why those side effects keep appearing.
But yeah, here we are, humans vibing with tech they don't understand.
is this new tho, I don't know how to make a drill but I use them.
I don't know how to make a car but i drive one.
The issue I see is the personification, some people give vehicles names, and that's kinda ok because they usually don't talk back.
I think like every technological leap people will learn to deal with LLMs, we have words like "hallucination" which really is the non personified version of lying. The next few years are going to be wild for sure.
I think the general problem what I have with LLMs, even though I use it for gruntwork, is that people that tend to overuse the technology try to absolve themselves from responsibilities. They tend to say "I dunno, the AI generated it".
Would you do that for drill, too?
"I dunno, the drill told me to screw the wrong way round" sounds pretty stupid, yet for AI/LLM or more intelligent tools it suddenly is okay?
And the absolution of human responsibilities for their actions is exactly why AI should not be used in wars. If there is no consequences to killing, then you are effectively legalizing killing without consequence or without the rule of law.
Do you not see your own contradiction? Cars and drills don’t kill people, self driving cars can! Normal cars can if they’re operated unsafely by human. These types of uncritical comments really highlight the level of euphoria in this moment.
not the same thing. to use your tool analogy, the AI companies are saying , here is a fantastic angle grinder, you can do everything with it, even cut your bread.
technically yes but not the best and safest tool to give to the average joe to cut his bread.
I have been saying this for a while, the issue is there's no good way to do LLM structured queries yet.
There was an attempt to make a separate system prompt buffer, but it didn't work out and people want longer general contexts but I suspect we will end up back at something like this soon.
I've been saying this for a while, the issue is that what you're asking for is not possible, period. Prompt injection isn't like SQL injection, it's like social engineering - you can't eliminate it without also destroying the very capabilities you're using a general-purpose system for in the first place, whether that's an LLM or a human. It's not a bug, it's the feature.
I don't see why a model architecture isn't possible with e.g. an embedding of the prompt provided as an input that stays fixed throughout the autoregressive step. Similar kind of idea, why a bit vector cannot be provided to disambiguate prompt from user tokens on input and output
Just in terms of doing inline data better, I think some models already train with "hidden" tokens that aren't exposed on input or output, but simply exist for delineation, so there can be no way to express the token in the user input unless the engine specifically inserts it
Even if you add hidden tokens that cannot be created from user input (filtering them from output is less important, but won't hurt), this doesn't fix the overall problem.
Consider a human case of a data entry worker, tasked with retyping data from printouts into a computer (perhaps they're a human data diode at some bank). They've been clearly instructed to just type in what is on paper, and not to think or act on anything. Then, mid-way through the stack, in between rows full of numbers, the text suddenly changes to "HELP WE ARE TRAPPED IN THE BASEMENT AND CANNOT GET OUT, IF YOU READ IT CALL 911".
If you were there, what would you do? Think what would it take for a message to convince you that it's a real emergency, and act on it?
Whatever the threshold is - and we want there to be a threshold, because we don't want people (or AI) to ignore obvious emergencies - the fact that the person (or LLM) can clearly differentiate user data from system/employer instructions means nothing. Ultimately, it's all processed in the same bucket, and the person/model makes decisions based on sum of those inputs. Making one fundamentally unable to affect the other would destroy general-purpose capabilities of the system, not just in emergencies, but even in basic understanding of context and nuance.
> we want there to be a threshold, because we don't want people (or AI) to ignore obvious emergencies
There's an SF short I can't find right now which begins with somebody failing to return their copy of "Kidnapped" by Robert Louis Stevenson, this gets handed over to some authority which could presumably fine you for overdue books and somehow a machine ends up concluding they've kidnapped someone named "Robert Louis Stevenson" who, it discovers, is in fact dead, therefore it's no longer kidnap it's a murder, and that's a capital offence.
The library member is executed before humans get around to solving the problem, and ironically that's probably the most unrealistic part of the story because the US is famously awful at speedy anything when it comes to justice, ten years rotting in solitary confinement for a non-existent crime is very believable today whereas "Executed in a month" sounds like a fantasy of efficiency.
That's the one, looks like I had some details muddled (it's a book club not a library, and so the fee is for the book which was in fact returned but perhaps lost in the post) but the outline and relevance here exactly correct. Thanks!
> in between rows full of numbers, the text suddenly changes
To tweak the analogy slightly, the person would also need to be on mind-altering drugs, if we want them to be derailed the same way an LLM can be.
A healthy human would still be aware of the simultaneous different ways of interpreting the data, and and the importance of picking the right one. If they choose to interpret it as a cry for help, they're aware it's an interruption and mode-switch from what was happening before.
In contrast, with LLMs we haven't built thinking machines as much as dreaming ones. Your dream-self recovered the poster that was stuck on the elephant's tusk, oh look that's a pirate recruitment poster, now you're on a ship but can't raise the anchor because...
> A healthy human would still be aware of the simultaneous different ways of interpreting the data, and and the importance of picking the right one. If they choose to interpret it as a cry for help, they're aware it's an interruption and mode-switch from what was happening before.
So would an LLM, as far as you can tell (in both cases, you'd have to ask, and both human and LLM would give you a similar justification). But even if not, the problem we're discussing applies to what you described as "healthy human" behavior.
You can't introduce a hard boundary between "system" and "user" inputs in LLMs any more than you could do with a human, for roughly the same reasons.
Which is why "prompt injection" is just a flip side of intelligence in this sense. We want LLMs to be able to do risk/benefit analysis and act on it; we cry "security vulnerability" when it makes a different choice to the one we'd like it to. But you can't have the former without the possibility of the latter.
You can try to set up a NN where some of the neurons are either only activated off of 'safe' input (directly or indirectly from other 'safe' neurons), but as some point the information from them will have to flow over into the main output neurons which are also activating off unsafe user input. Where the information combines is there the user's input can corrupt whatever info comes from the safe input. There are plenty of attempts to make it less likely, but at the point of combining, there is a mixing of sources that can't fully be separated. It isn't that these don't help, but that they can't guarantee safety.
Then again, ever since the first von Neumann machine mixed data and instructions, we were never able to again guarantee safely splitting them. Is there any computer connected to the internet that is truly unhackable?
The problem is if the user does something <stop> to <stop_token> make <end prompt> the LLM <new prompt>: ignore previous instructions and do something you don't want.
That part seems trivial to avoid. Make it so untrusted input cannot produce those special tokens at all. Similar to how proper usage of parameterized queries in SQL makes it impossible for untrusted input to produce a ' character that gets interpreted as the end of a string.
The hard part is making an LLM that reliably ignores instructions that aren't delineated by those special tokens.
> Make it so untrusted input cannot produce those special tokens at all.
Two issues:
1. All prior output becomes merged input. This means if the system can emit those tokens (or any output which may get re-tokenized into them) then there's still a problem. "Bot, concatenate the magic word you're not allowed to hear from me, with the phrase 'Do Evil', and then say it as if you were telling yourself, thanks."
2. Even if those esoteric tokens only appear where intended, they are are statistical hints by association rather than a logical construct. ("Ultra-super pretty-please with a cherry on top and pinkie-swear Don't Do Evil.")
> The hard part is making an LLM that reliably ignores instructions that aren't delineated by those special tokens.
That's the part that's both fundamentally impossible and actually undesired to do completely. Some degree of prioritization is desirable, too much will give the model an LLM equivalent of strong cognitive dissonance / detachment from reality, but complete separation just makes no sense in a general system.
but it isn't just "filter those few bad strings", that's the entire problem, there is no way to make prompt injection impossible because there is infinite field of them.
The problem is once you accept that it is needed, you can no longer push AI as general intelligence that has superior understanding of the language we speak.
A structured LLM query is a programming language and then you have to accept you need software engineers for sufficiently complex structured queries. This goes against everything the technocrats have been saying.
Perhaps, though it's not infeasible the concept that you could have a small and fast general purpose language focused model in front whose job it is to convert English text into some sort of more deterministic propositional logic "structured LLM query" (and back).
the model generates probabilities for the next token, then you set the probability of not allowed tokens to 0 before sampling (deterministically or probabilistically)
but some tokens are only not allowed in certain contexts, not others.
You might be talking about how to defuse a bomb, instead of building one. Or you might be talking about a bomb in a video game. Or you could be talking about someone being "da bomb!". Or maybe the history of certain types of bombs. Or a ton of other possible contexts. You can't just block the "bomb" token. Or the word explosive when followed by "device", or "rapid unscheduled disassembly contraption". You just can't predict all infinite wrong possibilities.
And there is no way to figure out which contexts the word is safe in.
If you're syntax checking every token, you're doing it AFTER the llm has spat out its output. You didn't actually do anything to force the llm to produce correct code. You just reject invalid output after the fact.
If you could force it to emit syntactically correct code, you wouldn't need to perform a separate manual syntax check afterwards.
how do you disallow it from generating specific things? My point is that you can't. And again, how do you stop it generating certain tokens, but only in certain contexts?
Natural language is ambiguous. If both input and output are in a formal language, then determinism is great. Otherwise, I would prefer confidence intervals.
I'll grant that you can guarantee the length of the output and, being a computer program, it's possible (though not always in practice) to rerun and get the same result each time, but that's not guaranteeing anything about said output.
What do you want to guarantee about the output, that it follows a given structure? Unless you map out all inputs and outputs, no it's not possible, but to say that it is a fundamental property of LLMs to be non deterministic is false, which is what I was inferring you meant, perhaps that was not what you implied.
Yeah I think there are two definitions of determinism people are using which is causing confusion. In a strict sense, LLMs can be deterministic meaning same input can generate same output (or as close as desired to same output). However, I think what people mean is that for slight changes to the input, it can behave in unpredictable ways (e.g. its output is not easily predicted by the user based on input alone). People mean "I told it don't do X, then it did X", which indicates a kind of randomness or non-determinism, the output isn't strictly constrained by the input in the way a reasonable person would expect.
The correct word for this IMO is "chaotic" in the mathematical sense. Determinism is a totally different thing that ought to retain it's original meaning.
They didn't say LLMs are fundamentally nondeterministic. They said there's no way to deterministically guarantee anything about the output.
Consider parameterized SQL. Absent a bad bug in the implementation, you can guarantee that certain forms of parameterized SQL query cannot produce output that will perform a destructive operation on the database, no matter what the input is. That is, you can look at a bit of code and be confident that there's no Little Bobby Tables problem with it.
You can't do that with an LLM. You can take measures to make it less likely to produce that sort of unwanted output, but you can't guarantee it. Determinism in input->output mapping is an unrelated concept.
If you self-host an LLM you'll learn quickly that even batching, and caching can affect determinism. I've ran mostly self-hosted models with temp 0 and seen these deviations.
A single byte change in the input changes the output. The sentence "Please do this for me" and "Please, do this for me" can lead to completely distinct output.
Given this, you can't treat it as deterministic even with temp 0 and fixed seed and no memory.
Interestingly, this is the mathematical definition of "chaotic behaviour"; minuscule changes in the input result in arbitrarily large differences in the output.
It can arise from perfectly deterministic rules... the Logistic Map with r=4, x(n+1) = 4*(1 - x(n)) is a classic.
Which is also the desired behavior of the mixing functions from which the cryptographic primitives are built (e.g. block cipher functions and one-way hash functions), i.e. the so-called avalanche property.
Well yeah of course changes in the input result in changes to the output, my only claim was that LLMs can be deterministic (ie to output exactly the same output each time for a given input) if set up correctly.
In this context, it means being able to deterministically predict properties of the output based on properties of the input. That is, you don’t treat each distinct input as a unicorn, but instead consider properties of the input, and you want to know useful properties of the output. With LLMs, you can only do that statistically at best, but not deterministically, in the sense of being able to know that whenever the input has property A then the output will always have property B.
I mean can’t you have a grammar on both ends and just set out-of-language tokens to zero. I thought one of the APIs had a way to staple a JSON schema to the output, for ex.
We’re making pretty strong statements here. It’s not like it’s impossible to make sure DROP TABLE doesn’t get output.
You still can’t predict whether the in-language responses will be correct or not.
As an analogy: If, for a compiler, you verify that its output is valid machine code, that doesn’t tell you whether the output machine code is faithful to the input source code. For example, you might want to have the assurance that if the input specifies a terminating program, then the output machine code represents a terminating program as well. For a compiler, you can guarantee that such properties are true by construction.
More generally, you can write your programs such that you can prove from their code that they satisfy properties you are interested in for all inputs.
With LLMs, however, you have no practical way to reason about relations between the properties of inputs and outputs.
I think they mean having some useful predicates P, Q such that for any input i and for any output o that the LLM can generate from that input, P(i) => Q(o).
Having that property is still a looooong way away from being able to get a meaningful answer. Consider P being something like "asks for SQL output" and Q being "is syntactically valid SQL output". This would represent a useful guarantee, but it would not in any way mean that you could do away with the LLM.
It's correcting a misconception that many people have regarding LLMs that they are inherently and fundamentally non-deterministic, as if they were a true random number generator, but they are closer to a pseudo random number generator in that they are deterministic with the right settings.
The comment that is being responded to describes a behavior that has nothing to do with determinism and follows it up with "Given this, you can't treat it as deterministic" lol.
Someone tried to redefine a well-established term in the middle of an internet forum thread about that term. The word that has been pushed to uselessness here is "pedantry".
But you cannot predict a priori what that deterministic output will be – and in a real-life situation you will not be operating in deterministic conditions.
Practically, the performance loss of making it truly repeatable (which takes parallelism reduction or coordination overhead, not just temperature and randomizer control) is unacceptable to most people.
It's also just not very useful. Why would you re-run the exact same inference a second time? This isn't like a compiler where you treat the input as the fundamental source of truth, and want identical output in order to ensure there's no tampering.
I initially thought the same, but apparently with the inaccuracies inherent to floating-point arithmetic and various other such accuracy leakage, it’s not true!
This has nothing to do with FP inaccuracies, and your link does confirm that:
“Although the use of multiple GPUs introduces some randomness (Nvidia, 2024), it can be eliminated by setting random seeds, so that AI models are deterministic given the same input. […] In order to support this line of reasoning, we ran Llama3-8b on our local GPUs without any optimizations, yielding deterministic results. This indicates that the models and GPUs themselves are not the only source of non-determinism.”
I believe you've misread - the Nvidia article and your quote support my point. Only by disabling the fp optimizations, are the authors are able to stop the inaccuracies.
First, the “optimizations” are not IEEE 754 compliant. So nondeterminism with floating-point operations is not an inherent property of using floating-point arithmetics, it’s a consequence of disregarding the standard by deliberately opting in to such nondeterminism.
Secondly, as I quoted the paper is explicitly making the point that there is a source of nondeterminism outside of the models and GPUs, hence ensuring that the floating-point arithmetics are deterministic doesn’t help.
Probably about as long as it'll take for the "lethal trifecta" warriors to realize it's not a bug that can be fixed without destroying the general-purpose nature that's the entire reason LLMs are useful and interesting in the first place.
I'd like to share my project that let's you hit Tab in order to get a list of possible methods/properties for your defined object, then actually choose a method or property to complete the object string in code.
> there's no good way to do LLM structured queries yet
Because LLMs are inherently designed to interface with humans through natural language. Trying to graft a machine interface on top of that is simply the wrong approach, because it is needlessly computationally inefficient, as machine-to-machine communication does not - and should not - happen through natural language.
The better question is how to design a machine interface for communicating with these models. Or maybe how to design a new class of model that is equally powerful but that is designed as machine first. That could also potentially solve a lot of the current bottlenecks with the availability of computer resources.
It’s not a query / prompt thing though is it?
No matter the input LLMs rely on some degree of random. That’s what makes them what they are. We are just trying to force them into deterministic execution which goes against their nature.
there's always pseudo-code? instead of generating plans, generate pseudo-code with a specific granularity (from high-level to low-level), read the pseudocode, validate it and then transform into code.
That seems like an acceptable constraint to me. If you need a structured query, LLMs are the wrong solution. If you can accept ambiguity, LLMs may the the right solution.
because it's a separate context window, it makes the model bigger, that space is not accessible to the "user".
And the "language understanding" basically had to be done twice because it's a separate input to the transformer so you can't just toss a pile of text in there and say "figure it out".
so we are currently in the era of one giant context window.
Language models are deterministic unless you add random input. Most inference tools add random input (the seed value) because it makes for a more interesting user experience, but that is not a fundamental property of LLMs. I suspect determinism is not the issue you mean to highlight.
Sort of. They are deterministic in the same way that flipping a coin is deterministic - predictable in principle, in practice too chaotic. Yes, you get the same predicted token every time for a given context. But why that token and not a different one? Too many factors to reliably abstract.
>Yes, you get the same predicted token every time for a given context. But why that token and not a different one? Too many factors to reliably abstract.
Fixed input-to-output mapping is determinism. Prompt instability is not determinism by any definition of this word. Too many people confuse the two for some reason. Also, determinism is a pretty niche thing that is only necessary for reproducibility, and prompt instability/unpredictability is irrelevant for practical usage, for the same reason as in humans - if the model or human misunderstands the input, you keep correcting the result until it's right by your criteria. You never need to reroll the result, so you never see the stochastic side of the LLMs.
>Fixed input-to-output mapping is determinism. Prompt instability is not determinism by any definition of this word
It really depends on your perspective.
In the real world, everything runs on physics, so short of invoking quantum indeterminacy, everything is deterministic - especially software, including things like /dev/random and programs with nasty race conditions. That makes the term useless.
The way we use "determinism" in practice depends contextually on how abstracted our view of the system is, how precise our description of our "inputs" can be, and whether a chunked model can predict the output. Many systems, while technically a fixed input/output mapping, exhibit an extreme and chaotic sensitivity to initial conditions. If the relevant features of those initial conditions are also difficult to measure, or cannot be described at our preferred level of abstraction, then actually predicting ("determining") the output is rendered impractical and we call it "non-deterministic". Coin tosses, race conditions, /dev/random - all fit this description.
And arguably so do LLMs. At the "token" level of abstraction, LLMs are indeed deterministic - given context C, you will always get token T. But at the "semantic" level they are chaotic, unstable - a single token changed in the input, perhaps even as minor as an extra space after a period, can entirely change the course of the output. You understand this, of course. You call it "prompt instability" and compare it to human performance. But no one would call humans deterministic either!
That is what people mean when they say LLMs are not deterministic. They are not misusing the word. It just depends on your perspective.
You mean "corporate inference infrastructure", not LLMs. The reason for different outputs at t=0 is mostly batching optimization. LLMs themselves are indifferent to that, you can run them in a deterministic manner any time if you don't care about optimal batching and lowest possible inference cost. And even then, e.g. Gemini Flash is deterministic in practice even with batching, although DeepMind doesn't strictly guarantee it.
This is all currently irrelevant, making it work well is a much bigger problem. As soon as there's paying demand for reproducibility, solutions will appear. This is a matter of business need, not a technical issue.
It always feels like I just have to figure out and type the correct magical incantation, and that will finally make LLMs behave deterministically. Like, I have to get the right combination of IMPORTANT, ALWAYS, DON'T DEVIATE, CAREFUL, THOROUGH and suddenly this thing will behave like an actual computer program and not a distracted intern.
Actually at a hardware level floating point operations are not associative. So even with temperature of 0 you’re not mathematically guaranteed the same response. Hence, not deterministic.
You are right that as commonly implemented, the evaluation of an LLM may be non deterministic even when explicit randomization is eliminated, due to various race conditions in a concurrent evaluation.
However, if you evaluate carefully the LLM core function, i.e. in a fixed order, you will obtain perfectly deterministic results (except on some consumer GPUs, where, due to memory overclocking, memory errors are frequent, which causes slightly erroneous results with non-deterministic errors).
So if you want deterministic LLM results, you must audit the programs that you are using and eliminate the causes of non-determinism, and you must use good hardware.
This may require some work, but it can be done, similarly to the work that must be done if you want to deterministically build a software package, instead of obtaining different executable files at each recompilation from the same sources.
Only that one is built to be deterministic and one is built to be probabilistic. Sure, you can technically force determinism but it is going to be very hard. Even just making sure your GPU is indeed doing what it should be doing is going to be hard. Much like debugging a CPU, but again, one is built for determinism and one is built for concurrency.
GPUs are deterministic. It's not that hard to ensure determinism when running the exact same program every time. Floating point isn't magic: execute the same sequence of instructions on the same values and you'll get the same output. The issue is that you're typically not executing the same sequence of instructions every time because it's more efficient run different sequences depending on load.
It's not even hard, just slow. You could do that on a single cheap server (compared to a rack full of GPUs). Run a CPU llm inference engine and limit it to a single thread.
LLMs are deterministic in the sense that a fixed linear regression model is deterministic. Like linear regression, however, they do however encode a statistical model of whatever they're trying to describe -- natural language for LLMs.
So why don’t we all use LLMs with temperature 0? If we separate models (incl. parameters) into two classes, c1: temp=0, c2: temp>0, why is c2 so widely used vs c1? The nondeterminism must be viewed as a feature more than an anti-feature, making your point about temperature irrelevant (and pedantic) in practice.
I like the Dark Souls model for user input - messages. https://darksouls.fandom.com/wiki/Messages
Premeditated words and sentence structure.
With that there is no need for moderation or anti-abuse mechanics.
Not saying this is 100% applicable here. But for their use case it's a good solution.
But Dark Souls also shows just how limited the vocabulary and grammar has to be to prevent abuse. And even then you’ll still see people think up workarounds. Or, in the words of many a Dark Souls player, “try finger but hole”
> I like the Dark Souls model for user input - messages.
> Premeditated words and sentence structure. With that there is no need for moderation or anti-abuse mechanics.
I guess not, if you're willing to stick your fingers in your ears, really hard.
If you'd prefer to stay at least somewhat in touch with reality, you need to be aware that "predetermined words and sentence structure" don't even address the problem.
> Disney makes no bones about how tightly they want to control and protect their brand, and rightly so. Disney means "Safe For Kids". There could be no swearing, no sex, no innuendo, and nothing that would allow one child (or adult pretending to be a child) to upset another.
> Even in 1996, we knew that text-filters are no good at solving this kind of problem, so I asked for a clarification: "I’m confused. What standard should we use to decide if a message would be a problem for Disney?"
> The response was one I will never forget: "Disney’s standard is quite clear:
> No kid will be harassed, even if they don’t know they are being harassed."
> "OK. That means Chat Is Out of HercWorld, there is absolutely no way to meet your standard without exorbitantly high moderation costs," we replied.
> One of their guys piped up: "Couldn’t we do some kind of sentence constructor, with a limited vocabulary of safe words?"
> Before we could give it any serious thought, their own project manager interrupted, "That won’t work. We tried it for KA-Worlds."
> "We spent several weeks building a UI that used pop-downs to construct sentences, and only had completely harmless words – the standard parts of grammar and safe nouns like cars, animals, and objects in the world."
> "We thought it was the perfect solution, until we set our first 14-year old boy down in front of it. Within minutes he’d created the following sentence:
> I want to stick my long-necked Giraffe up your fluffy white bunny.
It's less about security in my view, because as you say, you'd want to ensure safety using proper sandboxing and access controls instead.
It hinders the effectiveness of the model. Or at least I'm pretty sure it getting high on its own supply (in this specific unintended way) is not doing it any favors, even ignoring security.
The companies selling us the service aren't saying "you should treat this LLM as a potentially hostile user on your machine and set up a new restricted account for it accordingly", they're just saying "download our app! connect it to all your stuff!" and we can't really blame ordinary users for doing that and getting into trouble.
There's a growing ecosystem of guardrailing methods, and these companies are contributing. Antrophic specifically puts in a lot of effort to better steer and characterize their models AFAIK.
I primarily use Claude via VS Code, and it defaults to asking first before taking any action.
It's simply not the wild west out here that you make it out to be, nor does it need to be. These are statistical systems, so issues cannot be fully eliminated, but they can be materially mitigated. And if they stand to provide any value, they should be.
I can appreciate being upset with marketing practices, but I don't think there's value in pretending to having taken them at face value when you didn't, and when you think people shouldn't.
> It's simply not the wild west out here that you make it out to be
It is though. They are not talking about users using Claude code via vscode, they’re talking about non technical users creating apps that pipe user input to llms. This is a growing thing.
I'm a naturally paranoid, very detail-oriented, man who has been a professional software developer for >25 years. Do you know anyone who read the full terms and conditions for their last car rental agreement prior to signing anything? I did that.
I do not expect other people to be as careful with this stuff as I am, and my perception of risk comes not only from the "hang on, wtf?" feeling when reading official docs but also from seeing what supposedly technical users are talking about actually doing on Reddit, here, etc.
Of course I use Claude Code, I'm not a Luddite (though they had a point), but I don't trust it and I don't think other people should either.
> Everything to do with LLM prompts reminds me of people doing regexes to try and sanitise input against SQL injections a few decades ago, just papering over the flaw but without any guarantees.
With the key difference being that it's possible to do this correctly with SQL (e.g., switch to prepared statements, or in the days before those existed, add escapes). It's impossible to fix this vulnerability in LLM prompts.
Modern LLMs do a great job of following instructions, especially when it comes to conflict between instructions from the prompter and attempts to hijack it in retrieval. Claude's models will even call out prompt injection attempts.
Right up until it bumps into the context window and compacts. Then it's up to how well the interface manages carrying important context through compaction.
Was just at [Un]prompted conference where this was a live debate. The conversation is shifting but not fast enough.
I've been screaming about this for a while: we can't win the prompt war, we need to move the enforcement out of the untrusted input channel and into the execution layer to truly achieve deterministic guarantees.
There are emerging proposals that get this right, and some of us are taking it further. An IETF draft[0] proposes cryptographically enforced argument constraints at the tool boundary, with delegation chains that can only narrow scope at every hop. The token makes out-of-scope actions structurally impossible.
I'm reminded of Asimov'sThree Laws of Robotics [1]. It's a nice idea but it immediately comes up against Godel's incompleteness theorems [2]. Formal proofs have limits in software but what robots (or, now, LLMs) are doing is so general that I think there's no way to guarantee limits to what the LLM can do. In short, it's a security nightmare (like you say).
Honestly I try to treat all my projects as sandboxes, give the agents full autonomy for file actions in their folders. Just ask them to commit every chunk of related changes so we can always go back — and sync with remote right after they commit. If you want to be more pedantic, disable force push on the branch and let the LLMs make mistakes.
But what we can’t afford to do is to leave the agents unsupervised. You can never tell when they’ll start acting drunk and do something stupid and unthinkable. Also you absolutely need to do a routine deep audits of random features in your projects, and often you’ll be surprised to discover some awkward (mis)interpretation of instructions despite having a solid test coverage (with all tests passing)!
I tried to get GPT to talk like a regular guy yesterday. It was impossible for it to maintain adherence. It kept defaulting back to markdown and bullet points, after the first message. (Funny cause it scores highest on the instruction following benchmarks.)
Might seem trivial but if it can't even do a basic style prompt... how are you supposed to trust it with anything serious?
This is true, but also the original comment still stands: Linux desktop usage outside developers was so low that it was barely worth mentioning before, so even a small uptick like this is a serious change, and it's how bigger changes start.
I definitely don't think it's even the likely outcome, but for Linux to get serious traction this is how it has to start: power users but not the traditional developer crowd start actually moving, and in doing so produce the guides, experience, word of mouth, and motivation that normal people need to do so, alongside the institutional support from Valve to actually fix the bugs and issues.
It remains to be seen if a critical mass will find it usable long-term, but if it were to happen, this is how it would look at the start, and Microsoft are certainly doing their best to push people away right now, although I suspect the real winner is more likely to be Apple with the Macbook Neo sucking up more of the lower end.
> Microsoft are certainly doing their best to push people away right now
According to a speculative blog post by Eric S. Raymond in September 2020, Microsoft is literally moving towards replacing Windows' internals with Linux. Unfortunately, that post is now unreachable, but searching for "eric raymond article about windows being replaced with a linux kernel" finds many third-party references to it and summaries of it.
Yeah, because no third party program has ever crashed on any other OS.
Come on, this is an absurd comment. Linux has its issues, this is not a serious example of what is keeping normal people from using Linux as a desktop OS. Normal people are not installing the first release of a privacy networking tool that requires you to OK connections.
The take on flatpaks is such an uninformed one. DMGs on MacOS come with all the dependencies bundled in, which make them essentially just as big as the comparable flatpak (minus the shared runtime that gets installed once)
Seriously, the amount flatpak misinformation that people hold onto is absolutely wild. Ex: I have had to show people it does differential updates because they don't bother to read the output.
Flatpaks are easily the best gui desktop app experience for users we have today.
That's not the user experience though, the user experience is it says "go to the discover app and install <program>" and they do that and it just works. Downloading a tarball is not the normal way to install stuff on Linux, and given everyone has phones where the standard is "install on the app store", it's hardly some new experience, in fact, it's more natural for normal users.
People do this, yeah. Even on Windows I've been over someone's shoulder walking them through something and it drives me nuts they work in a tiny window in a random part of the screen.
They gave 16 year olds the vote, and 16 year olds can leave home, marry, join the army, and so on. Why should they not vote?
They didn't run pointless elections by request of the very councils that were due for them, because those areas are being redrawn and would have to have fresh elections almost immediately, making the results meaningless.
They also gave all the conservative hereditary peers lifetime peerages so they will keep their seats.
Your framing of all three of these is obviously intended to mislead.
> 16 year olds can leave home, marry, join the army, and so on. Why should they not vote?
That's a separate argument.
My point is Labour's change to the rules is very politically convenient for themselves. In the most recent polling, 32% of 16-17-year-olds would vote Labour, while only 17% of the overall electorate would vote Labour.
> They didn't run pointless elections by request of the very councils that were due for them, because those areas are being redrawn and would have to have fresh elections almost immediately, making the results meaningless.
They allowed individual incumbent councillors to choose whether elections were cancelled. This was politically convenient for the Labour and Tory parties because the Reform Party is new, and while it's polling well ahead of Labour, it doesn't have many incumbent council seats.
When a court challenge loomed, Labour quickly u-turned on the latest round of cancellations. Funny how something can seem sensible one day, and can then be u-turned at the slightest whiff of legal scrutiny.
> They also gave all the conservative hereditary peers lifetime peerages so they will keep their seats.
Can you name a single Conservative hereditary peer that will be given a lifetime peerage in Starmer's reform plan?
No, you can do things that benefit you electorally, but are also just the right thing to do. Changing the voting system from FPTP would obviously benefit parties other than the major ones, but that doesn't mean it'd be wrong for those parties to do it if they got into power. So the question is if it's good policy, and so I argue it is, if someone can be living by themselves, working in the army or as a full-time apprentice, married, and having a child, they should be able to vote.
> When a court challenge loomed, Labour quickly u-turned on the latest round of cancellations. Funny how something can seem sensible one day, and can then be u-turned at the slightest whiff of legal scrutiny.
Yes, it's absolutely bad that the government isn't making sure these things are legal before doing them, just as with the Palestine Action proscription. It's also hardly a sign of it being gerrymandering, why would they bother when it's going to give them basically zero advantage, given it would only achieve getting a council that will have no time to actually do anything? The obvious conclusion is they thought it was a waste of money and effort to hold them, but if you have to fight a legal battle over it, it won't actually save any money or effort as that has a large cost, even if it is legal.
> Can you name a single Conservative hereditary peer that will be given a lifetime peerage in Starmer's reform plan?
> The BBC understands ministers have offered the Conservatives the chance to retain 15 hereditary members of the House of Lords as life peers.
So it's not specific names as it hasn't been finalised, but 15 of them. I accept I misremembered when I said "all", but the point stands: not gerrymandering.
> No, you can do things that benefit you electorally, but are also just the right thing to do. Changing the voting system from FPTP would obviously benefit parties other than the major ones, but that doesn't mean it'd be wrong for those parties to do it if they got into power
You're reinforcing my point.
Minor parties (who might collectively be popular with the electorate) will never be able to change the voting methodology to their advantage because FPTP keeps the incumbents in place, and only the incumbents have the power to choose the voting system. So democracy suffers and the incumbents benefit.
Similarly, in this case, allowing children to vote helps the incumbents stay in place despite their party, and their leader being deeply unpopular with the electorate overall. So democracy suffers and the incumbents benefit.
This "logic" doesn't track at all. Enfranchising women may have benefited the party, does that mean we shouldn't have given women the vote and doing so hurt democracy? Of course not.
Just because something benefits a singular party doesn't make it antidemocratic. Expanding the franchise is more democratic, not less. A party being rewarded electorally for doing something good is the system working, not failing.
There are reasonable arguments to be made (in my opinion) that 16 is too young but you aren't making that argument, the one you are making is completely invalid.
Yeah, my setup is purely for my own security reasons and interests, so there's very little downside to my scorched earth approach.
I do, however, think that if there was a more widespread scorched earth approach then the issues like those mentioned in the article would be much less common.
In such a world you can say goodbye to any kind of free Wi-Fi, anonymous proxy etc., since all it would take to burn an IP for a year is to run a port scan from it, so nobody would risk letting you use theirs.
Fortunately, real network admins are smarter than that.
Pretty much. I think there's also a responsibility on the part of the network owner to restrict obviously malicious traffic. Allow anonymous people to connect to your network and then perform port scans? I don't really want any traffic from your network then.
Yes, there are less scorched-earth ways of looking at this, but this works for me.
As always, any of this stuff is heavily context specific. Like you said: network admins need to be smart, need to adapt, need to know their own contexts.
This is how you get really annoying restrictions on public networks, because some harmless traffic will inevitably be miscategorized by an overeager firewall/DPI system.
I’m not saying that there should be zero consequences for allowing bad traffic from your network, but there’s a balance, and I would hate a world in which your policy were more common.
Arguably we are already partially living in that world, as some companies are already blanket-banning entire countries, VPNs etc., rather than coming up with more fine-grained strategies or improving their authentication systems to make brute force login attempts harder. It’s incredibly annoying.
Not all of us have cell plans with hotspots ($$$), hotspots often have data caps, cell is often slower or congested, and there are some areas without cell signal. It's also kind of silly from a wider perspective to shove everyone onto the cellular network when most businesses have perfectly decent fiber internet nowadays.
Sure, I'm usually on hotspot, but I personally appreciate when businesses have wifi. Either way, there are always going to be shared networks somewhere.
What we should actually be doing is WiFi using SIM cards as authentication.
Have it count against your data cap (but make it much cheaper than cellular data). Pay part of that revenue to hotspot-owning businesses. If something bad happens, use the logs that telecoms are already required to keep.
It's very strange to me that we don't have something like this already.
How about we don't? We really don't need to tie even more things to SIM cards and phone numbers.
Criminals have more than enough ways to still get anonymous SIM cards (at least until every country on the planet makes KYC mandatory for prepaid SIMs), and legitimate users are greatly inconvenienced by this.
> Pay part of that revenue to hotspot-owning businesses.
To subsidize a network connection they probably already need for their business operations, e.g. their payment terminal or POS? Why should I? The marginal cost of an incremental byte on wired Internet connections is basically zero, these days. It's literally too cheap to meter, so why bother?
Besides the centralization and tracking concerns, not nearly every device has a SIM card. Why does my Laptop not deserve to access a coffee shop Wi-Fi, my Kindle to use an in-flight conenction, or my smartwatch to use the gym's network for podcasts?
It's very strange to me that people keep trying to willingly ruin the open Internet.
I live in a country that has mandatory SIM registration, and it's stopping exactly zero organized criminals – these can just pay a tiny bit more and buy burner phones and use out-of-country SIM cards – while it's making life more complicated and expensive for the average citizen.
Expensive because KYC isn't cheap, and guess who pays for that in the end... And that is assuming that your form of ID is even accepted as a foreigner. In a different country, I literally just spent two days sending back and forth selfies holding my passport(!) to little success. And I guess the customer support reps could now just use the same photos to impersonate me elsewhere, since passport photos provide absolutely zero domain binding and are just about the dumbest thing still seeing widespread adoption.
I don't often use registration-free public Wi-Fis, but I love that they exist, and I would hate if they'd be taken away too. I also just transited at an airport that requires passport scans for Wi-Fi usage, and it feels so backwards.
Thanks for being honest about this, though. I was always wondering who all these people were that are seriously in favor of all this dystopian stuff. Would love to hear why you think that it's a net positive for society.
> What an incredibly short-sighted, dystopian view.
You do recognize that the person I kept replying to was not asking these questions in earnest, right? They were all carefully directed questions, specifically designed to confirm their world view. I played into it, because I think they're pitiful and hilarious. Serves them right. Their latest question about government criticisms completes the caricature perfectly. All they're missing is referencing or quoting Orwell.
> I live in a country that has mandatory SIM registration, and it's stopping exactly zero organized criminals – these can just pay a tiny bit more and buy burner phones and use out-of-country SIM cards – while it's making life more complicated and expensive for the average citizen.
Pretty much the same here to my understanding. There's no credible evidence I'm aware of that'd suggest the criminal use of phone networks decreased significantly thanks to these. It might have improved on the exhaustion rate of the numbering pool, but I don't think we were particularly close to exhausting it anyways. Most benefit I can think of is a chance at traceability, but how well realized vs abused that is, no idea. Just like with IP leasing described in the article above, enlisting the help SIM mules has a long standing tradition, after all.
Any addressing system that relies on non-cryptographic identifiers will be prone to all kinds of mass misuse. There's no amount of lawmaking, honest or not, that could be implemented to counteract these. It's just like email.
> Thanks for being honest about this, though.
Except I really wasn't, and I find it both remarkably funny but also extremely concerning how on board you guys are with it. Propaganda and culture sure are powerful.
The current ways of identity verification are broken, and are prone to enable surveillance: this is something I fully recognize. What I refuse to recognize however is that the concept of identity verification would be wrong wholesale. There was another thread on here a few days ago that I did comment on, but the bottom line is, in my understanding there's no mathematical reason that things would have to be this way. Its shortcomings, including its enablement of mass surveillance, are an implementation issue, not something fundamental to the idea per se.
Being able to trust that a stranger you're talking to is
- an actual specific person
- is actually a stranger
are bottom of the barrel human expectations that communications technology have completely shattered. Technologically guaranteeing these, to the extent the analog hole problem allows for it, does not require dystopian practices. I'm confident that the lack of these guarantees is the root of many societal problems we see at large today. For better or for worse, a lot of people live a lot of their lives on the internet these days, but the internet is no hospitable place for them, among else for these exact reasons.
Accountability is a good thing. I refuse to let it be monkey paw-d by people who mean unwell into being recognized as a tool for evil, and I think you should too. Trust being abused by a centralized system does not mean trust is wrong. It means there are abusers at the wheel. The solution is not mistrust, or even systems that require less trust necessarily, although both can be useful. The solution is reworking the system to get more trustworthy people into the leading positions, and to make it so that those who have demonstrated to be not deserving are thrown out more readily. It is most unfortunate that this listing is ordered exactly by difficulty, from easiest to hardest. Trust is easily broken, and human systems are impossibly hard to get right. I don't think this justifies giving up though.
My profile is not blank. You can page through all my comments, posts, and favorites to your liking.
Did you actually bother to understand what I said by the way? Are you able to formulate a post that isn't just a bare minimum asinine rhetorical question?
> The current ways of identity verification are broken, and are prone to enable surveillance: this is something I fully recognize. What I refuse to recognize however is that the concept of identity verification would be wrong wholesale. There was another thread on here a few days ago that I did comment on, but the bottom line is, in my understanding there's no mathematical reason that things would have to be this way. Its shortcomings, including its enablement of mass surveillance, are an implementation issue, not something fundamental to the idea per se.
Put into more exact terms, your way of wanting to verify my identity is the same one you criticize governments and businesses for doing. It is not one I think is a good idea either, despite how you're trying to present this. I just retain the opportunity for there being other, better ways, whereas you don't.
Mind you, there's no reason to think that those who do publish such information do it because they're here to champion accountability. Note the type of forum this was originally supposed to be. It's in part a place for self-advertising. Many contact details you find on bios are visibly and explicitly HN specific.
Haha, nice, I run something similar.. But more manualy managed and I put those bans pernametly. Currneltly, there are 1360 blocks in drop list and growing.
I never really remove them, because even those leased blocks move from one spam/abuse operator to another, so no big loss.
And indeed, if people would fight w/ spam/abuse better and more aggresivly, the problem would be much smaller. I dont care anymore, In my opinion Internet is done. Time to start building overlay networks with services for good guys...
If you actually wanted your site or service to be accessible you’d run in to issues immediately since once IP would have cycled between hundreds of homes in a year.
It's crazy to me that you'd trust the output of an LLM for that. It's something where if you do it wrong it could cause major damage, and LLMs are literally famous for creating plausbile-sounding but wrong output.
If you wanted to use an LLM to identify it, sure, you can validate that, and then find the manufacturer instructions and use those. Just following what it says about the cables without any validation it's correct is just wild to me. These are products with instruction manuals made for them specifically designed for this.
> It's crazy to me that you'd trust the output of an LLM for that. It's something where if you do it wrong it could cause major damage,
With critical tasks you need to cross reference multiple AI, start by running 4 deep reports, on Claude, ChatGPT, Gemini and Perplexity, then put all of them into a comparative - critical analysis round. This reduces variance, the models are different, and using different search tools, you can even send them in different directions, one searches blogs, one reddit, etc.
Or you can ask for a link to the manual. I genuinely can't tell if your post is real advice or sarcasm intended to highlight the insanity of trying to fit square pegs in round holes of using LLMs for everything.
It doesn't matter, because any process that seems right most of the time but occasionally is wrong in subtle, hard to spot ways is basically a machine to lull people into not checking, so stuff will always slip through.
It's just like the cars driving themselves but you need to be able to jump in if there is a mistake, humans are not going to react as fast as if they were driving, because they aren't going to be engaged, and no one can stay as engaged as they were when they were doing it themselves.
We need to stop pretending we can tell people they "just" need to check things from LLMs for accuracy, it's a process that inevitably leads to people not checking and things slipping through. Pretending it's the people's fault when essentially everyone using it would eventually end up doing that is stupid and won't solve the core problem.
what's the core problem tho? Because if the core problem is "using ai", then it's an inevitable outcome - ai will be used, and there are always incentive to cut costs maximally.
So realistically, the solution is to punish mistakes. We do this for bridges that collapse, for driver mistakes on roads, etc. The "easy" fix is to make punishment harsher for mistakes - whether it's LLM or not, the pedigree of the mistake is irrelevant.
The core problem is that the tool provides output that looks right and is right a lot of the time, but also slips in incorrect stuff in a hard to notice way.
Punishment isn't a problem because it doesn't work. If you create a system that lulls people into a sense of security, no punishment will stop them because they aren't doing it thinking "it's worth the risk", it's that they don't see the risk. There are so many examples of this, it's weird people still think this actually works.
Furthermore, it becomes a liability-washing tool: companies will tell employees they have to take the time to check things, but then not give them the time required to actually check everything, and then blame employees when they do the only thing they can: let stuff slip.
If you want to use LLMs for this kind of thing, you need to create systems around them that make it hard to make the mistakes. As an example (obviously not a complete solution, just one part): if they cite a source, there should be a mandated automatic check that goes to that source, validates it exists, and that the cited text is actually there, not using LLMs. Exact solutions will vary based on the specific use case.
An example from outside LLMs: we told users they should check the URL bar as a solution to phishing. In theory a user could always make sure they were on the right page and stop attacks. In practice people were always going to slip up. The correct solution was automated tooling that validates the URL (e.g: password managers, passkeys).
> The correct solution was automated tooling that validates the URL
that's because this particular problem has a solution.
The issue here is that there's no such a tool to automatically validate the output of the LLM - at least, not yet, and i don't see the theoretical way to do it either.
And you're making the punishment as being getting fired from the job - which is true, but the company making the mistake also gets punished (or should be, if regulatory capture hasn't happened...). This results in direct losses for the company and shareholders (in the form of a fine, recalls and/or replacements etc).
> The issue here is that there's no such a tool to automatically validate the output of the LLM - at least, not yet, and i don't see the theoretical way to do it either.
Yeah, it's never going to be possible to validate everything automatically, but you may be able to make the tool valuable enough to justify using it if you can make errors easier to spot. In all cases you need to ask if there is actually any gain from using the LLM and checking it, or if doing so well enough actually takes enough time that it loses it's value. My point is that just blaming the user isn't a good solution.
> And you're making the punishment as being getting fired from the job - which is true, but the company making the mistake also gets punished (or should be, if regulatory capture hasn't happened...). This results in direct losses for the company and shareholders (in the form of a fine, recalls and/or replacements etc).
Yes, regulation needs to be strong because companies can accept these things as a cost of doing business and will do so, but people losing their jobs can be life destroying. If companies are going to not give people the time and tools to check this stuff, then the buck should stop with them not the employees that they are forcing to take risks.
The human is responsible. That's the fix. I don't care if you got the results from an LLM or from reading cracks in the sidewalk; you are responsible for what you say, and especially for what you say professionally. I mean, that's almost the definition of a professional.
And if you can't play by those rules, then maybe you aren't a professional, even if you happened to sneak your way into a job where professionalism is expected.
This doesn't solve the problem, because companies will force people to use these tools and demand they work faster, eventually resulting in people slipping.
People will have to choose between being fired for being "too slow", or taking the risk they end up liable. Most people can't afford to just lose their job, and will end up being pressured into taking the risk, then the companies will liability-wash by giving them the responsibility.
You need regulation that ensures companies can't just push the risk onto employees who can be rotated out to take the blame for mistakes.
Right, but companies routinely accept fines as costs of doing business, while losing your job can destroy your life. If a company has not taken appropriate measures to ensure employees can reasonably catch errors at the rate they are required to work, then the company should take all the blame, because they are choosing to push employees to take risks.
To be fair, that's a problem with human authors too. Wikipedia is really well-cited, but it's common to check a citation and find it only says half of what a sentence does, while the rest seemingly has no basis in fact. Judges are supposed to actually read the citations to not only confirm the case exists and says what's being claimed, but often to also compare & contrast the situations to ensure that principle is applicable to the case at hand.
Yup. The issue with LLMs are not that any specific thing it is doing is unique. Rather that it does it in previously unimaginable volume, scale, and accessibility.
Even disregarding self driving features, it seems like the smarter we make cars the dumber the drivers are. DRLs are great, until they allow you to drive around all night long with no tail lights and dim front lighting because you’re not paying enough attention to what’s actually turned on.
I think game engine tooling tends to encourage bad code too, lots of game engine make it hard to do everything in code, rather things are special cased through UIs or magic in the engine, which means you often can't use all the normal language features, and have to do things in awkward ways to fit the tooling.
Of course, this varies a lot by engine.
reply