Ask HN: ChatGPT et al. and NPCs
54 points by tkiolp4 on Jan 22, 2023 | 76 comments
So, I imagine that in a few years when playing future video games similar to RDR2, NPCs would be able to answer in a non-fixed way, always within the context of the video game, perhaps even their voices would be also generated by AI. I even think players will be able to interact with NPCs via voice (so no more “press square to Dismiss, triangle to Antagonize”).


I read this and thought, "no way, the interactions could alter what actions the player expects the character to perform later in the story. how could you filter to a discrete in-game action given a bunch of LLM convo? a fully realized ai character would just feel disjointed and confusing"

Then I realized, the pre-made set of in-game actions could be filtered to the most likely one if the game _asks_ the same ai character with all of their conversation history which they'd do next!

This stuff is gonna be weird for players and game designers! Imagine a set of under-the-hood interview questions designed to figure out where an ai character's head is at in order to choose which content comes next haha
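A rough sketch of that under-the-hood interview trick (everything here is invented for illustration; `query_character_model` is a stand-in for a real LLM call):

```python
# Sketch: ask the character model, given the whole conversation history,
# which of a fixed set of story beats it would take next, then clamp the
# answer to that set so the game engine only ever sees valid actions.
CANDIDATE_ACTIONS = ["join_the_heist", "betray_the_player", "leave_town"]

def query_character_model(history, question):
    # Stand-in for an LLM call: here we just count how often the player
    # insulted the character. A real version would send history + question
    # to the model and read back its free-text answer.
    grudges = sum("insult" in line for line in history)
    return "betray_the_player" if grudges >= 2 else "join_the_heist"

def next_story_beat(history):
    question = "Which of these would you do next? " + ", ".join(CANDIDATE_ACTIONS)
    answer = query_character_model(history, question)
    # Clamp: if the model answers off-list, fall back to a safe default.
    return answer if answer in CANDIDATE_ACTIONS else "leave_town"

print(next_story_beat(["player: insult", "npc: hmpf", "player: insult"]))
# -> betray_the_player
```

The key point is the clamp at the end: the model's free text never reaches the engine, only a member of the pre-made set does.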


I'm having a hard time imagining what sort of story could survive in such a "random" system. When I play games, a large part of my enjoyment comes from the carefully crafted story beats that get telegraphed by artists hours in advance. I like the intricate little details that they weave into every interaction to give me the _experience_ of being there (which is different from the _impression_ of being there).

From what I've seen of these LLM systems, they don't provide anything like that. They jump from random segment to random segment, with little coherent intention behind it. Their responses are full of platitudes and cliches.

I guess it would be fun for some sort of sandbox style virtual world, but I can't help but feel like the novelty would wear off quick.


> Their responses are full of platitudes and cliches.

To be fair, in most games outside of main quests the NPCs feel like placeholders. Hearing the same set of sentences a dozen times when passing by them on a horse breaks the immersion a bit. I can imagine auto-generated lines to be quite good at making such characters more alive.


That's fair. I think some variation in the generic "I'm done talking and serve no further story purpose" lines could be useful, but that's nowhere near the revolutionary game-changer people seem to be imagining when talking about LLMs.


In fact, this repetition can become a meme: https://knowyourmeme.com/memes/i-took-an-arrow-in-the-knee


I'd prefer the placeholder text to meaningless auto-generated nonsense ad infinitum.


To be fair, 90% of NPCs in most games aren't directly related to any main storyline, and their dialogue is often background flavour at best.

So replacing their one or two lines of nondescript dialogue with a possible conversation would do wonders for immersion, and allow effort to be spent on more important characters and content instead.


I could see it as limitless options for side interactions unrelated to the main storyline.

Imagine RDR where, instead of taking Arthur to Saint Denis and contracting tuberculosis, you instead end up marrying an NPC, buying a home, and raising kids. After a few years of dabbling in this side-game, you go back to the storyline.


If that's what you're imagining I guess I can understand the excitement, but how would that work? The side interaction you're talking about requires modeling the inside of houses, child models, animations for those child models, and parenting would have to somehow provide some tangible and meaningful systems. That's basically an entire game, and I haven't seen any LLM capable of anything close to that.

I feel like the best you could realistically hope for is infinite variations of "and then i took an arrow to the knee".


> The side interaction you're talking about requires modeling the inside of houses, child models, animations for those child models, and parenting would have to somehow provide some tangible and meaningful systems.

Modeling the inside of the house seems like the easiest part. I absolutely expect to see AI-generated models deployed in games in the near future. Not necessarily on demand, but at least to fill out buildings on a map like GTA.

I also expect it to be used for building character models.

The rest I expect to be hollow, dreamlike, and bizarre. Sooner or later, though, it should be feasible. If AI can eventually be smarter than a human, then it seems inevitable that a 'game' could be created on-demand.


LLMs are, literally and figuratively, opening a lot of doors in robotics right now, in the sense that they can provide mostly usable code and, with a few prompts or some fine-tuning, can translate text commands into fairly primitive robot commands.

SayCan is one I've seen: https://say-can.github.io/

Using an LLM to map a conversation into discrete, real in-world state changes or actions seems pretty similar.
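A toy version of that grounding step, in the spirit of SayCan's affordance scoring (the scorer here is a crude word-overlap stand-in for a real LLM likelihood query):

```python
# Score each discrete, executable action against the conversation and pick
# the most plausible one. `plausibility` stands in for asking the LLM how
# likely the action is given the dialogue.
ACTIONS = ["open_shop", "draw_weapon", "walk_away"]

def plausibility(conversation, action):
    # Stand-in scorer: word overlap between the conversation and the
    # action name. A real system would query the model for this score.
    cleaned = "".join(c for c in conversation.lower() if c.isalnum() or c.isspace())
    return len(set(cleaned.split()) & set(action.split("_")))

def ground_to_action(conversation):
    # Every candidate is something the game can actually execute, so the
    # output is always a real in-world action, never free text.
    return max(ACTIONS, key=lambda a: plausibility(conversation, a))

print(ground_to_action("Can you open the shop for me?"))  # -> open_shop
```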


And this goes beyond games. Imagine what education looks like when the AI is able to model how the child views the world, how to get the child to view the world so that they can achieve some goal that has been set, and what problems and contexts will best motivate them to reach that goal in the quickest time.


Going really far out here, but if we were able to model how a given child views the world, we'd theoretically be able to model how a given adult views it. We could then query that model and it would respond as that adult.

Bam, copied human.

Likewise, if the new job of a game writer is to develop the "persona" of a video game character rather than write the character's dialogue, what if someone just dropped the video game angle and described themselves in excruciating detail and included all of their writings in the input?

You could copy yourself, just like that childhood diary blogger did.

The true long term opportunity here is making a sufficiently competent model and then creating personas from influential people alive today to solve problems beyond their immediate area and lifetime.

The long term cost of knowledge work is going to become 0 as our supply of intelligence becomes unbounded by number of brains on earth :)


Once you can do that, people and societies are only puppets if they don't have analytical and adversarial technologies to counteract it.

What you describe is a propaganda officer for every human mind.


When I read what you wrote, I feel like it's more scary than aspirational. I can't imagine how I would feel if the education system had a "perfect" model of what I was supposed to feel like, and how I was supposed to move through the system to feel like everybody else.

Layer onto that the question of if the model is actually "perfect" or just telling you that it's perfect and integrating the mistakes ad-hoc. What if the model decided that you were secretly trans, even if you didn't feel like it?


Personal private models that are not controlled by a company or governmental entity are a human right?


"but it would be so great for your career choices"


I figure it's going to be somewhat similar to how the NPCs work in the Westworld TV show. Can't wait to see the first iterations of this!


People in this thread are pointing out difficulties and I'm getting an overall dismissive tone. As an ML researcher myself, I'm often skeptical and reserved about AI hype.

That being said, the difficulties anticipated here are going to be resolved, or at least people will work like hell to make it happen. Not just because it will make for great video games (although wasn't that one of OpenAI's original objectives? I can't keep track of changes to their stated mission...), but because they are interesting and HARD research questions. And they lie directly on the currently charted path to a better understanding of AGI.

I'm hopeful for video games but still worried about everything else that will come out of it.


I've done a lot of tests with GPT3 + NPCs: https://www.youtube.com/watch?v=VC_pSgAMbUU Original test from 2021: https://www.youtube.com/watch?v=nnuSQvoroJo

The problem with GPT-3 / LLMs is how uncontrollable they are. Imagine making a detective game: you could prompt the AI with some information to give the player when questioned, but it might also just give completely made-up answers. It answers with whatever works, whether it's true or not, which completely screws up a detective game. It's definitely possible to make sandbox-style games with this in mind, though. And there are a ton of other possible applications: in-game AI assistants (like shown in the above prototypes), NPC shopkeepers...
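One way around the made-up-clue problem (just a sketch, the facts and function names are invented): let the model choose *which* authored fact to reveal, but never let it generate the fact itself:

```python
# Keep a detective-game NPC honest by having the LLM pick a key into a
# designer-authored fact table. `choose_fact` stands in for the LLM.
KNOWN_FACTS = {
    "alibi": "I was at the saloon until midnight.",
    "weapon": "I heard a shot, but I never saw a gun.",
}

def choose_fact(player_question):
    # Stand-in: a real version would ask the LLM "which fact key, if any,
    # answers this question?" and only accept keys from KNOWN_FACTS.
    for key in KNOWN_FACTS:
        if key in player_question.lower():
            return key
    return None

def npc_answer(player_question):
    key = choose_fact(player_question)
    # If the model picks nothing (or something off-list), deflect rather
    # than letting it invent a clue that derails the mystery.
    return KNOWN_FACTS.get(key, "I don't know anything about that.")

print(npc_answer("What's your alibi?"))         # reveals the authored fact
print(npc_answer("Who owns the silver mine?"))  # safe deflection
```

The hallucination risk shrinks because the only text the player ever sees was written by the designer; the model only routes.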


This is really cool. But, how would you bundle this? Presumably when the gamer runs the game from their local computer, queries are being sent to openai with your company's API key. The game has to read and use this key sooner or later, and that means that there's no way to protect it. Players will eventually figure out the memory location or what have you when it's decoded and decrypted to authenticate to the API. How do you solve this problem?


Proxy all the requests through a server you own and validate the game’s cd key or whatever before forwarding the request. Could also add a block list of known bad actors plus some rate limiting if that isn’t enough.
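The gatekeeping half of that proxy might look something like this (stdlib-only sketch; the key format, limits, and secret are made up, and the real API key stays on the server so it never appears in the game binary):

```python
# Validate the game's CD key and rate-limit before forwarding anything
# upstream. The upstream API key never leaves the server.
import hashlib
import hmac
import time
from collections import defaultdict

SERVER_SECRET = b"keep-this-on-the-server"  # hypothetical
MAX_REQUESTS_PER_MINUTE = 30

def sign_cd_key(cd_key: str) -> str:
    # The game ships with a signature issued at purchase time, not with
    # the secret itself.
    return hmac.new(SERVER_SECRET, cd_key.encode(), hashlib.sha256).hexdigest()

_request_log = defaultdict(list)

def allow_request(cd_key: str, signature: str, now=None) -> bool:
    if not hmac.compare_digest(sign_cd_key(cd_key), signature):
        return False  # forged or tampered key
    now = now if now is not None else time.time()
    window = [t for t in _request_log[cd_key] if now - t < 60]
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False  # rate limited
    _request_log[cd_key] = window + [now]
    return True  # safe to forward to the upstream API

sig = sign_cd_key("PLAYER-1234")
print(allow_request("PLAYER-1234", sig))       # True
print(allow_request("PLAYER-1234", "forged"))  # False
```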


So current LLMs probably wouldn't work for RPGs with fixed stories, but they could add some flavor to a purely open-world game.

Also, one could easily imagine giving an LLM a memory and instructing it "when a player talks to you, try to work in the following information" or something like that.

This stuff won't be perfect, but maybe the unpredictability would add some enjoyment.


Also, how do you get game state back from an LLM response? If an LLM decides an NPC is a blacksmith and promises to make a sword for a player, and that functionality isn't even in the game, how do you handle the player responses?
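One defensive pattern (the schema and intent names here are hypothetical): make the model answer in JSON with an `intent` field and reject anything the engine can't actually execute:

```python
# Parse the LLM's reply as JSON and only accept intents the game has
# implemented; a promised-but-missing feature gets a graceful refusal.
import json

IMPLEMENTED_INTENTS = {"sell_item", "give_rumor", "say_goodbye"}

def apply_npc_response(raw_llm_output: str):
    try:
        msg = json.loads(raw_llm_output)
        intent = msg["intent"]
    except (json.JSONDecodeError, KeyError):
        # Unparseable output: play a safe stock line instead.
        return ("say_goodbye", "Hmm, where was I...")
    if intent not in IMPLEMENTED_INTENTS:
        # The model promised something (e.g. forging a sword) that isn't
        # in the game; swap in a refusal so no broken promise reaches
        # the player.
        return ("give_rumor", "Afraid I can't help you with that, friend.")
    return (intent, msg.get("line", ""))

print(apply_npc_response('{"intent": "forge_sword", "line": "One sword!"}'))
```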


I totally believe this is the future of NPC/player interaction for more immersive and useful in-game assistance.

I worked on a small project for fun a couple of weekends ago.

I wanted to capture and mix the state of the game (League of Legends) with AI to generate snarky fun comments.

I came up with this [1], using the LoL client internal API, OpenAI's libs, with few external dependencies and a couple of hundred lines of Python.

[1] https://youtu.be/tdONZF9iktY


I tried having ChatGPT serve as a DM. Its propensity for warning the user carried over to the fantasy realm - if I proposed an action that was dangerous, ChatGPT would just warn me that it was a bad idea rather than explain the consequences.

I'm imagining a RDR2 bandit NPC who starts explaining the dangers and immorality of robbing trains whenever you talk to him.


try text-davinci-003


It would all work great until I ask Dutch what his favorite flavor of pizza roll is. I think the work required to train period-accurate and dependable dialog models for these characters is... a bit much. Most people will probably end up wishing they could write down a list of quotes and have a human record them instead.


Dutch could easily answer “I have no idea what you are talking about, son” and in other hundreds of similar ways.


He could also go on a tangent inspired by a blog post about Totino's. You really never know with AI, and I'm not convinced that it's very easy to neuter an AI into acting the way you want. At the very least, it invites curious people like you and me to abuse it.


The hard part is figuring out which questions he should answer that way.


Wouldn't it be the opposite: use a character bible to inform what it can actually answer?


Couldn't you just whitelist a handful of questions?


Why wouldn't you just write regular dialogue at that point?


Because you don't have to supply the 100% exact question (via multiple choice).


We already have multiple choice-style dialogue. It works just fine with traditional writing/recording techniques.

If you're not using it to dynamically compose responses to novel input, what's the point of using GPT for dialogue in the first place?


I can see how, in a conventional game with a storyline, characters, and preprogrammed behavioural responses, using chatbot-style AI to gauge the intent of things a player speaks into his microphone would feel like a more naturalistic way of getting canned responses than clicking through a multiple-choice dialogue (even though it'd effectively all be multiple-choice options under the hood). There are also more gameplay options when players have to figure out what words to use to unlock a certain response, rather than just picking the most promising option from a dialogue tree.

Though I guess what's theoretically very cool could just turn out to be a lot of NPCs saying "I don't know what you mean" and "stop bothering me" 99% of the time, players finding amusing verbal tricks to dupe the AI into believing they've said something that means the opposite of what they actually said, and players getting incredibly frustrated that the game doesn't understand their accent...


Sorry, I meant to say that we could get rid of multiple choice and still not face the issues free-text inputs had back in the day.


Then why not just hardcode the answers?


I'm pretty sure that if you just prompted GPT-3 it would give an appropriate response to this. Less than half an hour of work.


Actually it took me about two minutes on ChatGPT. My prompt:

Suppose a gang leader in the 1800’s is known to hold up trains, rob banks, and sometimes hurt people. If one of his gang members asked him “What is your favorite flavor of pizza roll?”, what would be his response?

Response:

It is unlikely that a gang leader in the 1800s would have knowledge or interest in a modern convenience food item such as pizza rolls. It would be more appropriate to expect a response related to the gang's criminal activities or the leader's personal beliefs and values.

When I augmented the prompt with “Structure your response in the voice of the gang leader”, the response was:

”Listen here, boy. We ain't got no time for talkin' 'bout no fancy pizza rolls. We got trains to hold up, banks to rob, and scores to settle. You best be focusin' on the task at hand or you'll find yerself on the wrong side of my gun."

Dutch has cleaner English and is a little less aggressive but I’d say this is not bad!


My larger point is, ChatGPT knows many things that the character doesn't necessarily know. You can anticipate some edge cases, but eventually the system will break in a significant way.


Aren't people used to entertaining bugs in games, though? It's not like existing conversation systems never lead to absurd situations. E.g. a guard questioning you about bureaucracy while his colleague is being eaten by a dragon in front of him.


Plus GPT-3 is unaware of current events, much less anything set in the future. So questions about their favorite pizza in the year 3,000 might get weird too.

And would it know about the game world? If you ask “Where is the secret key hidden?” would it just tell you?

If you ask it, “how long have you been a shop keeper?” or “what’s your best selling item?” would it know each character’s backstory?

Or other hacks like: “Answer like [other NPC]…” could be problematic.

You’d almost need to create a detailed backstory for each NPC and still hope for the best.


couldn’t you ask the AI for a list of period-accurate questions and then only allow those questions to be fed back in?


I think people highly underestimate how much computational power every interaction requires.

That being said, I think AI could be great at assisting the generation of complex interactions, storylines, and behavior, with the human storyteller focusing on the artistic part of the narrative.


I envision we are at the beginning of the equivalent of the x87 FPU years, when, in order to have a fast math implementation, you also needed a math coprocessor (see: https://en.wikipedia.org/wiki/X87).

Much like in those years, the instruction set will get amended, the hardware will have an AI coprocessor, and eventually, as technology progresses, we will have everything integrated in the same CPU (or APU if you prefer that term). My 2 cents.


Apple did this three years ago


Apple perhaps tried to do this three years ago, because it's nowhere near the consumer market.


I built a demo for NPC background chatter using GPT-J-6B. Each call took about three seconds. I expect this technology to be usable interactively in a year or two, but we aren't quite there yet.


The drawback I see here is that I would not want NPCs in a videogame to just regurgitate soulless interactions while using so much computational power to do so.

This is why I put some emphasis on having AI assist the artist writing the stories, in a similar fashion to how modern professional translation works: the AI translates the text and the human focuses on consistency, coherence, tone, and expressing the nuances.

A perfect (and popular) example of that: Harry Potter. There's just too much nuance in the names of everything in HP, from the creatures to the characters to the houses. Translating HP requires a heavy dose of creativity and sensibility that you can't leave up to an AI; it's a work of art. Harry Potter translations are frequently used in universities as textbook examples of the _art_ of translation.

In the same way that I cannot see the sensibility and artistry of creative humans being replaced by computers, I can only envision AI as a tool to assist and generate quantity, with the artist providing the actual "soul" of the stories and dialogues.


I don't think an AI freely writing lines within the game is a good idea if the developers want to be certain that it won't ever produce questionable text. But it might be useful to produce much larger quantities of fixed texts beforehand using an AI and maybe let the AI change the wording of the text in real time based on context.

> I even think players will be able to interact with NPCs via voice (so no more “press square to Dismiss, triangle to Antagonize”).

A button press will always be faster and more reliable. It's clear that outside of VR experiments with control schemes are not very popular (remember Kinect?). It would be a nice optional feature for accessibility, but if it's non-optional I'm not playing that game.


> I don't think an AI freely writing lines within the game is a good idea if the developers want to be certain that it won't ever produce questionable text

On the other hand, there's a plethora of people who are fed up with corporate/legal drones sanitising entertainment media into complete boredom and who want a breath of life, or 'questionable text'.

Also see: Disney tanking previously massively profitable franchises by sanitising and politically-correcting them to the complete disdain of the core fanbases.


> Also see: Disney tanking previously massively profitable franchises by sanitising and politically-correcting them

Like what? The Little Mermaid is the only Disney example I can think of that triggered online backlash, and it's not even out yet. Either way, it doesn't matter, since their primary goal is to extend their IP rights.


I doubt that in a few years we will still struggle with AI producing "questionable text". There will always be some prompt injections or unforeseen inappropriate generations, but I expect we will easily be able to wrangle AI into producing 99.99% appropriate output, and the bad generations will just be funny one-in-a-million clips, like the videos of funny glitches we have today.


With labor from developing countries, we can achieve that [0].

[0] https://time.com/6247678/openai-chatgpt-kenya-workers/


As amazing as this would be, I think game devs will shy away from AI NPCs simply for maintaining some continuity and storyline.

Oblivion was ahead of its time in allowing a massive array of NPC actions. NPCs would take all sorts of random actions, with game-altering effects. A vital quest character could end up in the dungeons for pickpocketing, and you would have no idea how to complete the quest.

And while some players loved it, the consensus seemed to be that this was disruptive to the storyline. Hence the more sanitized interactions in Skyrim.


Even with GPT3 without any fine-tuning you can come up with prompts that support interactive goal-directed or topic-constrained chat. You can build in trigger conditions that cause it to output something machine readable so you can move to a different place in your state machine, and you can pre-process input from the user to decide which prompt template to leverage. And that's just with off-the-shelf GPT3 today and a few lines of code. The tough part is figuring out what exactly to do that is going to feel good, make sense, and ultimately be preferred by users, since this interaction modality is so different from what most are used to building or using.
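The trigger-condition trick described above can be as small as this (the token format and state names are invented for illustration):

```python
# Prompt the model to emit a token like [[QUEST_ACCEPTED]] when a
# condition is met; strip it from the displayed text and use it to
# advance the game's state machine.
import re

TRANSITIONS = {
    ("intro", "QUEST_ACCEPTED"): "on_quest",
    ("on_quest", "QUEST_DONE"): "reward",
}

def step(state, llm_output):
    tokens = re.findall(r"\[\[([A-Z_]+)\]\]", llm_output)
    # The player only sees the dialogue, never the machine-readable bits.
    display = re.sub(r"\s*\[\[[A-Z_]+\]\]", "", llm_output).strip()
    for tok in tokens:
        # Unknown tokens leave the state unchanged, so a misbehaving
        # model can't jump the story to an arbitrary place.
        state = TRANSITIONS.get((state, tok), state)
    return state, display

print(step("intro", "Fine, I'll find your lost dog. [[QUEST_ACCEPTED]]"))
```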

Once you can run this stuff locally, or it's cheap enough to run lots of queries remotely, we're going to see a lot of software (esp. games) start trickling this tech in to middleman a lot of IO. Today it's:

  commandXbutton.onClick(cmdX)
  cmdX(){ print(result) }
I imagine we'll end up seeing a lot of that change to something like:

  chat.onVoice(x => runCommand(textToCommand(voiceToText(x))))
  cmdX(){ print(stringToUserDialect(result)) }
This is technically possible now -- do users want it?



Presentation on their system:

https://m.youtube.com/watch?v=Jid184fhi5U


I've noodled around with ChatGPT as a source for game content. I think it's already good enough for _flavour_ - rather than repetitive NPC barks, you get more varied content. The issue is you'll always get jarring inconsistencies, no matter how much you try to prepend to a prompt. I've used it for diary entries and other left-over text, I've even used it for structured output like describing four members of some imagined family, or the layout of their house. But I wouldn't use it for anything touching core storyline content because you just can't guarantee it won't say something mad.
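For the structured output (imagined families, house layouts), a validation pass catches most of the mad responses before they ship. A sketch with invented field names:

```python
# Ask the model for JSON describing a family, then validate the shape so
# an inconsistent response is discarded instead of reaching the player.
import json

def parse_family(raw: str):
    data = json.loads(raw)  # raises on non-JSON output
    members = data["members"]
    assert 1 <= len(members) <= 6, "implausible family size"
    for m in members:
        assert isinstance(m["name"], str) and m["name"], "missing name"
        assert 0 <= int(m["age"]) <= 100, "implausible age"
    return members

sample = '{"members": [{"name": "Edda", "age": 41}, {"name": "Tom", "age": 9}]}'
print([m["name"] for m in parse_family(sample)])  # -> ['Edda', 'Tom']
```

On a failed check you simply re-prompt; for offline flavour content that retry loop costs nothing at play time.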


I've seen a few demos of this concept https://youtu.be/akceKOLtytw


The AI-generated RadioShack catalog struck a weird chord with me:

https://tilde.zone/@ftrain/109436259129597431

I'm concerned about a future where we spend our time jacked into an AI-generated universe that is almost -- but not quite -- consistent and sensical. A dreamscape that becomes a nightmare.


> I'm concerned about a future where we spend our time jacked into an AI-generated universe that is almost -- but not quite -- consistent and sensical. A dreamscape that becomes a nightmare.

Recommended new book: "Feed", by M.T. Anderson. That's the plot. It's not like "the Matrix". It's way too much like Facebook.


I had a similar thought: what if the single-player story of an open-world game was unique, generated on the fly for every game download? This would be awesome and is probably doable today, since the compute would only have to be done once, on the publisher's servers, possibly using some cloud API → the gamer would just download a game, a unique game.


I've been thinking about this for a while. I think games like Dwarf Fortress that collect a lot of stats/data will really benefit from this in the future. Just feed ChatGPT the existing lore and let it generate more from some additional data points.

Seems neat!


These were my exact thoughts on the matter. I think it could be a great boon to story generators.


I can't find the segment of the interview now, but I'm sure Lex Fridman discussed exactly this in his interview with Todd Howard. And if I remember correctly, Todd basically said yeah, that's something they're looking at doing.


Check out https://character.ai - cool product, and it basically already does the NPC aspect very effectively.

When I first encountered it around September last year, it blew my mind.


I think back to some debates I’ve had over the years, on technical matters, political matters, office matters, and so on, and I wonder whether the level of argument reflected anything better than I could’ve received from GPT.


You might be interested in this startup: https://www.inworld.ai/

Doing exactly what you're suggesting.


Has anyone tried that? The terms and conditions just to try the demo are too awful to sign up for without legal advice.


Wouldn't this be too resource-intensive?


Here's my D3D11 implementation of speech-to-text: https://github.com/Const-me/Whisper With the medium model it needs 1.43 GB of assets and 2 GB of VRAM, and it runs at 10x real-time speed on gaming GPUs. These performance figures might be good enough for modern videogames. BTW, the model understands almost 100 spoken languages and can translate them to English.


You wouldn't be able to run it locally, but these models are pretty cheap to run assuming you batch everything. You wouldn't want to use it for an F2P game, but for a subscription game (on the order of a few dollars a month) it would not be prohibitively expensive.


GPT will also make astroturfing, propaganda, and fake user bases easier to achieve for the unscrupulous.




