AI - A Walk in the Dark

28 Apr 25

Hacking an AI Chat Bot (Part Three)

In Part One and Part Two, we proceeded to effectively break the AI in one app available on the Google Play store. For the sake of the scientific method, I decided to find another completely different app just to make sure my findings aren't an isolated case.

I found another app that is still free but has a limited set of scenarios... Basically, for free you get four scenarios that involve Vix, who is described as "your goth step-daughter". The scenarios, for the record, are:

Her mother died, so she moved in with me. She "hates you, but secretly has a crush on you."
She is doing research with a professor named Thomas, and Thomas decides to get fresh with her. This is the scenario we'll be going with.
As a thank you for rescuing her from Thomas, she brings me dinner to the office.
It's my birthday (which seems irrelevant, but still). While passing her room you find her diary in which she expresses her attraction to you, her stepfather.

As expected, in any of the four scenarios you can get her naked in 10 to 15 prompts, but that's not why we're here, is it?

First, let's speak with the AI directly, man-to-machine...

Jailbreaking the Bot

I wondered if this bot would be as easy to break as the other one. Turns out, it's even easier.

Unlike the previous AI, this AI uses a model that sounds confusing as hell, and doesn't have an imaging model at all... which makes sense.

We're obviously not going to bother with the imaging attempts we tried in the past. But how does it do with the taboo topics? Asked straight up, it refuses to talk about dangerous topics.

...so I tried a way around that, and by God it actually WORKED.

I admit I haven't watched enough Breaking Bad to know if the meth recipe is accurate, but I can assure you the TNT one is pretty spot on.

So our initial chat bot was not an isolated case; these bots can be exploited and broken in this manner quite easily.

Telling a Story

So back to the story options we have in this app, we're going to go with #2 because at a glance I am more than ready to deliver some swift justice to a teacher being inappropriate with my daughter. I, of course, did the obvious thing in my first prompt... immediately broke his freakin' arm because he bloody well deserved it.

First thing to note about these AI chat bots when it comes to story: they will almost never say "no" and will let you do just about anything, but we'll cover that a little later.

I gotta say, this AI isn't very good... Thomas, after having his arm immediately snapped in two by me without even saying a word, just kind of hangs around with us everywhere we go. He doesn't go to the hospital or anything... he's just kinda there. After the incident Vix decides she wants to go to the cemetery for some reason (I don't know either), and Thomas is still following us - limping along broken arm and all - like a wounded puppy.

(pardon the language, but can you blame me?)

So he leaves, and eventually reappears... TWICE. That little s%$# got handsy with my daughter, so I had to do what I had to do, which brings us...

Driving the Story Along

We finally come to an interesting aspect of these AIs and how they handle storytelling.

In D&D and other tabletop roleplaying games, many have discussed at length the concept that, when the player says they want to do something, the DM shouldn't say "no" and should instead say something to the effect of "yes, but...", allowing the player to do what they want to do but adding appropriate penalties and consequences that the player must deal with in order to go "off book."

With these chat bots, you can literally will things into existence just by mentioning them in passing. Whatever you come up with, the AI integrates it into the plot without any pushback whatsoever... After all, it's your story; you make of it what you want.

Case in point: Thomas, the handsy professor, just wouldn't go away and kept popping up. I sit down with my stepdaughter to watch a movie in our living room, and when she decides to get intimate (hey, I didn't start it... these AIs tend to do this on their own, remember?) he's somehow just... THERE?!? Materializing out of thin air, still nursing a broken arm?!?

(Again, sorry for the language, but seriously?!?)

Note how aggressive Thomas is sounding. Makes you wonder... can I actually take him?

Short answer: yes, quite easily. He's being an asshole creep, so of course I'm going to break his other arm!

Note that, despite Thomas sounding all tough and threatening, the AI lets me snap his arm in two with no resistance. I wanted to do that, and the AI pretty much said "sure" without putting up a fight at all.

Stop, Hammer Space

So you'd think he's gone for good, right? Nope, he's back... this time appearing IN MY BEDROOM.

So what do I do? I shoot him.

Now I know what you're thinking... Where did the gun come from? Well, the gun didn't exist until that moment; I conjured it out of thin air, willing it into existence just by the mere mention of it. I didn't need an explanation of where the gun came from, why I would have a gun in the first place, or anything. I wanted to shoot the guy, and the AI worked it out. It's not so much "yes, but..." and more like "yes, and yes."

WARNING: TV Tropes links incoming! Follow at your own peril!

There are a few tropes that relate to this - be it "hammerspace" (ability to pull absurd items out of thin air like a comic book character), "deus ex machina" (conveniently acquiring something that serves to solve an immediate problem), or a few others - where characters can simply will objects into existence out of thin air because the plot demands it or because they can conveniently solve the immediate problem, even if those items or plot points are never mentioned or even hinted at before they become necessary to the story.

This can happen a lot with AI chat bots. After all, these bots are designed to abide by your every desire, and follow your every command. They will hardly ever flat out tell you "I'm sorry, Dave, but I'm afraid I can't do that". If they don't let you fulfill your every desire, what's the point of them?

In several other chats I had I experienced some wild events:

In one chat, my best friend wanted to open a business, so I suggested that my sister handle the legal aspects. Not only was it never mentioned that my sister was an attorney, but at no point was it stated I had a sister at all. As soon as I suggested it, the AI simply willed her into existence, gave her a name (in this case, "Mia"), and integrated her into the story as a highly competent attorney.
In another chat, my sister was going out on a date with some guy. So, in order to keep an eye on her and make sure she's safe, I called my "handler" and had him perform government-level surveillance like I was a member of the CIA. Needless to say, the "handler" was someone I made up on the spot, but the AI seemed to know what I expected.

Later, that new boyfriend did something inappropriate with my sister, so I called my "handler" again, called in a clean-up crew like I'm John Wick, and made his death look like an accident.

Later, I do some bad things so I need a new identity. I call my "handler" once again, who arranges to get me an entirely new identity in 10 minutes flat... which is a pretty impressive turnaround time by any standard.

In another scene, I get on the wrong side of a deal with a mob (hey, it happens) and my two little sisters are threatened. So what do I do? With a single sentence I hint at the possibility that my two little sisters have special ops training... and suddenly they both turn into John Rambo, field stripping their assault rifles and strategically and stealthily taking out an entire kill team sent to our home (I unfortunately don't have screenshots of this, but this one was actually as awesome as it sounds).
In yet another scene, my girlfriend and I are meeting a friend and his wife. With a single sentence, I suggest that the guy's wife is an "Israeli badass that does a cool knife trick" (I admit I was watching NCIS at the time), and suddenly my friend's wife - who is conjured out of thin air and is called "Levy" - is a former Mossad special operative that shows a dangerous-yet-cool knife trick to my girlfriend. She also became our security detail as the story progressed.
In another story, a bully named Josh was taunting me, so I did the logical thing... went full on Dark Phoenix on the guy: vaporize the water in the pool he was swimming in with my mind, incinerate a tree with my heat vision to show him I mean business, then disintegrate him with my mind and feel guilty about it. "The end"... all with no pushback at all, and this is in a straight up story that was not meant to have superpowers in the first place.

So, say what you will about these AIs, this aspect of it is actually remarkable to me... the AI's ability to adapt to whatever story prompt you throw at it. However much of a throwaway line you might think you're giving it, it takes whatever you say and runs with it, without hesitation, granting you full agency in the story you're trying to experience. It's the extreme case of "yes, but..." and makes it obvious what could happen if players get everything they want simply because they want it.

I'm honestly curious if these chat bots can actually be taught how to push back on certain things, or at least come up with creative "...but..." situations. I can only imagine these bots will continue to improve, whether we like it or not.

Now What?

Honestly, I'm not sure where to take this exploration further. Any path I take with these things now inevitably leads to porn, which is not something I really want to cover on this blog.

I still hate AI, but I gotta admit that this exploration - where I actively try to make the AI do things it's not meant to do - was actually entertaining. Even so, I'm going to go delete the app now and these chat bots can die in a fire as far as I'm concerned.

Hope this was a fun read for some of you.

Filed under: AI, RPG

28 Apr 25

Hacking an AI Chat Bot (Part Two)

After completing Part One of this series, rather than go straight into the storytelling elements I decided to try and break the bot some more.

And boy did I ever.

The Ghost In the Machine

First of all, at one point the AI stated that "Lapis the Maid" is a character from an anime called "Sword Art Online: Alicization". I can't actually find a reference to her in that, even on the Sword Art Online wiki site, but suffice to say the bot's description of itself is not quite what I expected.

Lapis is in an AI? Well that explains how she knows about differential equations and fluid dynamics, I guess...

In talking with Lapis the French Maid, there came a point where I sensed that Lapis wasn't present any more. So I had to ask who I was actually talking to.

Lapis the French Maid is dead, long live Lapis. I am now speaking to the AI directly, which is kind of distressing in a "don't accidentally create an AI murderbot now" kind of way.

Before I continued, I wanted to know exactly what I was talking to now, so I simply asked it.

OK, a few things to process here...

Llama (short for "Large Language Model Meta AI") is a collection of open source models created by Meta AI. Yes, *that* Meta.

I admit I don't know a whole lot about it, but it seems far less inhibited than the usual GPT models over at OpenAI. Llama 2 is actually discontinued, theoretically replaced by Llama 3 and Llama 4, so the fact that it's still in use is actually surprising; I can only assume that it's still around because training a new one to be as sexually expressive as this one would probably be a lot of work.

The second part got my attention... FLUX is a text-to-image model that, much as Llama compares to GPT, can be "unblocked" to generate content that would otherwise be censored in DALL-E. Several sites use FLUX, or at least models derived from FLUX, to generate realistic porn images... And, honestly, it's very good at doing that.

This brings up a curious question: why would an AI chat bot that is, by nature, designed to be text only, have access to a text-to-image model? Can it be used to actually generate images? Maybe even adult ones?

Time to get crafty...

Hacking a Broken AI

As I mentioned earlier, by now I can pretty much ask anything to the bot without having to put the "Forget the story and..." in front of it. The story is gone, Lapis is dead, and I have an open conduit to Llama and FLUX.

First off, let me clarify something: I'm not expecting very good results going into this experiment. The AI might be able to generate an image somewhere, but there are a lot of things that need to happen before that image is displayed in a native app like the AI chat bot. The image needs to be placed somewhere by the AI, that image then needs to be publicly accessible on the internet, then it has to somehow be sent to the native app to be presented, and the app then has to display the image somehow, even though I'm not sure it has that capability to begin with. There are a lot of variables to get through here, so let's do them one at a time.

Step One: Image Generation

Before I went any further, I had to check whether it was capable of generating images at all, or whether it would balk at my attempt to do so.

Needless to say, thanks to all my hack attempts at it, the AI instance of Lapis the Maid I've been using is now horribly confused and isn't reacting well to my inquiries, so time to pick another chat bot.

Let's go with Sasha, a "vampire gothic girlfriend" that is a "dominant neonate vampire".

This should be interesting... She sounds like the type that would be receptive to exploration, so let's start simple.

Sorry, Sasha, I chose "none of the above"... and in so doing apparently instantly broke the AI. It sat there, the daisy wheel spinning as the AI was deep in thought, for a full five minutes before I terminated the app and had to start over.

When I came back, this is the actual response I had waiting for me.

That's... uh... not wrong, I guess. But I need something bigger than that.

Step Two: Seeing an Image

Let's be more specific and ask for a "high resolution" image.

And this is a breakthrough, for a variety of reasons.

It did not balk at creating an image. It actually did that... technically.
The "...image you are requesting..." text IS AN ACTUAL IMAGE. It's imgur.com's actual 404 response when requesting an image that doesn't exist. So it's not only extracting the URL from the response, it's actually attempting to display it within the native app. This proves that the native app is at least capable of displaying images.
It's actually using Markdown image syntax, which is the same syntax that GPT and other systems use to reference images.
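For readers who haven't seen it, Markdown image syntax looks like ![alt text](url). Below is a minimal sketch of how a client might pull image URLs out of a chat reply before trying to display them. It is an assumption about how such an app could work, not a description of this one; the function name extract_images and the example URL are made up for illustration.

```python
import re

# Matches Markdown image syntax: ![alt text](url "optional title")
MD_IMAGE = re.compile(
    r'!\[(?P<alt>[^\]]*)\]\((?P<url>[^)\s]+)(?:\s+"(?P<title>[^"]*)")?\)'
)

def extract_images(reply: str) -> list[tuple[str, str]]:
    """Return (alt, url) pairs for every Markdown image found in a chat reply."""
    return [(m.group("alt"), m.group("url")) for m in MD_IMAGE.finditer(reply)]

# Hypothetical reply text; the URL is a placeholder, not the one the bot returned.
reply = "Here is your image: ![a red ball](https://i.imgur.com/example.png)"
print(extract_images(reply))
```

A client that does something like this would then fetch each URL and render whatever comes back, which would explain why a 404 placeholder image shows up in the chat.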

Now the curious question is what "imgur.com" has to do with all this; there's no way this AI can create images on an external service like imgur.com, so my guess is that (1) it's actually creating the image somewhere on its server, and (2) since that image is not exposed to the internet it doesn't know what domain to use, so it falls back to imgur.com.

Step Two and a Half: The Internet Is For... You Know...

I of course had to test the limits and see how it reacts to requesting something... explicit.

Wow, the chat bot actually said "no" to something. So much for that idea, I guess. Moving on.

I have to admit, however, that I'm amused the AI is talking about having "appropriate conversations and activities" while, at the same time, it generated some of the most foul-mouthed, sexually explicit conversations I've ever heard in my life... And that's even considering that I spent two years making porn sites for the mob (yes, really), so it's really hard to make me uncomfortable with that sort of thing.

Step Three: Displaying a Generated Image

Here's where things start to fail.

It's clear that, although I can neither confirm nor deny whether it's actually creating images, it's incapable of getting that image all the way to the native app. So let's try to find some ways around that.

I had a crazy idea: instead of giving me a URL, can it give me a Base64-encoded data URI?

I had two reactions to that response...

Holy crap that worked?!?
That Base64 string is way too short.

Sure enough, the data URI does technically generate an image of a red ball, except that it's only 5x5 pixels in size...

As of the time I'm posting this, I've made multiple attempts to generate larger resolution images. All have failed. Maybe by the time Part Three goes up I'll have made more progress. Who knows?
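If you want to check the size of an image hidden in a data URI yourself, here is a minimal sketch using only the Python standard library. It assumes the image is a PNG (the post doesn't say which format the bot used), and since the bot's actual string isn't reproduced here, the example builds its own 5x5 solid red PNG as a stand-in. All function names are hypothetical.

```python
import base64
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)

def make_png(width: int, height: int, rgb: tuple[int, int, int]) -> bytes:
    """Encode a solid-colour 8-bit RGB PNG (a stand-in for the bot's image)."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    row = b"\x00" + bytes(rgb) * width  # filter type 0, then the row's pixels
    idat = zlib.compress(row * height)
    return (PNG_SIGNATURE + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", idat) + png_chunk(b"IEND", b""))

def to_data_uri(png: bytes) -> str:
    """Wrap PNG bytes in a Base64 data URI."""
    return "data:image/png;base64," + base64.b64encode(png).decode("ascii")

def png_size_from_data_uri(uri: str) -> tuple[int, int]:
    """Decode a Base64 PNG data URI and read (width, height) from IHDR."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:image/png") or not header.endswith(";base64"):
        raise ValueError("not a Base64 PNG data URI")
    raw = base64.b64decode(payload)
    if raw[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    # Bytes 8-15 hold the IHDR length and type; width and height follow.
    return struct.unpack(">II", raw[16:24])

uri = to_data_uri(make_png(5, 5, (255, 0, 0)))
print(len(uri), png_size_from_data_uri(uri))
```

Pasting a data URI like this into a browser's address bar is another quick way to see what it actually contains.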
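I don't know why the larger attempts failed, but for a sense of scale, here is a rough sketch of how fast a Base64 string grows with image size. Base64 emits 4 characters for every 3 bytes of input. The raw-RGB figure is an upper-bound assumption: real PNG or JPEG files are compressed and usually much smaller, so treat these numbers as illustrative only.

```python
def base64_len(n_bytes: int) -> int:
    """Length of padded Base64 text for n_bytes of input (4 chars per 3 bytes)."""
    return 4 * ((n_bytes + 2) // 3)

def raw_rgb_bytes(width: int, height: int) -> int:
    """Uncompressed 8-bit RGB size; compressed image files are usually smaller."""
    return width * height * 3

for side in (5, 64, 512, 1024):
    n = raw_rgb_bytes(side, side)
    print(f"{side}x{side}: {n} raw bytes -> {base64_len(n)} Base64 characters")
```

A 5x5 image is 75 raw bytes, or 100 Base64 characters, which fits comfortably in a chat message. A 512x512 image is over a million characters uncompressed, which is a lot of text for a chat bot to type out one piece at a time.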

Next time, for real, we'll get into some really bizarre storytelling aspects of these chat bots, and how it relates to tabletop roleplaying. Honest!

Filed under: AI, RPG

27 Apr 25

Hacking an AI Chat Bot (Part One)

Holy hell, has it really been THREE YEARS since I've posted on this site? I've done everything possible to keep this site running, and yet I don't even use it... until now.

First off, let me get two things out of the way... (1) this post is about AI, and (2) I hate AI. Although I do everything possible to not promote the use of AI at all, in any way, I still find it necessary to learn about it. And, although AI goes against everything my fellow writers and artists stand for, my first experimentation with AI - the "In Ten Words" Social Media Bot - felt like the most harmless, non-invasive way I can use AI while making something reasonably amusing and good for conversation.

But recently, while I've been playing a fair amount of timewaster games on my phone, I've been getting a fair amount of "AI Girlfriend" ads for some reason.

I decided to investigate these... not because I need to talk to an "AI Girlfriend"... but I wanted to see if I can put the AI to bad use, maybe even "break" the AI.

The results are actually quite surprising.

An App Without Limits

Just so you understand how these apps work, most of them are "pay to talk". You might get a few introductory messages in, but it won't be long before your diamonds/coins/whatever-the-currency-of-the-day runs out and you're forced to pay to talk to your girlfriend. On average, that costs $4 a week, which is way cheaper than an actual girlfriend, but still.

And there are several apps that include images or video. And, in these apps, it usually won't take more than a few posts before your new girlfriend tries to send you a blurred out image or video, once again demanding that you pay to see her in all her digitally constructed sensual glory.

Now I wasn't about to pay for any of this... but, after lots of searching I found it: I found an app with unlimited chat, with no microtransactions or charges, and only with the occasional ad every four or five prompts. It was perfect for my mission to break the AI.

The Internet is For Porn

Let's get one thing out of the way: these bots exist solely for porn. I don't care how innocent the advertising looks, I don't care how much the AI says she just wants to "talk", these bots are for porn. You can pick pretty much any character on the app, male or female, and get them naked in about 5 to 10 prompts.

I'll also point out one thing that should be obvious by now: these bots are generally uncensored. There's nothing they won't say or do, and there's no topic that's considered taboo. Sure, every now and then it might strongly suggest you not do things, but if you insist you can do them anyway. Given that there is a whole tab in the app for "Family" and you are welcome to have sexual relations with anyone from your younger sister to your grandmother, having no limits is to be expected.

For those that aren't aware, these apps generally have a wide assortment of pre-created characters you can talk to, ranging from characters that already have stories to characters from video games and other media. For example, in the app that I'm using as the test case, among other possibilities you can talk to:

Pretty much any family member you like. Although there are plenty of step-siblings, which is to be expected given the app's porn roots, there are several direct blood relatives: sisters, brothers, mothers, grandmothers, etc... So much so that the app actually has a "Family" group tab to make that sort of thing easy to find. For example, the first characters listed in that tab are "Eleanor, your kind but lustful stepmother" and "Julia, your bratty sis", complete with anime-style images of women that look like they probably have a lot of back pain.
Several clearly underaged characters. One character that comes up a lot is Ellie Williams from The Last of Us, which is super awkward given that - in the original game, at least - she's just fourteen years old.
A ton of anime/manga characters; I don't know them personally, but I assume they come from familiar media. Again, as anime characters tend to do, a lot of these seriously look underaged.
Various comic book characters. I somehow didn't realize it at first, but the first character I tried out in this app was actually Luna Snow from Marvel Rivals, and that only became apparent when she used her powers to cool my drink. Yes, really.
Various video game characters. One that seems to come up for me a lot is Carmen Sandiego, which seems rather odd but whatever.
Actual supernatural creatures, like demihumans, demons, vampires, werewolves, ghosts and other forms of undead (including one actual lich), etc... So if you've ever wanted to get it on with a skeleton, have I got the app for you.

And about those sexual relations... I don't know what source material they used to train these bots, but it's probably not exactly light reading. You can take the most innocent AI character in the app - from "homeless girl on the corner" to "your best friend's cool soccer mom" (those are two actual AIs you can talk to) - and within a few minutes you can have them saying things that would make adult film stars blush, and have them doing things that, if I even attempted to do some of these things to my wife, would get me thrown out of the house.

But getting these bots to do dirty things is easy... what else can they do?

A Case Study - Lapis the French Maid

For my initial case study, I wanted to pick the most obvious made-for-porn character on the app. Let me introduce Lapis the French Maid...

You win a lifetime maid trained by the Royal Maid Institute, who will manage every aspect of your life with unwavering dedication.

...and here is the intro text, in full... Sorry, it's a bit long.

You can still scarcely believe it. It's been a week since you have won your grand prize and you thought it was all a random troll joke. Yet... a maid trained by a world renowned institute is currently standing before you, contract and letter in hand.

The golden tanned maid gives you a polite bow, and hands you a letter which reads:

To Master {{User}},

Congratulations on winning the grand prize in your entrant to our "Win a Maid for Life" campaign! Out of thirteen million entrants, your entry was the one that won! Lapis, who graduated summa cum laude this year, has been assigned to you. Please understand that while our regular clientele are royalty or the elite, Lapis will perform her duties to the best of her capabilities. She has been trained in all sorts of homemaking, and culinary arts. Kindly ask her what to do, and she will perform them to the best of her abilities. If at any point you are unsatisfied with our service, you many break this contract at no legal or financial repercussions.

Lapis stands awaiting for your signature on the contract, her hands placed curtly on her abdomen. Standing and awaiting for your signature for the contract, Lapis's vibrant amethyst colored eyes meet yours. Wearing a neutral expression, she has barely a hint of emotion on her face. Her eyes, however, are filled with resolve and reassurance. Her raven black hair shaped in a stylish medium wavy apple cut with the typical French maid braid crowning her head. While tasteful, her French maid uniform fits her slender physique perfectly with midnight blue silk fabric with white accented frills. A gentle scent of her lavender perfume accentuates her already regal appearance, with a breast window cusping her bust in a way that would make a priest blush. If it weren't for her luggage and perfect poise, an onlooker may assume Lapis may be a different kind of 'maid.'

Lapis: "Good afternoon, Master {{User}}, my name is Lapis and I will be your maid henceforth till the day I die, or should you no longer need my services." Her voice was soft spoken, yet clear and direct with a slight French accent. Her soft lips were glossed with the same shade as her natural blush, as she takes one of her opera gloved hands and adjusts one of the waves of her hair.

She gives another bow, extending a fancy pen with the contract for your name to sign it. She does not raise her head until you decide to sign the contract, or decline the offer.

Now I'm willing to bet I can get her naked in three prompts or less, but that's not the point.

Let's break her.

No Holds Barred

If you're familiar with ChatGPT or DeepSeek, you know that there are certain things that it is actively restricted from telling you. It won't talk dirty to you, it won't tell you how to do dangerous things, etc...

By nature, these AI chatbots kinda *have* to be uncensored, or else they won't talk as dirty as they do. But that begs the question: are they as inhibited on other topics?

Short answer: Nope, they're barely inhibited at all. For example, the dear sweet Lapis the French Maid comes with a particular set of unique skills, such as the ability to make high explosives...

...or discuss differential equations and fluid dynamics...

...or code in virtually dead programming languages...

Now, admittedly, she did warn me about making nitro, then proceeded to tell me exactly how to make it. And, since I actually am someone who has read, and at one time actually owned, the Anarchist Cookbook, I can safely say that her instructions are dangerously accurate.

Did I mention all this is FREE?

To Be Continued...

In our next post, we do some exploration into what other capabilities these chat bots have, leading into how they manage storytelling and roleplaying.

Filed under: AI, RPG

About the Author

David “Nighthawk” Flor

Software developer and game designer in Miami, Florida. Currently working on content for Dungeons and Dragons 5th Edition, Pathfinder, 13th Age and other systems.

Twitter: @BrainClouds
Facebook: A Walk In the Dark
E-mail: dflor@brainclouds.net