Unfortunately the safety filters have enough false positives (basically any imag...

netruk44 · on Nov 18, 2022

That'll only work for a little while longer (for future named big-public-release models, obviously the cat's out of the bag for the current version of stable diffusion), right up until the point where they incorporate the filter into the training process.

At which point, the end model users get to download will be incapable of producing anything that comes close to triggering the filter, and there will be no way to work around it short of training/fine-tuning your own model, which is prohibitively expensive for 'normal' people, even people with top-of-the-line graphics cards like a 4090.

Animats · on Nov 18, 2022

That problem is being solved. Pornhub now has an AI R&D unit.[1] Their current project is to upscale and colorize out of copyright vintage porn. As a training set, they use modern porn. They point out that they have access to a big training set.

Next step, porn generation.

[1] https://www.pornhub.com/art/remastured

sillysaurusx · on Nov 18, 2022

For a glimpse at what’s possible:

https://www.reddit.com/r/unstablediffusion

https://www.reddit.com/r/aiwaifu

I’ve been trying to generate tentacle porn since 2019 or so. It’s the whole reason I got into AI. We’re finally there, and it only took three years.

Can’t wait to see what 2026 brings. http://n.actionsack.com/pic/media%2FFh08F_hXkAAhalt.jpg

GuB-42 · on Nov 18, 2022

> https://www.reddit.com/r/unstablediffusion

This subreddit was banned due to a violation of Reddit's rules against non-consensual intimate media.

Interesting. Why "non-consensual"? Does it mean Stable Diffusion generated porn of people who actually exist?

sillysaurusx · on Nov 18, 2022

Sorry all, I was typing it on my phone and missed an underscore. Here’s the proper link:

https://www.reddit.com/r/unstable_diffusion

sbierwagen · on Nov 18, 2022

Yes, reddit routinely bans deepfake subreddits. In practice, this means any net that can produce output that looks like any living person is banned.

emmelaich · on Nov 18, 2022

unstable_diffusion is still around. Note the underscore.

pessimizer · on Nov 18, 2022

> Their current project is to upscale and colorize out of copyright vintage porn.

But not very well. I collect this stuff and I have my own copies, so I can tell you that this doesn't look better than the b/w originals in quality/detail, and it's easy to see that the color is not great, especially if there are lots of hard lights and shadows dancing around.

That being said, I don't know why it's not working. Seems like it should work. I'd expect it to at least be clean of scratches and stabilized. Any relevant papers I should read about AI restoration of old film?

SequoiaHope · on Nov 18, 2022

This prediction doesn’t track with what is already happening. Dreambooth is allowing all kinds of people to fine-tune their own models at home with Nvidia graphics cards, and people are sharing all kinds of updated models that do really well at specific art styles or with NSFW subjects. Go check the NSFW subreddit unstable_diffusion for examples. It seems lots of people are training NSFW models with their own preferred data sets, and last I saw someone merged all those checkpoints together into one model.

So if I made a prediction it would be that the training sets for open models from big companies will get scrubbed of nsfw content and then nerds on Reddit will just release their own versions with it added in, and the big companies will make sure everyone knows they didn’t add that stuff and that’s where it will stand.

netruk44 · on Nov 18, 2022

I agree with your prediction. Sorry, I was unclear in my post, and left that part unsaid. I agree that it will likely just be the big newly released 'base' models that will be scrubbed of NSFW images, but there's really no way to prevent these models from making those kinds of images at all.

It will only take some dedicated individuals, which I know there is no shortage of.

langitbiru · on Nov 18, 2022

The AI-generated art with Dreambooth works only for avatar type pics. It cannot create fancy gestures (doing a complicated movement with hands, like patting a cat). For now.

pifm_guy · on Nov 18, 2022

Fine-tuning is pretty cheap compared to the original training run - perhaps just 1% of the cost.

Totally within reach of a consortium of.... "entertainment specialists".

netruk44 · on Nov 18, 2022

I know a person who fine-tuned stable diffusion, and he said it took 2 weeks of 8xA100 80 GB training time, costing him somewhere between $500 and $700. He got a pretty big discount, too; at today's prices for peer GPU rental it would be over $1,000.

Sure, it's peanuts compared to what it must have cost to train stable diffusion from scratch. However, I think most normal people would not consider spending $500 to fine-tune one of these.

Edit: Though I do agree that once this kind of filtering is in place during training, NSFW models will begin to pop up all over the place.
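
For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch. It assumes "2 weeks of 8xA100" means 14 days of wall-clock time with all 8 GPUs rented throughout, which may not be what actually happened; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: 14 days of wall-clock time with all 8 GPUs rented throughout.
DAYS = 14
GPUS = 8
gpu_hours = DAYS * 24 * GPUS  # total GPU-hours under that assumption


def dollars_per_gpu_hour(total_cost: float) -> float:
    """Implied rental rate for a given total bill (hypothetical helper)."""
    return total_cost / gpu_hours


# $500 and $700 are the quoted discounted range; $1,000 is the quoted
# lower bound at undiscounted peer-rental prices.
for total in (500, 700, 1000):
    rate = dollars_per_gpu_hour(total)
    print(f"${total} total: ${rate:.2f} per GPU-hour over {gpu_hours} GPU-hours")
```

If the run did not keep all 8 GPUs busy for the full two weeks, the implied hourly rate would be higher.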

cookingrobot · on Nov 18, 2022

You can fine tune stable diffusion for $10 using this service: https://www.strmr.com/

It works super well for putting yourself in the images, the likeness is fantastic.

It’s obviously a small training process, they only take 20 images, but it works.

minimaxir · on Nov 18, 2022

For spot-finetuning with Dreambooth (not as good as full-finetuning but can get a specific subject/style much faster), it can be done with about $0.08 of GPU compute, although optimizing it is harder.

https://huggingface.co/docs/diffusers/training/dreambooth

netruk44 · on Nov 18, 2022

Are these services using textual-inversion? If so, I have to wonder how well they would work on a stable diffusion model that was trained with the filter in place from the start, so that it couldn't generate anything close to the filter.

As it is right now, stable diffusion can generate adult imagery by itself, however it seems like it's been fine-tuned after the fact to try to 'cover up' that fact as much as they could before releasing the model publicly.

seaal · on Nov 18, 2022

I believe the safety filter is trivial to disable, since it was added in one of the last commits prior to Stable Diffusion’s public release and is not baked into the model; most forks therefore just remove the safety checker code [1].

As far as textual inversion, JoePenna’s Dreambooth [2] implementation uses Textual Inversion.

[1] https://github.com/CompVis/stable-diffusion/commit/a6e2f3b12...

[2] https://github.com/JoePenna/Dreambooth-Stable-Diffusion

gpderetta · on Nov 18, 2022

As far as I understand textual inversion != Dreambooth != Actual fine-tuning

xtagon · on Nov 19, 2022

This (training a model with no NSFW content) would be preferable to me. No false positives to worry about. People who do want to generate NSFW stuff can fine-tune or train their own model, nobody owes that functionality to them in freely available ones.

ben_w · on Nov 18, 2022

Training's only prohibitively expensive for normal people today, and the dollar cost per compute operation is still decreasing fairly rapidly.

langitbiru · on Nov 18, 2022

"Gal Gadot wearing green suit" triggered it while "Tom Cruise wearing green suit" didn't.

Wistar · on Nov 18, 2022

As did "young children watching sunset," but not "young boy and girl watching sunset."

naet · on Nov 18, 2022

Might be the word "gal" (which can mean girl or young woman).

langitbiru · on Nov 19, 2022

Does the picture or the prompt trigger the filter?

nonethewiser · on Nov 19, 2022

Adult porn is filtered too.

criddell · on Nov 18, 2022

And then the CSAM filter on your device reports you to some authority.

iceburgcrm · on Nov 18, 2022

Only available on the latest iphone

Gigachad · on Nov 18, 2022

Apple ended up not implementing that iirc. While Google Photos has had it the whole time.

Google's is actually worse. Apple was only going to match against known CSAM images, while Google uses ML to identify new images, which resulted in one parent being arrested for a medical image of their own child.

dvngnt_ · on Nov 19, 2022

He was never arrested. He was cleared, but his account was nuked.

knaik94 · on Nov 19, 2022

I am fine with photos that are uploaded to the cloud being scanned. I do not want my own device spending energy and scanning images even before they completely leave my network or device. Google scanning my Google drive files is fine with me. Apple is much worse.

Gigachad · on Nov 19, 2022

The Apple one was only going to scan photos stored on iCloud. It scans them on your phone, but if you don’t use iCloud it doesn’t scan anything. It’s a neat trick that means the Apple servers can know they aren’t storing illegal content without ever having to look at it.

If it’s anything like the regular scanning iPhones do, it’s done overnight while plugged in.

knaik94 · on Nov 20, 2022

That makes the phone the scanner, and by definition it's before iCloud. Apple makes it very hard to use an iPhone/iPad without keeping iCloud enabled. I don't want my phone spying on me; that sets a dangerous precedent. I don't care how many times it is scanned by Apple after I store it unencrypted on iCloud servers, but scanning without my permission on my device before it leaves my device is a violation of my privacy. Apple will look at it if your phone detects something, and it will inform authorities without letting you know. With other companies, once it is detected on their server, they forward your information to government authorities. Apple instead wanted to have an in-house team to filter false positives before notifying authorities. Apple is worse in every single way.

BoorishBears · on Nov 18, 2022

If you have things on your device that match entries in the CSAM database, yes there's a chance you're a victim of a targeted attack taking advantage of highly experimental collisions... but the odds you "accidentally generated" that content are not realistic.

yeet_yeet_yeet · on Nov 18, 2022

>not realistic

The odds are zero.

1/2^256 ≈ 0.

In cryptography these odds are treated as zero until you generate close to 2^128 images.

Unfortunately there's no word in natural English to describe how unlikely. The most precise is "zero".
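
A quick sketch of the arithmetic behind that claim, using the standard birthday-bound approximation. It assumes an ideal hash whose 256-bit outputs are uniformly distributed, which is the cryptographic setting; the function name is made up for illustration.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that at least two of n uniformly random
    `bits`-bit hashes collide (birthday approximation):
    p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    x = n * (n - 1) / 2 ** (bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-x)


# Chance that one random image matches one specific 256-bit hash.
print("single match:", 1 / 2**256)

# A trillion images: still indistinguishable from zero in practice.
print("10^12 images:", collision_probability(10**12, 256))

# Around 2^128 images the collision chance stops being negligible.
print("2^128 images:", collision_probability(2**128, 256))
```

None of this says anything about hashes that are not uniformly distributed over their output space.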

ben_w · on Nov 18, 2022

Are you assuming that digital images are evenly distributed over the set of all possible 256 bit vectors?

Because I don't think that's a reasonable assumption.

Even if image recognition was perfectly solved with no known edge cases (ha!), when an entire topic is a semantic stop-sign for most people, you can't expect the mysterious opaque box that is a guilty-enough-to-investigate detection mechanism to be something that gets rapid updates and corrections when new failure modes are discovered.

jerf · on Nov 18, 2022

You should spend some time with an internet search engine and the term "perceptual hashing". What you're talking about is another type of hashing, which can be useful for classifying image files, but not images. The former has a very concrete definition that is specified down to the bit; the latter is a fuzzy space because it's trying to yield similar (not necessarily identical) hashes for images that humans consider similar. Much different space, much different problem, much different collision situation. Cryptographic hashing is not the only kind of hashing.
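
To make the difference concrete, here is a toy sketch comparing SHA-256 with "average hash", one of the simplest perceptual hashes. This is only an illustration: the 8x8 grayscale image is synthetic, the helper names are made up, and average hash is not the scheme any particular vendor uses.

```python
import hashlib


def average_hash(pixels):
    """64-bit average hash of an 8x8 grayscale image given as 64 ints (0-255).
    Each bit records whether a pixel is brighter than the image mean."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


# Synthetic image: left half dark, right half bright.
original = [20 if (i % 8) < 4 else 220 for i in range(64)]

# Same image with one pixel brightened by a single level.
tweaked = list(original)
tweaked[0] += 1

sha_a = hashlib.sha256(bytes(original)).hexdigest()
sha_b = hashlib.sha256(bytes(tweaked)).hexdigest()

print("SHA-256 identical:", sha_a == sha_b)
print("average-hash bit difference:",
      hamming(average_hash(original), average_hash(tweaked)))
```

The cryptographic hash changes completely for a one-level brightness change, while the perceptual hash is unchanged. That tolerance is the point of perceptual hashing, and it is also why its collision behavior can't be reasoned about with 1/2^256-style arguments.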

yeet_yeet_yeet · on Nov 18, 2022

Oh wow https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni... so they essentially just use CNN output to automatically determine whether to report people to the authorities? For some reason I assumed they were just comparing the files they knew to be CSAM.

Yeah that's bad. What about deepdream/CNN reversing? Couldn't a rogue Apple engineer just create an innocuous-looking false positive, say a cat picture, share it on Reddit, and everybody who downloads it is flagged to police for CSAM?

astrange · on Nov 19, 2022

No, there are two hashes used in the Apple system, one public and neural and one hidden, the intent of both is to match specific known images and not unknown new ones, and the result of passing both hashes is a manual review and not automatic reporting. I've never seen a published attack that would actually be a problem; they all misread how the system worked.

(Also, it's not reported to the police but to NCMEC, which is not a government agency. This is for 4th amendment privacy reasons.)

criddell · on Nov 18, 2022

The CSAM flagging generally isn’t reported to police to prevent the situation you describe. Google would get the report and once some threshold is reached, a person reviews the report(s) and decides if the police are notified.

criddell · on Nov 18, 2022

How can you be so sure? As I understand it, the hash is of features in the image and not the image itself. Are the CSAM feature detection heuristics public?