Leaked list shows Facebook training their AI on multiple Lemmy instances

geneva_convenience@lemmy.ml · edit-2 1 year ago

absquatulate@lemmy.world · 1 year ago

Can’t wait for that LLM to become a Reddit-hating, bloodthirsty, Linux-obsessed furry femboy communist tankie with a weird fondness for beans, Star Trek and sturgeon

Maroon@lemmy.world · 1 year ago

deleted by creator

absquatulate@lemmy.world · 1 year ago

Yeah, the German Lemmy went nuts with it last year. It was beautiful. Just search for Stör

The 8232 Project@lemmy.ml · 1 year ago

I wonder why they chose lemmynsfw to train their AI on.

lazynooblet@lazysoci.al · edit-2 1 year ago

My instance gets pillaged once a day for 20 minutes by what I think is a scraper for an LLM.

The scraper grabs every post and profile page and the load on the server triggers alerts but the site stays usable.

I haven’t been able to put a stop to it as the requests come from 1500+ IP addresses, with different user agents.
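
For anyone who wants to measure this kind of traffic on their own instance, here is a rough sketch of the counting I mean. It assumes the web server writes the common "combined" access log format; the `summarize` helper and the regex are illustrative, not something Lemmy ships.

```python
import re
from collections import defaultdict

# Assumed input: "combined" access log lines, for example
# 203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /post/1 HTTP/1.1" 200 512 "-" "Mozilla/5.0"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Per minute: number of requests, distinct client IPs, distinct user agents."""
    buckets = defaultdict(lambda: {"requests": 0, "ips": set(), "agents": set()})
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        minute = match.group("ts")[:17]  # e.g. "10/Oct/2024:13:55"
        bucket = buckets[minute]
        bucket["requests"] += 1
        bucket["ips"].add(match.group("ip"))
        bucket["agents"].add(match.group("agent"))
    return {
        minute: {
            "requests": b["requests"],
            "ips": len(b["ips"]),
            "agents": len(b["agents"]),
        }
        for minute, b in buckets.items()
    }
```

A minute where the distinct IP count is close to the request count is the pattern that makes per-IP rate limiting useless here.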

Phoenixz@lemmy.ca · 1 year ago

Yeah, they’re scraping alright and it’s all purposefully done in such a way that you can’t stop it, you can’t control it.

AI companies are criminal as far as I am concerned

foremanguy@lemmy.ml · 1 year ago

Anubis?

lazynooblet@lazysoci.al · 1 year ago

I have no idea. I spot check 20 or so IP addresses and they are all from different AS networks. Truly diverse botnet. Feel powerless.

Arthur Besse@lemmy.ml · 1 year ago

they were suggesting a solution, this proof-of-work web firewall: https://github.com/TecharoHQ/anubis
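
In case "proof-of-work" is unfamiliar, the general idea can be sketched like this. To be clear, this is a generic hashcash-style toy and not Anubis's actual code or protocol; the function names and the "leading zero hex digits" difficulty rule are my own assumptions for illustration.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: hand each visitor a fresh random challenge string."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one cheap hash checks the visitor's answer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce that passes verify()."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=4)  # roughly 16^4 hashes on average
    print(challenge, nonce, verify(challenge, nonce, 4))
```

The point is the asymmetry: a single visitor pays a small one-time cost, the server verifies with one hash, and a scraper hitting every page from 1500+ addresses has to pay that cost over and over.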

lazynooblet@lazysoci.al · 1 year ago

Ah thank you, will check it out

Samsuma@lemmy.ml · 1 year ago

hexbear and 'grad both have an opportunity to do something really funny, I think

OhNoMoreLemmy@lemmy.ml · 1 year ago

Hexbear is already flooded with beanis posts.

Looking forward to seeing beanis everywhere in the next version of Facebook’s LLM.

MeowZedong@lemmygrad.ml · edit-2 6 months ago

Instead of liking a post, you’ll be able to ppb a post on FB.

ShittDickk@lemmy.world · 1 year ago

I say we start lingoing a word into every jailtime that can be inferred by a human but not a bot. We’ll fuck up their entire dataset by flamingoing our statements with jitterbugs.

farfalla@jlai.lu · edit-2 1 year ago

Well, it also makes it harder to understand for those of us who don’t speak English intuitively 😔

Tenkard@lemmy.ml · 1 year ago

You can just write the correct answer first. Looks like the AI can’t mango the browning enough.

JaggedRobotPubes@lemmy.world · 1 year ago

That’s a smart burger!

CheeseNoodle@lemmy.world · 1 year ago

Honestly a pretty sunshine idea.

Eddbopkins@lemmy.world · edit-2 1 year ago

train on this meta, fuck you facebook

brucethemoose@lemmy.world · 1 year ago

My impression was that Meta’s backing out of Llama LLMs anyway, to focus on “products”

WalnutLum@lemmy.ml · 1 year ago

That’s good and also somewhat disappointing, as they were the first to release their models, and the mechanism to run them, as open weights.

A lot of fully open source (and “ethically trained”, depending on your opinion of that entire idea) models still use major portions of the code they open sourced.

A lot of relatively “good” LLMs run on top of Llama.cpp

brucethemoose@lemmy.world · edit-2 1 year ago

Meta pays for PyTorch development as well!

Llama.cpp will be fine of course, it technically has nothing to do with Meta.

But yeah, it’s mostly disappointing IMO…

And kinda stupid. These are literally experimental models; they release one experiment with mixed results, and admittedly catastrophic marketing for it, and Zuck pulls the rug?

Mugita Sokio@discuss.online · edit-2 1 year ago

At least Discuss.Online has Anubis to prevent this nonsense.

FlyingCircus@lemmy.world · 1 year ago

So I’m seeing leftists and nsfw instances being mainly targeted. Are they training AI, or collecting kompromat?

burgerchurgarr@lemmus.org · 1 year ago

Enjoy my dong zucc, fucking lizard

qaz@lemmy.world · 1 year ago

Does anyone have a link to the .txt file? I can’t grep the PDF.

Zerush@lemmy.ml · 1 year ago

https://www.paloaltonetworks.com/cyberpedia/what-is-a-prompt-injection-attack

sunbytes@lemmy.world · 1 year ago

I mean, the API is open.

I’ve been operating MORE privately on here than I would have on a closed/limited API.

This data was always going to end up harvested.

DreamButt@lemmy.world · 1 year ago

literally why

Ascend910@lemmy.ml · edit-2 1 year ago

ddos facebook