Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT

lemmyreader@lemmy.ml · 2 years ago

stembolts@programming.dev · edit-2 2 years ago

This is similar to what I did when I heard reddit was doing the API lockdown: I wrote an automation bot over the weekend that self-destructed my subreddit and my entire post history. The bot also automatically downloaded and archived all of the content to my local machine.

It was annoying because at first I couldn’t get access to older posts, since at the time reddit had changed their API to only show the first X posts (100 or 1,000 or whatever). So I told my bot to delete each post as soon as it was archived. As the newer content disappeared, reddit had no choice but to populate the page with the older posts.

And that’s how I archived my subreddit. Reddit banned me two days later for automation, lol. I did not break any of the reddit or reddit api ToS during this process but I guess I upset someone.
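For anyone curious what that archive-then-delete loop might look like, here is a minimal sketch. It does not use the real reddit API: `FakeSite`, `list_visible`, `delete`, and `archive_then_delete` are all hypothetical names, and the "site" is just an in-memory list that only exposes its newest `window` posts, which is the assumption that makes the trick work.

```python
import json


class FakeSite:
    """Hypothetical stand-in for a site whose listing only exposes
    the newest `window` posts (not the real reddit API)."""

    def __init__(self, posts, window):
        self.posts = list(posts)  # newest first
        self.window = window

    def list_visible(self):
        # Only the first `window` posts are ever returned.
        return self.posts[: self.window]

    def delete(self, post_id):
        self.posts = [p for p in self.posts if p["id"] != post_id]


def archive_then_delete(site, archive_file):
    """Archive every visible post to a JSON Lines file, then delete it,
    so older posts move into the visible window on the next pass."""
    count = 0
    with open(archive_file, "a", encoding="utf-8") as out:
        while True:
            visible = site.list_visible()
            if not visible:
                break  # nothing left to archive
            for post in visible:
                out.write(json.dumps(post) + "\n")  # save first...
                site.delete(post["id"])  # ...then delete
                count += 1
    return count
```

Writing the post to disk before deleting it is the important ordering: if the loop crashes midway, nothing that was deleted is missing from the archive.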

verassol@lemmy.ml · 2 years ago

StackOverflow: *makes money by monetizing massive amounts of user-contributed content without consulting or compensating the users in any way*

Users: *try to delete it all to prevent it*

StackOverflow: *your contributions belong to the community, you can’t do that*

Pretty fucked-up laws. A lot of lawsuits going on right now against AI companies for similar issues. In this case, StackOverflow is entitled to be compensated for its partnership, and because the answers are all CC BY-SA 3.0, no one can complain. Now, that SA? Whatever.

9point6@lemmy.world · 2 years ago

That SA part needs to be tested in court against the AI models themselves

A lot of this shittiness would probably go away if there was a risk that ingesting certain content would mean you need to release the actual model to the public.

verassol@lemmy.ml · edit-2 2 years ago

Yeah, their assumption though is you don’t? Neither attribution nor sharealike, not even full-on all-rights-reserved copyright is being respected. Anything public goes and if questions are asked it’s “fair use”. If the user retains CC BY-SA over their content, why is giving a bunch of money to StackOverflow entitling OpenAI to use it all under whatever terms they settled on? Boggles me.

Now, say, Reddit’s Terms of Service state clearly that by submitting content you are granting them “a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness (…) in all media formats and channels now known or later developed anywhere in the world.” Speaks volumes on why alternatives (like Lemmy) to these platforms matter.

delirious_owl@discuss.online · 2 years ago

Like AI doesn’t know how to use the Wayback Machine?

davel@lemmy.ml · 2 years ago

Good luck with the deleting. It often just means UPDATE comments SET is_deleted = 1 WHERE ID = 666;.
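To make the soft-delete point concrete, here is a small sketch using Python's built-in `sqlite3`. The `comments` table layout and the helper names (`make_db`, `soft_delete`, `visible_comments`, `stored_body`) are assumptions for illustration only, not any real platform's schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one comment (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        "id INTEGER PRIMARY KEY, "
        "body TEXT NOT NULL, "
        "is_deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO comments (id, body) VALUES (?, ?)", (666, "my answer"))
    return conn


def soft_delete(conn, comment_id):
    # "Deleting" only flips a flag; the row and its text stay in the table.
    conn.execute("UPDATE comments SET is_deleted = 1 WHERE id = ?", (comment_id,))


def visible_comments(conn):
    # What the public site would show.
    rows = conn.execute("SELECT body FROM comments WHERE is_deleted = 0")
    return [row[0] for row in rows]


def stored_body(conn, comment_id):
    # What is still stored, regardless of the flag.
    row = conn.execute(
        "SELECT body FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0] if row else None
```

After `soft_delete(conn, 666)`, `visible_comments(conn)` is empty, but `stored_body(conn, 666)` still returns the original text.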

plz1@lemmy.world · 2 years ago

They are not deleting, they are editing. So the platform would have to undo those edits rather than just flipping the visibility flag.
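A rough sketch of the difference, assuming the platform keeps a revision history for each post. The `Post` class and its methods are hypothetical, not any site's actual data model.

```python
class Post:
    """Hypothetical post that stores every revision of its body."""

    def __init__(self, body):
        self.revisions = [body]  # oldest first
        self.is_deleted = False  # visibility flag, untouched by edits

    @property
    def body(self):
        # The current body is always the latest revision.
        return self.revisions[-1]

    def edit(self, new_body):
        # An edit adds a revision; earlier text is not overwritten.
        self.revisions.append(new_body)

    def rollback(self, index):
        # Undoing an edit means re-publishing an earlier revision.
        self.revisions.append(self.revisions[index])
```

Usage: after `p = Post("original answer")` and `p.edit("[removed in protest]")`, the visibility flag is still `False`, so restoring the content takes an explicit `p.rollback(0)` rather than an undelete.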

paraphrand@lemmy.world · 2 years ago

And they are. 😞

delirious_owl@discuss.online · 2 years ago

This isn’t really comparable to reddit, since users can just send a request to SO for all the content. Reddit locking down the API meant we lost access to our content.

FenrirIII@lemmy.world · 2 years ago

If you get something for free, you are the product

helenslunch@feddit.nl · edit-2 1 year ago

deleted by creator

zovits@lemmy.world · 2 years ago

SO has mechanisms in place to filter out AI-generated content.

helenslunch@feddit.nl · edit-2 1 year ago

deleted by creator

zovits@lemmy.world · 2 years ago

https://meta.stackoverflow.com/questions/421831/policy-generative-ai-e-g-chatgpt-is-banned

helenslunch@feddit.nl · edit-2 1 year ago

deleted by creator

zovits@lemmy.world · 2 years ago

Ah, I think I got the source of misunderstanding: these mechanisms are not automated, but implemented as moderation guidelines and rules.

fluxc0@lemmy.world · 2 years ago

This feels a little iffy to me. It smacks of what happened with reddit.

baseless_discourse@mander.xyz · edit-2 2 years ago

This is a violation of GDPR, no?

EDIT: user-created content is not directly protected under GDPR; only personally identifiable data is protected under GDPR.

lemmyreader@lemmy.ml · 2 years ago

Dunno. GDPR is a Europe-only thing, and isn’t it only about how your private data (like name, IP address, phone number) is handled?

refalo@programming.dev · 2 years ago

How does GDPR get away with not defining what a website is when referring to them directly in the law? Like what counts, only html? http? ftp? gopher?

Modern_medicine_isnt@lemmy.world · 2 years ago

So what is the stack overflow replacement?

Weslee@lemmy.world · 2 years ago

Maybe https://www.codidact.com/

HexesofVexes@lemmy.world · 2 years ago

I mean, here is a thought: if an AI tool uses Creative Commons data, then its derivatives fall under Creative Commons. I.e. stop charging for AI tools and people will stop complaining.

drunkpostdisaster@lemmy.world · 2 years ago

This shit scares me. It will become so easy to rewrite history from here. Just delete anything you don’t like and have an AI rewrite it into whatever you want. Entire threads rewritten; a company can go back and have your entire post history changed in ways that might be legally compromising.