This is similar to what I did when I heard reddit was doing the API lockdown: I wrote an automation bot over the weekend that self-destructed my subreddit and its entire post history. The bot also automatically downloaded and archived all of the content on my local machine.
It was annoying because at first I couldn’t get access to older posts, since at the time reddit had changed their API to only show the first X posts (100 or 1,000 or whatever). So I told my bot to delete each post as it archived it, so that as content disappeared, reddit had no choice but to populate the page with older posts.
And that’s how I archived my subreddit. Reddit banned me two days later for automation, lol. I did not break any of the reddit or reddit api ToS during this process but I guess I upset someone.
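The archive-then-delete trick described above can be sketched roughly like this. It is a self-contained simulation, not the actual bot: `CappedListing`, `PAGE_CAP`, and `archive_and_delete` are hypothetical names, and no real Reddit API calls are made. The listing only ever exposes the newest `PAGE_CAP` posts, so deleting each post after archiving it is what lets the older ones surface.

```python
import json
import tempfile
from pathlib import Path

PAGE_CAP = 3  # hypothetical cap; the real one was "100 or 1,000 or whatever"


class CappedListing:
    """Stand-in for a listing that only exposes the newest PAGE_CAP posts."""

    def __init__(self, posts):
        # Posts are kept newest-first.
        self._posts = list(posts)

    def visible(self):
        return self._posts[:PAGE_CAP]

    def delete(self, post_id):
        self._posts = [p for p in self._posts if p["id"] != post_id]


def archive_and_delete(listing, archive_dir):
    """Save each visible post to disk, then delete it so older posts surface."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    archived = []
    while True:
        page = listing.visible()
        if not page:
            break
        for post in page:
            # Write first, delete second: never delete what is not yet saved.
            (archive_dir / f"{post['id']}.json").write_text(json.dumps(post))
            listing.delete(post["id"])
            archived.append(post["id"])
    return archived


# Ten fake posts, newest (id 10) first.
posts = [{"id": i, "title": f"post {i}"} for i in range(10, 0, -1)]
listing = CappedListing(posts)
out_dir = tempfile.mkdtemp()
archived_ids = archive_and_delete(listing, out_dir)
print(len(archived_ids), "posts archived to", out_dir)
```

Without the `delete` call the loop would keep seeing the same three posts forever, which is the wall the original bot ran into.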
StackOverflow: *grabs money by monetizing massive amounts of user-contributed content without consulting or compensating the users in any way*
Users: *try to delete it all to prevent it*
StackOverflow: *your contributions belong to the community, you can’t do that*
Pretty fucked-up laws. A lot of lawsuits are going on right now against AI companies over similar issues. In this case, StackOverflow is entitled to be compensated for its partnership, and because the answers are all CC BY-SA (the exact version depends on when they were posted), no one can complain. Now, that SA? Whatever.
That SA part needs to be tested in court against the AI models themselves.
A lot of this shittiness would probably go away if there was a risk that ingesting certain content would mean you need to release the actual model to the public.
Yeah, their assumption though is you don’t? Neither attribution nor sharealike, not even full-on all-rights-reserved copyright is being respected. Anything public goes and if questions are asked it’s “fair use”. If the user retains CC BY-SA over their content, why is giving a bunch of money to StackOverflow entitling OpenAI to use it all under whatever terms they settled on? Boggles me.
Now, say, Reddit’s Terms of Service state clearly that by submitting content you are granting them “a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness (…) in all media formats and channels now known or later developed anywhere in the world.” Speaks volumes on why alternatives to these platforms (like Lemmy) matter.
Like AI doesn’t know how to use the Wayback Machine?
Good luck with the deleting. It often just means:
UPDATE comments SET is_deleted = 1 WHERE ID = 666;
They are not deleting, they are editing. So the platform would have to undo those edits rather than just flipping the visibility flag.
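To make the difference between the two concrete, here is a small sketch using an in-memory SQLite database. The `comments` table and `is_deleted` flag come from the `UPDATE` above; the `body` column, the second row, and the replacement text are assumptions for illustration. It also assumes the table keeps no revision history, which a real platform may well do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, body TEXT, is_deleted INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO comments (id, body) VALUES (666, 'original answer')")
conn.execute("INSERT INTO comments (id, body) VALUES (667, 'another answer')")

# Soft delete: the row is only hidden, the stored text is untouched.
conn.execute("UPDATE comments SET is_deleted = 1 WHERE id = 666")

# Edit: the stored text itself is overwritten.
conn.execute("UPDATE comments SET body = '[removed by author]' WHERE id = 667")

# The platform can undo the soft delete by flipping the flag back.
conn.execute("UPDATE comments SET is_deleted = 0 WHERE id = 666")

# Each (id, body) row becomes one dictionary entry.
rows = dict(conn.execute("SELECT id, body FROM comments"))
print(rows)
```

Flipping the flag brings comment 666 back unchanged, while comment 667 stays overwritten unless the platform restores it from a backup or an edit history.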
And they are. 😞
This isn’t really comparable to reddit, since users can just send a request to SO for all the content. Reddit locking down the API meant we lost access to our content.
If you get something for free, you are the product
SO has mechanisms in place to filter out AI-generated content.
Ah, I think I got the source of misunderstanding: these mechanisms are not automated, but implemented as moderation guidelines and rules.
This feels a little iffy to me. It rings of what happened with reddit.
This is a violation of GDPR, no?
EDIT: User-created content is not directly protected under GDPR; only personally identifiable data is protected under GDPR.
Dunno. GDPR is a Europe-only thing, and isn’t it only related to how your private data (like name, IP address, phone number) is handled?
How does GDPR get away with not defining what a website is when referring to them directly in the law? Like what counts, only html? http? ftp? gopher?
So what is the stack overflow replacement?
I mean, here is a thought: if an AI tool uses Creative Commons data, then its derivatives fall under Creative Commons. I.e. stop charging for AI tools and people will stop complaining.
This shit scares me. It will become so easy to rewrite history from here. Just delete anything you don’t like and have an AI rewrite it into whatever you want. Entire threads rewritten. A company can go back and change your entire post history in ways that might be legally compromising.