• stembolts@programming.dev
    69 upvotes · edited · 6 months ago

    This is similar to when I heard reddit was doing the API lockdown, I wrote an automation bot over the weekend that self-destructed my subreddit and the entire post history. The bot also automatically downloaded and archived all of the content on my local machine.

    It was annoying because at first I couldn’t get access to older posts: at the time, reddit had changed their API to only show the first X posts (100 or 1,000 or whatever). So I told my bot to delete each post as it archived it. As content was deleted, reddit had no choice but to populate the page with the older posts.

    And that’s how I archived my subreddit. Reddit banned me two days later for automation, lol. I did not break any of the reddit or reddit api ToS during this process but I guess I upset someone.
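
    For anyone curious, the archive-then-delete loop described above can be sketched roughly like this. It is a toy simulation, not the actual bot: `CappedListing`, `visible`, `delete`, and the cap of 3 are hypothetical stand-ins for a listing endpoint that only exposes the newest few posts, and no real reddit API calls are made.

```python
from dataclasses import dataclass


@dataclass
class CappedListing:
    """Hypothetical stand-in for an API that only exposes the newest `cap` posts."""

    posts: list[str]  # newest first
    cap: int = 3

    def visible(self) -> list[str]:
        # The listing never returns more than `cap` items, however many exist.
        return self.posts[: self.cap]

    def delete(self, post: str) -> None:
        self.posts.remove(post)


def archive_and_delete(listing: CappedListing) -> list[str]:
    """Archive each visible post, then delete it so older posts surface."""
    archive: list[str] = []
    while True:
        page = listing.visible()
        if not page:
            break  # nothing left to surface
        for post in page:
            archive.append(post)  # save a local copy first
            listing.delete(post)  # then delete, freeing a slot in the listing
    return archive


listing = CappedListing(posts=[f"post-{n}" for n in range(10, 0, -1)], cap=3)
archive = archive_and_delete(listing)
```

    Each pass through the loop only ever sees three posts, but because every archived post is deleted, the next pass sees the next three, until the listing is empty.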

  • verassol@lemmy.ml
    29 upvotes · 6 months ago

    StackOverflow: *makes money by monetizing massive amounts of user-contributed content without consulting or compensating the users in any way*

    Users: *try to delete it all to prevent it*

    StackOverflow: *your contributions belong to the community, you can’t do that*

    Pretty fucked-up laws. A lot of lawsuits going on right now against AI companies for similar issues. In this case, StackOverflow is entitled to be compensated for its partnership, and because the answers are all CC BY-SA 3.0, no one can complain. Now, that SA? Whatever.

    • 9point6@lemmy.world
      12 upvotes · 6 months ago

      That SA part needs to be tested in court against the AI models themselves

      A lot of this shittiness would probably go away if there was a risk that ingesting certain content would mean you need to release the actual model to the public.

      • verassol@lemmy.ml
        3 upvotes · edited · 6 months ago

        Yeah, their assumption though is you don’t? Neither attribution nor sharealike, not even full-on all-rights-reserved copyright is being respected. Anything public goes and if questions are asked it’s “fair use”. If the user retains CC BY-SA over their content, why is giving a bunch of money to StackOverflow entitling OpenAI to use it all under whatever terms they settled on? Boggles me.

        Now, say, Reddit Terms of Service state clearly that by submitting content you are giving them the right to “a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness (…) in all media formats and channels now known or later developed anywhere in the world.” Speaks volumes on why alternatives (like Lemmy) to these platforms matter.

  • davel@lemmy.ml
    English · 23 upvotes, 3 downvotes · 6 months ago

    Good luck with the deleting. It often just means `UPDATE comments SET is_deleted = 1 WHERE id = 666;`.
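
    A minimal sketch of that soft-delete pattern, using Python's built-in `sqlite3` with an in-memory database. The `comments` table layout and the row contents are assumptions made up for illustration; this says nothing about how any particular site actually stores its data.

```python
import sqlite3

# In-memory database with a hypothetical comments table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, "
    "body TEXT, "
    "is_deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO comments (id, body) VALUES (666, 'my answer')")

# The "delete" only flips a flag; the row itself is untouched.
conn.execute("UPDATE comments SET is_deleted = 1 WHERE id = 666")

# What the site shows to users: rows not flagged as deleted.
visible = conn.execute(
    "SELECT body FROM comments WHERE is_deleted = 0"
).fetchall()

# What is still stored: the full text, plus the flag.
stored = conn.execute(
    "SELECT body, is_deleted FROM comments WHERE id = 666"
).fetchall()
```

    Here `visible` comes back empty while `stored` still holds the original text, which is the whole point: the content is hidden, not gone.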

  • delirious_owl@discuss.online
    13 upvotes · 6 months ago

    This isn’t really comparable to reddit, since users can just send a request to SO for all the content. Reddit locking down the API meant we lost access to our content.

  • baseless_discourse@mander.xyz
    11 upvotes · edited · 6 months ago

    This is a violation of GDPR, no?

    EDIT: user-created content is not directly protected under GDPR; only personally identifiable data is protected under GDPR.

    • lemmyreader@lemmy.ml OP
      English · 5 upvotes · 6 months ago

      Dunno. GDPR is a Europe-only thing, and isn’t it only related to how your private data (like name, IP address, phone number) is handled?

    • refalo@programming.dev
      1 upvote · 6 months ago

      How does GDPR get away with not defining what a website is when referring to them directly in the law? Like what counts, only html? http? ftp? gopher?

  • HexesofVexes@lemmy.world
    3 upvotes · 6 months ago

    I mean, here is a thought: if an AI tool uses Creative Commons data, then its derivatives fall under Creative Commons. I.e., stop charging for AI tools and people will stop complaining.

  • drunkpostdisaster@lemmy.world
    2 upvotes · 6 months ago

    This shit scares me. It will become so easy to rewrite history from here. Just delete anything you don’t like and have an AI rewrite it into whatever you want. Entire threads rewritten; a company can go back and have your entire post history changed in ways that might be legally compromising.