This is surely trivial to detect: if the number of pages on a site exceeds some insanely high threshold, just drop all of that site's data from the training set.
It's not like I can afford to compete with OpenAI on bandwidth, and they're already burning through money without a care.
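The filtering idea above can be sketched in a few lines. This is a hypothetical illustration, not anyone's actual pipeline: the `(site, url, text)` record shape and the cutoff value are assumptions, and the threshold would need tuning per corpus.

```python
from collections import defaultdict

def filter_training_pages(pages, max_pages_per_site=1_000_000):
    """Drop every page belonging to a site whose total page count
    exceeds the cutoff (the "insanely high number" heuristic).

    pages: list of (site, url, text) tuples.
    Returns the surviving pages, order preserved.
    """
    # First pass: count pages per site.
    counts = defaultdict(int)
    for site, _url, _text in pages:
        counts[site] += 1
    # Second pass: keep only pages from sites under the cutoff.
    return [p for p in pages if counts[p[0]] <= max_pages_per_site]
```

A real crawler would count per-site URLs during the crawl and stop fetching once the cutoff is hit, rather than filtering after the fact, since the point is saving bandwidth.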

How much does it cost if one person imagines it, generates an image, shares it, and 10k people see the image and avoid imagining it themselves?