doesn’t it follow that an AI can only generate CSAM if it has been trained on CSAM?
This article even explicitly says as much.
My question is: why aren’t OpenAI, Google, Microsoft, Anthropic… sued for possession of CSAM? It’s clearly in their training datasets.
Training an existing model on a specific set of new data is known as “fine-tuning”.
A base model has broad world knowledge and can generate depictions of things it hasn’t specifically seen, but a fine-tuned model will produce “better” (fucking yuck to even write it) results.
The closer your training data is to your desired result, the better.