A class action lawsuit from publishers and authors targeting Meta's Llama AI directly names the use of pirated library datasets (LibGen et al.) as the training corpus for one of the most widely-used open AI models. The personal naming of Zuckerberg signals intent to pierce the corporate veil. This is one of the highest-stakes AI copyright cases in publishing history — outcomes will shape how the entire industry frames licensing, TDM rights, and AI training data provenance. Essential coverage for P2P Review.

Source: Publishing Perspectives