Nvidia allegedly greenlit the use of pirated books from illegal sources to train its AI models, according to an expanded class-action lawsuit

The capabilities of AI models such as GPT-5, Gemini, Claude, and Grok depend heavily on the size and scope of the datasets used to train them. That dependence has also been the source of multiple lawsuits claiming that the companies performing the training had no right to freely use the data. An expanded class-action case against Nvidia goes one step further, however, with claims that the GPU giant willingly used an illegal source of pirated books to train its models.

As reported by TorrentFreak, an amended complaint filed last week at the district court in Oakland, California, specifically claims that staff at Nvidia contacted a so-called 'shadow library' known as Anna's Archive, a repository of pirated books and other documents.

The plaintiffs cite internal Nvidia communications as evidence, with the filed document purporting to show someone from the data strategy team at Nvidia writing, "we are exploring including Anna's Archive in pre-training data for our LLMs."

The message continues: "We are figuring out internally whether we are willing to accept the risk of using this data, but would like to speak with your team to get a better understanding of LLM-related work you have done."

While Anna's Archive appears not to host any content directly itself, it does act as a 'search engine' for alleged pirate libraries. These third-party hosts aren't exclusively providing access to copyrighted materials, but that content is what they are most infamous for.

The original complaint against Nvidia was filed back in 2024, and as TorrentFreak reported at the time, Nvidia's response was essentially to claim that AI training on such material is not the same as owning an illegally obtained book, or even using it as a human does. "Training measures statistical correlations in the aggregate, across a vast body of data, and encodes them into the parameters of a model," it wrote in response.

In essence, Nvidia is saying that the use of such datasets falls under fair use. Given that the original complaint involved data garnered from another pirated source (Books3), Nvidia may well reuse its 2024 counterargument against the new claims.

Similar claims have been filed against Anthropic and Meta in the past. In Anthropic's case, the judge ruled that while training on the data did fall under fair use, "Anthropic had no entitlement to use pirated copies for its central library." How the case against Nvidia will fare remains to be seen.
