Nvidia reportedly caught scraping AI data from YouTube again

Pelican Press · August 5, 2024


According to a 404 Media report backed by internal Slack chats, emails, and documents obtained by the outlet, Nvidia helped itself to “a human lifetime visual experience worth of training data per day,” as Ming-Yu Liu, vice president of Research at Nvidia and a Cosmos project leader, admitted in a May email.

Unnamed former Nvidia employees told 404 that they had been asked to scrape video content from YouTube and other online sources in order to obtain training data for the company’s various AI products. Those include Nvidia’s Omniverse 3D world generator, self-driving car systems, and “digital human” products.

When those employees asked about the legality of the project, internally named Cosmos, they were assured by management that they had been given clearance by the highest levels of the company to use that content.


The project sought to build a foundation model, akin to Gemini 1.5, GPT-4, or Llama 3.1, “that encapsulates simulation of light transport, physics, and intelligence in one place to unlock various downstream applications critical to Nvidia.”

To do this, project Cosmos allegedly used an open-source video downloader and employed machine learning to IP hop, thereby avoiding YouTube’s attempts to block it. According to emails viewed by 404, project managers discussed using as many as 30 virtual machines running on Amazon Web Services to download 80 years’ worth of full-length and clip-length videos every day.

For its part, Nvidia claims no wrongdoing. “We respect the rights of all content creators and are confident that our models and our research efforts are in full compliance with the letter and the spirit of copyright law,” an Nvidia spokesperson told 404 Media via email. “Copyright law protects particular expressions but not facts, ideas, data, or information. Anyone is free to learn facts, ideas, data, or information from another source and use it to make their own expressions. Fair use also protects the ability to use a work for a transformative purpose, such as model training.”

This is far from the first time that Nvidia (not to mention a vast majority of the rest of the AI field) has taken a “scrape first and maybe ask forgiveness later” approach to its AI training efforts. In July, Nvidia was named alongside Anthropic and Salesforce in another report on the scraping of copyrighted videos.

At CES 2024, the company set off an internet firestorm with its ambiguous answers about how its new generative AI engine for gaming was trained. In response, Nvidia reiterated that its tools were “commercially safe.”
