OpenAI and Google reportedly used transcriptions of YouTube videos to train their AI models



OpenAI and Google trained their AI models on text transcribed from YouTube videos, potentially violating creators’ copyrights, according to The New York Times. The report, which describes the lengths OpenAI, Google and Meta have gone to in order to maximize the amount of data they can feed to their AIs, cites numerous people with knowledge of the companies’ practices. It comes just days after YouTube CEO Neal Mohan addressed, in an interview, OpenAI’s alleged use of YouTube videos to train its new text-to-video generator, Sora.

According to the NYT, OpenAI used its Whisper speech recognition tool to transcribe more than one million hours of YouTube videos, which were then used to train GPT-4. It was previously reported that OpenAI had used YouTube videos and podcasts to train the two AI systems. OpenAI president Greg Brockman was reportedly among the people on the team involved. “Unauthorized scraping or downloading of YouTube content” is not allowed under the platform’s rules, Matt Bryant, a spokesperson for Google, told the NYT, adding that the company was unaware of any such use by OpenAI.

The report, however, claims there were people at Google who knew but did not take action against OpenAI because Google was using YouTube videos to train its own AI models. Google told the NYT it only does so with videos from creators who have agreed to take part in an experimental program. Engadget has reached out to Google and OpenAI for comment.

The NYT report also claims Google tweaked its privacy policy in June 2022 to more broadly cover its use of publicly available content, including Google Docs and Google Sheets, to train its AI models and products. Bryant told the NYT that this is only done with the permission of users who opt into Google’s experimental features, and that the company “did not start training on additional types of data based on this language change.”




