NVIDIA’s new AI model Fugatto can create audio from text prompts

Pelican Press · November 25, 2024

NVIDIA has debuted a new experimental generative AI model, which it describes as “a Swiss Army knife for sound.” The model, called Foundational Generative Audio Transformer Opus 1, or Fugatto, can take commands from text prompts and use them to create audio or to modify existing music, voice and sound files. It was designed by a team of AI researchers from around the world, and NVIDIA says that made the model’s “multi-accent and multilingual capabilities stronger.”

“We wanted to create a model that understands and generates sound like humans do,” said Rafael Valle, one of the researchers behind the project and a manager of applied audio research at NVIDIA. In its announcement, the company listed some real-world scenarios in which Fugatto could be useful. Music producers, it suggested, could use the technology to quickly generate a prototype for a song idea, which they can then easily edit to try out different styles, voices and instruments.

People could use it to generate materials for language learning tools in the voice of their choice. And video game developers could use it to create variations of pre-recorded assets to fit changes in the game based on the players’ choices and actions. In addition, the researchers found that the model can accomplish tasks not part of its pre-training, with some fine-tuning. It could combine instructions that it was trained on separately, such as generating speech that sounds angry with a specific accent or the sound of birds singing during a thunderstorm. The model can generate sounds that change over time, as well, like the pounding of a rainstorm as it moves across the land.

NVIDIA didn’t say if it will give the public access to Fugatto, but the model isn’t the first generative AI technology that can create sounds out of text prompts. Meta previously released an open source AI kit that can create sounds from text descriptions. Google has its own text-to-music AI called MusicLM that people can access through the company’s AI Test Kitchen.
