DeepMind’s V2A Creates Video Soundtracks and Dialogue

DeepMind, Google’s AI research lab, is developing a new technology that generates soundtracks and dialogue for videos.


In a recent blog post, DeepMind stated that the video-to-audio (V2A) technology is a crucial component in the field of AI-generated media. 

Although many companies, including DeepMind, have built video-generation models, these systems cannot produce synchronized sound effects. 

“Video generation models are advancing at an incredible pace, but many current systems can only generate silent output,” DeepMind writes. “V2A technology [could] become a promising approach for bringing generated movies to life.” 

How Does V2A Work? 

V2A technology takes a video together with a text description of the desired soundtrack and generates matching music, sound effects, and dialogue. The output is watermarked with SynthID, the technology the AI research company developed to fight deepfakes. 

According to the company, the AI model behind V2A is a diffusion model trained on a combination of sounds, dialogue transcripts, and video clips. 

DeepMind did not disclose whether the training data was copyrighted or whether its creators gave their consent. 

The technology itself is not new to the market; Stability AI recently released a similar tool, and other tools already generate sound effects. What sets V2A apart is its ability to understand raw video pixels and automatically synchronize the generated sounds with the video, even without a text description. 
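
To make the described pipeline more concrete, here is a minimal, purely illustrative sketch of how a diffusion-style video-to-audio system could be wired together: video frames and an optional text prompt are encoded, and an audio representation is iteratively refined from noise while conditioned on both. This is not DeepMind’s code or API; every function name, dimension, and weight below is a hypothetical stand-in chosen for the example.

```python
# Illustrative sketch only: NOT DeepMind's implementation.
# It mirrors the general shape of a diffusion-style video-to-audio pipeline
# (video frames + optional text prompt -> iteratively denoised audio)
# using toy stand-in encoders and random weights.
import numpy as np

RNG = np.random.default_rng(0)

def encode_video(frames: np.ndarray) -> np.ndarray:
    """Toy stand-in for a video encoder: pool pixels per frame, project to features."""
    per_frame = frames.reshape(frames.shape[0], -1).mean(axis=1)        # (T,)
    return np.tanh(per_frame[:, None] * RNG.standard_normal((1, 64)))   # (T, 64)

def encode_prompt(prompt: str | None) -> np.ndarray:
    """Toy stand-in for a text encoder; the prompt is optional, as in V2A."""
    if not prompt:
        return np.zeros(64)
    seed = abs(hash(prompt)) % (2**32)
    return np.random.default_rng(seed).standard_normal(64)

def denoise_step(audio_latent, video_feat, text_feat, t):
    """Toy 'denoiser': nudges the audio latent toward the conditioning signals."""
    cond = video_feat.mean(axis=0) + text_feat        # combine video + text conditioning
    predicted_noise = audio_latent - cond             # fake noise estimate
    return audio_latent - (1.0 / t) * predicted_noise

def video_to_audio(frames, prompt=None, steps=50, audio_dim=64):
    video_feat = encode_video(frames)
    text_feat = encode_prompt(prompt)
    audio_latent = RNG.standard_normal(audio_dim)     # start from pure noise
    for t in range(steps, 0, -1):                     # iterative diffusion-style refinement
        audio_latent = denoise_step(audio_latent, video_feat, text_feat, t)
    return audio_latent                               # would be decoded to a waveform

if __name__ == "__main__":
    fake_video = RNG.random((24, 16, 16, 3))          # 24 frames of 16x16 RGB
    latent = video_to_audio(fake_video, prompt="footsteps on gravel, light rain")
    print(latent.shape)
```

In a real system the toy encoders and denoiser would be large trained networks, and the final latent would be decoded into an audio waveform that stays in sync with the video; the sketch only shows how the video and optional text prompt both condition each refinement step.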

Between Innovation and Responsibility  

Despite its potential, the company acknowledges that the new technology has limitations. The model struggles with videos that contain artifacts or distortions, which results in lower-quality audio. Additionally, the generated sounds can sometimes come across as conventional and unconvincing. 

Due to these limitations, and to prevent misuse, DeepMind does not plan to release V2A anytime soon. For now, it is gathering feedback from creators and filmmakers to further improve the technology, which it says will also undergo safety assessments and testing. 

Indeed, the AI research company believes V2A will be most useful to archivists and those who work with historical footage. In a remarkable admission, however, it also acknowledges that such AI tools could upend the film and TV industry and threaten workers’ livelihoods, and it says strong work protections will be needed to ensure the tools do not replace or eliminate certain professions. 


Inside Telecom provides you with an extensive list of content covering all aspects of the tech industry. Keep an eye on our Intelligent Tech sections to stay informed and up-to-date with our daily articles.