OpenAI Prioritizes Safety for its AI Speech Generator
OpenAI has created Voice Engine, an AI speech generator that can replicate a person’s voice from a single 15-second audio clip.
The text-to-voice tool was designed to sound like the original speaker, making the generated speech remarkably human-like.
However, due to concerns regarding the potential misuse of the new technology, the company decided not to release it to the public just yet.
The US company is deploying the tool on a small scale to gather insights into how it is used and to develop safeguards against misuse. The tool is currently available to selected partners, such as Age of Learning, HeyGen, and Lifespan, all of which have agreed to usage policies prohibiting impersonation without consent.
OpenAI also acknowledges that generating speech that resembles real people’s voices carries serious risks, especially given that 2024 is an election year.
Voice-cloning AI is not new and has already been used in troubling ways. For instance, ahead of a U.S. primary vote in January, voters received AI-generated robocalls impersonating President Joe Biden, advising them to stay home and refrain from voting.
To prevent the misuse of voice cloning, the US Federal Communications Commission (FCC) has recently banned AI-generated robocalls. Voice cloning, like other deepfake technology, not only poses risks to elections but can also be used for fraud and extortion scams.
Despite these risks, Voice Engine has beneficial applications. It can assist individuals with speech impairments by restoring their voice using videos or audio recordings from before they lost the ability to speak. It also offers a more natural-sounding voice for people who struggle to speak, thereby avoiding the artificial sound of conventional speech synthesis.
For accountability purposes, OpenAI is introducing watermarking to trace the source of generated audio and is requiring partners to secure explicit and informed consent from the original speaker. The company is also promoting voice authentication experiences to confirm the speaker’s identity and advocating for a “no-go” list to block the creation of voices that closely resemble those of prominent figures.