OpenAI says Voice Engine might be too risky to release

OpenAI says it ran a small-scale test of its new voice cloning product Voice Engine with a few select partners. The results show promising applications for the tech, but safety concerns may keep it from being released.

OpenAI says that Voice Engine can clone a person’s voice from a single 15-second recording. The tool can then generate “natural-sounding speech that closely resembles the original speaker.”

Once a voice is cloned, Voice Engine can turn text inputs into audible speech using “emotive and realistic voices.” That capability makes exciting applications possible, but it raises serious safety issues too.

Promising use cases

OpenAI started testing Voice Engine late last year to see how a small group of select participants could use the tech.

Some of the examples of how Voice Engine test partners used the product are:

Adaptive teaching – Age of Learning used Voice Engine to provide reading assistance to children, create voice-over content for learning material, and provide personalized verbal responses to interact with students.
Translating content – HeyGen used Voice Engine for video translation so product marketing and sales demos could reach a wider market. The translated audio retains the speaker’s native accent, so when a native French speaker’s audio is translated into English, you still hear their French accent.
Providing wider social services – Dimagi trains health workers in remote settings. It used Voice Engine to give training and interactive feedback to health workers in underserved languages.
Supporting non-verbal people – Livox enables non-verbal people to communicate using alternative communication devices. Voice Engine allows these people to choose a voice that best represents them rather than something that sounds more robotic.
Helping patients recover their voice – Lifespan piloted a program offering Voice Engine to people with speech impairments due to cancer or neurologic conditions.

Voice Engine isn’t the first AI voice cloning tool, but the samples in OpenAI’s blog post suggest that it represents the state of the art and may even outperform ElevenLabs’ tools.

Here’s just one example of the natural inflection and emotive characteristics it can generate.

OpenAI just launched Voice Engine,
It uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.
Reference and Generated audio is very close and hard to differentiate.
More details in pic.twitter.com/tJRrCO2WZP

— AshutoshShrivastava (@ai_for_success) March 29, 2024

Safety concerns

OpenAI said it was impressed with the use cases test participants came up with, but that more safety measures would need to be in place before the company decided on “whether and how to deploy this technology at scale.”

OpenAI says technology that can accurately reproduce someone’s voice “has serious risks, which are especially top of mind in an election year.” Fake Biden robocalls and the fake video of Senate candidate Kari Lake are cases in point.

In addition to the clear restrictions in its general usage policies, the participants in the trial had to have “explicit and informed consent from the original speaker” and were not allowed to build a product that enabled people to create their own voices.

OpenAI says it implemented other safety measures, including an audio watermark. It also said it could perform “proactive monitoring” of how Voice Engine is used, though it didn’t explain exactly how.

What’s next?

Will the rest of us get to play around with Voice Engine? It’s unlikely, and maybe that’s a good thing. The potential for malicious use is huge.

OpenAI is already recommending that institutions like banks phase out voice authentication as a security measure.

Voice Engine has an embedded audio watermark, but OpenAI says more work is needed to identify when audiovisual content is AI-generated.

Even if OpenAI decides not to release Voice Engine, others will. The days of being able to trust your eyes and ears are history.

The post OpenAI says Voice Engine might be too risky to release appeared first on DailyAI.

