OpenAI's Johansson gaffe pushes voice cloning into spotlight
OpenAI was forced to apologise to actor Scarlett Johansson last week for using her voice -- or something very similar -- on its latest chatbot, throwing the spotlight on to voice-cloning tech.
Although OpenAI denied the voice they used was Johansson's, their case was not helped by CEO Sam Altman flagging the new model with a one-word message on social media -- "Her".
Johansson voiced an AI character in the film "Her", which Altman has previously said is his favourite film about the technology.
Right from the start, AI voice cloning has proved problematic.
Last year, British firm Elevenlabs went viral for all the wrong reasons when it released its voice-cloning software.
Internet pranksters immediately began pushing out deepfaked celebrities -- Harry Potter star Emma Watson was shown reading Hitler's Mein Kampf.
Law enforcement warned that AI clones could be used to extort money from loved ones over the phone.
The technology has developed rapidly in the past year, becoming far more realistic and nuanced.
Danish entrepreneur Victor Riparbelli, CEO of British AI firm Synthesia, told AFP it was largely down to a program called Tortoise that was released two years ago.
The program's developers threw thousands of hours of voice data into their model in an unstructured way and discovered it not only learnt what to say but how to say it.
"That was a pretty big paradigm shift," Riparbelli said on the sidelines of last week's VivaTech conference in Paris.
Tortoise was an open-source program, and Elevenlabs was the first to go to market using it.
OpenAI uses similar systems, though it does not release any details.
- 'Not very good' -
Much of the controversy around voice cloning has focused on concerns over people misusing the software.
But the claim against OpenAI is unusual because it is the company itself that stands accused of playing fast and loose.
"It was very unfortunate that OpenAI did that -- really not very good," Katya Laine, CEO of TALKR.ai, told AFP at VivaTech.
"If they actually cloned her voice without her knowing then I think that's very very bad," said Riparbelli.
The two entrepreneurs are among hundreds harnessing AI voice programs for uses that they argue will make companies more efficient.
Laine's firm provides virtual voice assistants -- essentially AI customer service agents.
She said her firm's system could now resolve 25 to 30 percent of calls without any human involvement.
Synthesia specialises in video avatars, which Riparbelli said allowed any office worker to turn text or slides into a video performed by a realistic AI.
Both Riparbelli and Laine allow their clients to use their own avatars, off-the-shelf products or those supplied by the likes of OpenAI and Elevenlabs.
Riparbelli said Synthesia used actors whose likenesses and voices were licensed for two years with an option to renew after the initial period.
The problems arise if actors' voices are used without their consent.
- 'Odd precedent' -
The fiasco overshadowed a developer conference in Paris last week where OpenAI was showing off a suite of new tools.
In front of a big screen in an auditorium, Romain Huet, OpenAI's Head of Developer Experience, breezily chatted into his phone.
Seconds later, his short voice sample had been processed and could be heard commentating over a generated video -- in five languages.
The demonstration showed how quickly the field is moving, but the headlines had already been written.
The Washington Post asked in a newsletter "How dumb is OpenAI?", while other commentators suggested wunderkind Altman was nothing more than a huckster.
Nonetheless, Riparbelli was open to OpenAI's argument that it had used another actor who just sounded like Johansson.
"If it's not her but someone who sounds a lot like her... where do you draw that line," Riparbelli asked.
"If they’re not allowed to use someone who sounds a lot like her, then it sets a very odd precedent."
D.Nelson--RTC