shiwaneeg

I am a marketing intern at Valuefirst Digital Media. I write blogs on AI, Machine Learning, Chatbots, Automation etc. for House of Bots.

Baidu's voice cloning AI can swap genders and remove accents

By shiwaneeg |Email | Feb 27, 2018 | 5718 Views

Chinese AI titan Baidu earlier this month announced its Deep Voice AI had learned some new tricks. Not only can it accurately clone an individual voice faster than ever, but now it knows how to make a British man sound like an American woman.

You can insert your own joke here.

Last year, the Baidu Deep Voice research team unveiled its novel AI, capable of cloning a human voice with just 30 minutes of training material. Since then it has gotten much better: Deep Voice can now do the same job with just a few seconds' worth of audio.

The team revealed two separate training methods in a recently published white paper. The first model generates more believable output but requires additional audio input; the second can generate cloned audio much faster, at the cost of quality.
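The trade-off between the two methods can be sketched, very loosely, in plain Python. Everything below (the class names, the toy speaker encoder, the fake "training" loop) is illustrative stand-in code for the general few-shot cloning idea, not Baidu's actual implementation:

```python
# Illustrative sketch of two few-shot voice-cloning strategies.
# All names and logic here are hypothetical stand-ins, not Baidu's API.

class MultiSpeakerTTS:
    """Stand-in for a trained multi-speaker text-to-speech model."""
    def synthesize(self, text, embedding):
        # A real model would condition a neural vocoder on the
        # speaker embedding; here we just return a descriptive string.
        return f"audio({text}, voice={embedding})"

def clone_by_adaptation(samples, steps=100):
    """Speaker adaptation: iteratively fit a speaker representation to
    the new speaker's audio. Slower (needs many update steps) but
    tends to sound more natural."""
    embedding = [0.0, 0.0]
    for _ in range(steps):          # pretend gradient-descent updates
        for s in samples:
            embedding = [e + 0.001 * len(s) for e in embedding]
    return embedding

def clone_by_encoding(encoder, samples):
    """Speaker encoding: a single forward pass through a pre-trained
    speaker encoder. Near-instant, but typically lower fidelity."""
    return encoder(samples)

# Toy "encoder": averages a trivial feature of the audio samples.
toy_encoder = lambda samples: [sum(len(s) for s in samples) / len(samples), 1.0]

model = MultiSpeakerTTS()
samples = ["hello", "world"]        # stand-ins for short audio clips
emb_fast = clone_by_encoding(toy_encoder, samples)
audio = model.synthesize("good morning", emb_fast)
```

The design choice mirrored here is the one the paper highlights: adaptation spends compute at cloning time for quality, while encoding front-loads the cost into training a reusable encoder so each new voice is nearly free.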

Both are faster than Baidu's previous attempts with Deep Voice and, according to the researchers, could be improved even further with tweaked algorithms and broader datasets. In a company blog post, the researchers claim that both demonstrate good performance in naturalness of speech and similarity to the original speaker, even with very few cloning samples.

The purpose of the research is to demonstrate that machines, like people, can learn complex tasks from limited data. Imitating voices may be a specific use case, but it's important for researchers to find ways to shrink models' data and compute footprints by fine-tuning or replacing unwieldy algorithms.

According to the team, humans can learn most new generative tasks from only a few examples, and it has motivated research on few-shot generative models.

Research that furthers the abilities of AI systems while simultaneously reducing the processing power they require is what's propelling the field forward.

The world already has deepfakes, the controversial AI technique that can swap one person's face onto another's body - and, of course, it has been used for unwanted purposes.

Nvidia's AI can generate startlingly realistic photographs of people that don't even exist. We're inching ever closer to a world where you can't believe your own eyes or ears.

Deep Voice isn't perfect, of course; you'll notice the AI's voice sounds a bit robotic. But let's keep in mind that a year ago this was barely possible at all.

Now, we can't be too far from hearing Kurt Cobain's voice sing new music, or learning what Queen Elizabeth would sound like as a male politician from Alabama.

Source: TheNextWeb