Wav2Lip promises to reduce drudgery in video content creation

K V Kurmanath Updated - May 07, 2022 at 02:09 PM.

Talk for a minute and deepfake tech can generate hours of content

Even as celebrities and people in power worry that deepfake technology threatens their reputations, a Hyderabad-based start-up has developed an artificial intelligence-based solution that promises to revolutionise the way media houses, edtech companies and film production houses create visual content.

All it takes is a minute of video footage of a person talking on any subject, and the Wav2Lip solution developed by NeuralSync AI can generate a video of that person talking for hours. The technology makes the person in the video produce lip movements that match the text fed into the system.

“It exactly looks as if that person is talking or reading the text. Why, you can even make the characters in the popular movie Pushpa speak in different (dubbed) languages with perfect lip movement, irrespective of the fact that different dubbing artistes are lending their voices,” Pavan Reddy, Chief Executive Officer of NeuralSync AI, told BusinessLine.

The AI-based solution captures the nuances of how a speaker's lips move to form words and reproduces them when new words or dialogue are fed into the system.

“Unlike similar tools available in the market, it is language and voice independent and works on any face, including CGI faces,” Rudrabha Mukhopadhyay, Founder and CTO of NeuralSync AI, said.

“It started off with Rudrabha’s PhD project on the subject at IIIT-H. We have translated the idea into a commercially viable technology solution. We’re incubated at the institute,” Reddy said.

Feeding the script

The product lets users feed in a script and select an avatar of their choice, or create their own avatar with different backgrounds, and generate videos in minutes.

“Alternatively, users can also upload a video and audio pair to the platform to generate lip-synced videos,” Reddy said.

Within two months of its soft launch, NeuralSync AI has signed two anchor clients, including a US media house that is using it to generate news and analysis videos.

“An edtech company in India too is using the technology to generate videos for their students,” he said.

Mukhopadhyay said the technology reduces operating costs and time to produce a video.

“With AI entering video-production, those mismatched lip movements will be a thing of the past,” he said.

After building a solution for the media and edtech sectors, NeuralSync AI is working on a version to address the needs of the film industry.

Pay per use

NeuralSync AI charges $3-4 a minute for a simple talking-head video. If there is more than one person in a frame, it charges $10 a minute. The start-up is in talks with a few Bollywood and Tollywood filmmakers, who have started making eye-popping pan-India movies.

NeuralSync AI now aims to further enhance the product and scale up globally by joining an international incubator in the US.

Published on May 6, 2022 11:57