In addition to deepfakes and voice cloning, AI software that matches lip movements has also emerged.

巴比特
2 years ago

Source: GenAI New World

Image Source: Generated by Wujie AI

According to overseas media reports, a translation tool called LipDub was officially released last week. The AI program lets video creators present their videos in different languages within minutes.

LipDub was developed by the startup company Captions, founded in 2021 by Gaurav Misra and Dwight Churchill. Captions has received investment support from Sequoia Capital, Andreessen Horowitz, Instagram co-founders Kevin Systrom and Mike Krieger, and former Facebook VP of Product Design Julie Zhuo.

Founder Gaurav Misra, originally from New Delhi, India, was previously the Design Engineering Lead at Snap. Misra stated that he grew up surrounded by various languages such as Hindi, English, Punjabi, and Urdu. He also spent years learning French, which helped him establish professional relationships in Europe, Africa, and the Middle East.

Misra believes that AI-driven translation and lip-syncing technology can help people connect and understand each other more easily.

Captions: Easy Video Translation and Localization with AI


Captions is known for its AI-generated subtitles, speech correction, and technology that corrects video creators' eye position in post-production. Misra and Churchill, the latter a former product developer at Goldman Sachs, had long wanted to incorporate lip-syncing into dubbed translations, but they didn't expect it to happen so quickly. Misra said, "We initially thought this technology would take 10 years to achieve, but the pace of technological development is so fast now, with new things appearing almost every month or even every week."

LipDub is entering a promising AI translation market. Its competitors include voice cloning translation apps HeyGen and Verbalate, as well as new tools launched by companies such as Spotify and the visual effects studio Monsters Aliens Robots Zombies (MARZ).

In the past, many companies needed to hire multiple video hosts to tell the same story in different languages, but now they can achieve the same result with generative AI. These applications allow users to upload videos and then convert them into fluent Turkish, French, Arabic, or Italian within minutes.

Rijul Gupta, founder of DeepMedia, stated, "We have basically perfected this new technology, where anyone can clone anyone's voice and make it speak in different languages with a 5-second audio reference."

On Spotify and Reddit, voice-over videos of some celebrities have already appeared thousands of times. Last month, Spotify also joined the trend, announcing that they will provide AI-translated podcasts, allowing these podcasts to maintain their original tone and intonation while being translated into different languages.

Currently, podcasts by actors Dax Shepard and Kristen Bell, MIT researcher Lex Fridman, and Steven Bartlett are available in Spanish, and French and German translations will be available soon. The new Spotify tool uses OpenAI's latest speech generation technology to provide a more realistic listening experience.

At the beginning of last year, Misra and the Captions team began experimenting with lip-syncing technology and tested its effectiveness in the Captions application with partners.

Misra admitted that lip-syncing technology developed faster than he had expected. "It looks like a natural progression into the next stage, creating a kind of video that doesn't look like it's dubbed or artificially adjusted. The new technology makes the video look very natural and easy to understand." From the earliest tests, the team saw a new possibility open up. Misra said, "It's like the technology we saw in 'Star Trek.' It's simply science fiction!"

Captions received $25 million in Series B funding led by Kleiner Perkins in June this year. Currently, Captions has 100,000 daily active users, and Misra believes that number will grow after the launch of LipDub.

Currently, LipDub supports 28 languages, including Korean, Spanish, Czech, Tamil, and Ukrainian. It works in a zero-shot mode, meaning Captions' video generation model can produce fluent videos without having seen the subject before.

LipDub's internal machine learning algorithm has been trained to recognize the speaker's lip movements, and the company also uses OpenAI's GPT-4 model to translate videos into different languages and dialects in the application. This AI dubbing technology has been used in the Captions application and was released in March this year, attracting users from around the world.

Misra said, "People who previously could not reach specific audiences can now do so. This technology is a perfect example of a utopian future, so I am very excited about it."

Misra believes that the possibilities of the new technology are endless. "I think live streaming is a very reliable case," Misra said. "Whether it's game live streaming on Twitch or unknown speeches, these types of content can easily be localized through AI."

HeyGen: Making Video Spread in Different Languages as Easy as Typing

In addition to Captions, there are many other AI translation companies, such as HeyGen. HeyGen is an AI company with millions of users and is one of the largest participants in the field of AI lip-syncing and translation for short video content. After launching video translation functionality on September 7, the company quickly gained popularity on X. Since then, dozens of realistic videos have gone viral online, with users sharing clips of Elon Musk, Messi, and Mark Zuckerberg speaking in multiple foreign languages.

Mark Burginger, head of the toy company Qubits, once promoted his STEM-centered company on a show called "Shark Tank." Out of curiosity, he tried HeyGen's AI translation feature on September 13. He posted a video of himself speaking in Spanish on X, even though he doesn't speak Spanish.

"Can you imagine a small toy company with an annual income of less than a million dollars being able to use these relatively inexpensive tools?" said Burginger, an artist and inventor who works in Hendersonville, North Carolina. "This helps to compete fairly with large companies."

HeyGen's goal is to "eliminate language barriers," said Joshua Xu, the company's co-founder and CEO. "We envision a future where creating video content and spreading information in different languages is as easy as typing."

In an AI-generated video posted on X, Xu added that educational platforms such as Coursera, Khan Academy, and MasterClass can expand their influence through "multilingualism." HeyGen currently supports 10 input languages and 8 output languages, including English, Spanish, Chinese, Italian, Hindi, and Japanese.

Joshua Xu, CEO of HeyGen

Before founding HeyGen (originally named Movio), Xu, formerly of Snap, and Wayne Liang, a former ByteDance engineer, founded Surreal in 2020.

At that time, Surreal offered realistic "deepfake" products, a video synthesis technology that can create lifelike synthetic videos. This technology attracted e-commerce companies hoping to promote products more effectively. After operating in Shenzhen, China for four months, Surreal received a $1 million investment in an angel round. Today, Surreal is still active in China, posting job and internship information on Chinese employment and university websites, but the HeyGen platform is mainly operated in Los Angeles, where Xu and Liang work.

Movio, an AI video platform based on the Surreal engine, was launched in July 2022. According to the company, its product generated $1 million in revenue within 7 months. Xu and Liang then renamed Movio to HeyGen, and since 2020, HeyGen and Surreal have received at least $9 million in funding from Sequoia Capital, IDG Capital, ZhenFund, and Baidu Ventures.

Inspired by Podcasts, Verbalate was Born

In addition to LipDub and HeyGen, another platform, Verbalate, has also entered this field and can dub users' videos into a target language. The difference is that Verbalate can dub videos up to 30 minutes long.

According to the platform's founder Grant Davies, Verbalate was born purely out of boredom during the pandemic. One day in 2022, Davies was listening to a podcast interview between Joe Rogan and MrBeast while riding a bike. In it, the YouTuber mentioned that his channel was using voice actors to dub videos into languages such as Spanish, Russian, Hindi, and Portuguese, because less than 10% of the world speaks English. Davies was researching AI technology at the time, and he felt that his team could definitely achieve this without much difficulty.

Davies used his marketing network to introduce and sell Verbalate's services to business clients who wanted to communicate with overseas employees. Dom Procter, founder of the Sydney-based outsourcing company OutSourced Staff, said, "It made my life easier as a sales and marketing person."

Procter has used Verbalate videos to send messages to remote employees in Asia and Eastern Europe. "Creating content in their native language changed the game," he said. Verbalate's basic subscription plan costs $9 per month and allows users to create a 10-minute video, with each additional minute costing $1. In contrast, HeyGen's creator package costs $29 per month and allows the creation of multiple videos, each up to 5 minutes long.

Other platforms are targeting larger markets and longer videos. Toronto-based MARZ primarily attracts film and television production companies interested in realistic dubbing through its LipDub AI platform (not to be confused with Captions' LipDub).

LipDub AI currently processes video clips with a runtime of less than 20 minutes containing multiple shots. Although the company currently uses training clips to create these dubs, it hopes to speed up processing by abandoning training clips and relying solely on audio and original clips within the year. Unlike other dubbing platforms, LipDub AI does not use large language models but uses its own generative model trained on recordings.

Tim Reyes, Market Director at MARZ, believes that lip-syncing technology will help filmmakers expand the influence of movies or TV shows without jeopardizing the job security of actors. Reyes said, "LipDub AI has actually opened up a bunch of opportunities for a new market, unlike some other AI technologies that disrupt the current workflow in the film industry."

In addition to opening up new markets, the creators of these applications also have loftier ideals. Davies hopes that translation programs like Verbalate can break people's implicit biases about their own language and even foster a more globalized way of thinking. Davies said that in a video shared by his team on X, people from different regions can be seen expressing their views in different languages, helping him think about how people can communicate across borders. Davies believes this could make people more humane, as people from different cultural backgrounds can better understand each other.

Davies said that even political information has a different effect when presented in your own language, and if we can listen to each other, it could help humanity.

