"AI Sun Yanzi" demonstrated the ability of AI voice imitation, Miao Ya Camera showcased the AI image processing skills, and HeyGen presented the multilingual ability of AI through "English cross-talk."
Author: Mu Mu
Have you listened to Guo Degang's new cross-talk? The kind where he speaks English fluently.
Recently, a video of "Guo Degang speaking English cross-talk" went viral on social media platforms. In the video, Guo Degang not only spoke English fluently with accurate pronunciation and natural mouth movements, but also made few grammatical errors.
In fact, this viral video was a creation involving AI technology. This "authentic translation without a translated tone" was highly praised by netizens, with many feeling that even human voice dubbing could not achieve such lifelike effects.
On closer inspection, the company behind this popular video turns out to be a Chinese firm called Shi Yun Technology. Its product HeyGen, the tool that rendered Guo Degang's cross-talk in English, is the company's bid to build a video production business on AI translation and voice imitation.
If "AI Sun Yanzi" demonstrated AI's ability to imitate voices and Miao Ya Camera showcased its image processing skills, then HeyGen has presented its multilingual ability through "English cross-talk."
Once presented as high-end technology, artificial intelligence is now reaching the public in forms with mass appeal. But once the novelty wore off, the "AI Sun Yanzi" craze cooled, and Miao Ya Camera faded because it served neither a frequent nor an essential need. Will HeyGen follow suit? Can it really address the pain points of video production?
AI Voice Imitation Evolves to Speak Foreign Languages
In October of this year, the video of "Guo Degang speaking English cross-talk" went viral across the internet, with millions of views on Bilibili, quickly leading to content creators making videos of celebrities speaking foreign languages.
As a result, Guo Degang not only spoke English cross-talk, but also conducted an English interview with Ben Shan, making the interview program more international. "Yu Qian" can now even rap in English, while "Taylor Swift" and "Emma Watson" can fluently converse in Chinese during interview programs.
This is not simply adding foreign language subtitles or dubbing, but actually enabling the characters to speak fluent foreign languages, with voices and mouth movements matching perfectly. Such videos have also gained popularity on overseas video platforms.
The viral translated dubbing videos were made possible by the AI tool HeyGen, showcasing its ability in language translation and once again demonstrating AI voice imitation, earning high praise from netizens.
After waiting behind 7,000 videos in the queue, netizen @Gorden Sun uploaded source footage to HeyGen and created a video of Taylor Swift speaking Chinese. He said that "the effect is absolutely the best, without a doubt," while also noting that "the voice cloning has some flaws" and "the emotional restoration is somewhat lacking." In his experience, the flaws are minor compared to the overall effect.
Netizens queued for 7000 videos to generate a video of Taylor Swift speaking Chinese
With the help of HeyGen, users only need to upload a video, select the language, and the tool automatically translates, adjusts the tone, and generates a foreign language video with matching mouth movements.
Soon, a large number of entertaining videos made with AI translation and voice imitation emerged, many of them reaching millions of views. HeyGen saw a surge in traffic as well: at its peak, tens of thousands of videos were queued, a wait longer than the one Miao Ya Camera users faced when generating portrait photos.
It is worth noting that HeyGen is developed by a Chinese company called Shi Yun Technology, established in November 2020. According to the company's official website, their products include AI translation and voice imitation, AI digital avatar generation, and AI script generation services.
Tianyancha data shows that Shi Yun Technology has completed two rounds of financing in the millions of dollars. In March 2021, Shi Yun Technology received angel investment from Sequoia China Seed Fund and ZhenFund, followed by another round of pre-Series A financing in August of the same year, led by IDG Capital, with follow-on investments from Sequoia China and ZhenFund.
HeyGen's goal is to become the Midjourney of the AI video creation field. Currently, the team behind HeyGen consists of approximately 30 people. Although HeyGen has not yet reached Midjourney's user base, it has become the latest popular AI application in the domestic market following "Miao Ya Camera."
According to statistics compiled by a netizen on the social platform X, traffic to major AI image and video creation websites began to decline in August and September, but HeyGen's traffic rose against the trend, increasing by as much as 92%.
HeyGen's traffic rose against the trend
Founder Joshua Xu revealed that the product reached an annual recurring revenue (ARR) of $1 million within 7 months of HeyGen's official launch and sustained 50% month-over-month growth for 9 consecutive months.
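To get a sense of what that growth rate implies, the short Python sketch below compounds a fixed 50% monthly rate over nine months. It assumes the rate was exactly 50% every month and applied to a single metric; the source gives neither the metric nor a starting value, so only the relative multiple is computed, and the function name is purely illustrative.

```python
def cumulative_multiple(monthly_growth: float, months: int) -> float:
    """Return the total growth multiple after compounding a fixed monthly rate."""
    return (1.0 + monthly_growth) ** months


# Illustrative inputs: 50% month-over-month growth sustained for 9 months.
multiple = cumulative_multiple(0.50, 9)
print(f"{multiple:.1f}x")  # prints "38.4x"
```

Under these assumptions, nine months of 50% growth would multiply the starting figure roughly 38 times.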
Launching a Paid Version: Does It Address the Pain Points of Video Creators?
As its traffic continues to climb, can HeyGen keep its momentum? That depends on whether it can address the pain points of video production.
The AI portrait tool "Miao Ya Camera" was once hailed as able to "beat Haimati," the photo studio chain whose name literally means "hippocampus." Yet Haimati is thriving today, while the mini-program version of Miao Ya Camera saw its traffic indicators drop sharply after a brief peak of about 3 months. According to WeChat Index, the index for "Miao Ya Camera" has returned to the level it was at before the app became popular.
"Miao Ya Camera" experienced a short-lived popularity
Miao Ya Camera drew attention for its portrait feature, but its popularity was short-lived because portrait photos are neither a high-frequency nor an essential need for most consumers. The app charged for its service yet never built features beyond portrait photos, which sharply reduced users' willingness to pay and left it to be discarded after use.
HeyGen gained popularity once again as the public rediscovered the highlights of AI in short video entertainment, thus entering the toolkit of video creators. But can this tool really address the pain points of video creators?
On platforms such as Zhihu and Douyin, many video bloggers have shared the real pain points of video production. Behind the viral videos are high-cost inputs in script creation, shooting, and post-production editing. While AI productivity can solve cost issues, creativity still requires human input.
Currently, HeyGen mainly provides four functions, allowing users to create various types of videos using AI video tools for purposes such as product marketing, content marketing, sales promotion, and learning and training. Users can use the platform's built-in digital avatars, real avatars, or AI-drawn avatars to make characters speak different languages. Currently, HeyGen supports over 40 languages.
HeyGen is clearly trying to extend the product into a range of video creation applications. Its aim, however, does not appear to be solving the pain points of video creators, but rather using AI voice imitation and translation to help video content spread across borders and regions.
Currently, HeyGen has launched both a free version and a paid version. The cheapest paid version requires a monthly fee of $24 and will gradually open up API access, team collaboration, and enterprise features. The free version is limited to generating 1-minute videos and requires a long wait in the queue for generation.
It is clear that HeyGen's main source of revenue is business customers. At the end of October, it launched a commercial version with new features, including generation of content up to 3 hours long, image quality up to 4K, help with creating slide decks, text-to-video conversion, support for audio uploads, and video sharing. The commercial version of HeyGen can meet various needs in industries such as advertising, e-commerce, and news.
Even after the upgrade, HeyGen still focuses on use-case scenarios and sidesteps the core needs video creators face in the creative process.
There are many other AI tools for video production, almost all of which target the production process. For example, Pictory.AI can convert scripts directly into videos and use AI to match voiceover with footage and music. Applications such as Tencent Zhiying, Yiframes, and Wancai Weiyin use AI technology to simplify video creation and provide functions such as text dubbing, article-to-video conversion, and digital avatar narration.
However, no AI tool for video production can avoid copyright issues, which are among the most daunting challenges for video creators. Even HeyGen, which has evolved from AI voice imitation to translation, cannot solve the copyright problem, leaving this difficult issue for video creators to handle.
Currently, HeyGen is widely used for secondary creation in short videos, such as AI voice swapping. In response, some lawyers have stated that using AI technology to replace someone else's voice, create "translations," and publish the resulting videos may infringe on copyright, portrait rights, and voice rights. For example, cross-talk and sketches are protected "works" under the Copyright Law of the People's Republic of China. If netizens use AI software to "translate" cross-talk and sketches into other languages, they need authorization from the copyright owner; otherwise, they may be infringing.
In addition, if netizens use someone else's image to create videos and publish them online, they need the consent of the portrait rights holder; otherwise, it may constitute infringement. Finally, regarding voice rights, the Civil Code of the People's Republic of China provides that the protection of a natural person's voice is subject to the relevant provisions on portrait rights protection. In other words, using someone else's voice requires the consent of the voice rights holder.
From the end of last year to the present, the AI magic box opened by ChatGPT continues to demonstrate new magic. It seems that humans have obtained a ticket to artificial intelligence, but it may still be a long wait before smoothly boarding this high-speed train of productivity improvement.