Four major money-making models for large models: who will win?

Source: Whale Selection Society Pro

Author: Yang Xiaohe

Image source: Generated by Wujie AI

In 2023, nearly every investor seems to be asking their portfolio companies the same question: "Can we build a large model?"

No internet company wants to miss this wave of large-model fever. On November 30, as ChatGPT marked its first anniversary, Baidu CEO Robin Li (Li Yanhong) noted that more than 230 large models had already emerged in China. The most direct lure is OpenAI's valuation, which has soared past 90 billion US dollars.

Every company is eager to try. Whether the model has a hundred billion or a trillion parameters, and whether it is self-developed or merely a wrapper around someone else's technology, perhaps matters less than building a large model and holding a press conference.

Investors are urging, industry peers are watching, and customers and users are expecting results. This CEO-level priority project has stayed busy from the start of the year to its end. Even Robin Li's repeated warnings that redundantly building large models is a huge waste of resources did not stop the number of domestic large models from roughly tripling between mid-year and year-end.

Among them, the earliest large models such as Wenxin and ChatGLM, along with a string of flagship projects like Hunyuan and Doubao, drew plenty of public attention. But as that attention fades, a practical problem confronts these companies: how do you make money with a large model?

"Balancing the relationship between technology and commercialization is also a question that investors have been asking." Zhang Peng, CEO of Zhipu AI, recently expressed, "We have always insisted on walking on two legs, because from the first day our company was established, we had customers and revenue, which may not be enough to cover all our costs compared to our investment."

In fact, it is not only start-ups; listed giants are also weighing how to sustain the long-term investment that large models demand. "Whale Selection Society Pro" has summarized four major commercialization routes based on the choices companies are making in the market. Objectively speaking, in this chaotic early period of the large-model market, every kind of commercial exploration is valuable.

First faction: "self-developers" vs. "benchmarkers"; the ecosystem is key

Former Alibaba CEO Zhang Yong once pointed out that developing large models with trillions of parameters is an all-around contest of "AI + cloud computing", spanning algorithms, massive underlying computing power, networking, big data, machine learning, and more; it is a complex piece of systems engineering.

Therefore, only super-scale players like BAT in China can self-develop large models at the trillion or even ten-trillion parameter scale, build the "AI + cloud computing" ecosystem around them, and sell MaaS (Model as a Service).

In March, Baidu launched the Wenxin Qianfan platform, embedded in the Baidu Intelligent Cloud ecosystem; Wenxin Yiyan is one of the large models hosted on it. Earlier data showed that 45 third-party models had been integrated into Wenxin Qianfan. Baidu's goal is to attract more enterprises to develop within the Wenxin ecosystem: even if they never call the Wenxin large-model API, they still consume Baidu Intelligent Cloud services, and Baidu earns revenue from those underlying capabilities.

To that end, Baidu also put up prize money and organized the Wenxin Ecological Cup entrepreneurship competition, in which start-ups such as BuySmart shared tens of millions in prizes.

According to public data released by Baidu, as of August this year the PaddlePaddle and Wenxin ecosystem had 8 million developers. Since Wenxin Yiyan opened to the public on August 31, its user base has exceeded 70 million. The API is the main way native AI applications call large models, and Wenxin's API usage is said to exceed the combined usage of the other 200-plus large models in China.

The other large-model ecosystem in China is Alibaba Cloud's ModelScope (Moda) community. According to Cheng Chen, who runs it, the community, established nearly a year ago, hosts more than 2,300 models, serves over 2.8 million developers, and has passed 100 million model downloads.

It is understood that the standard Wenxin large model charges 0.008 yuan (0.8 fen) per 1,000 tokens, while OpenAI charges $0.002 per 1,000 tokens. The cost of answering a single ChatGPT question is reportedly about $0.03, so the fee is far from covering the cost.
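
To make the gap concrete, below is a minimal back-of-the-envelope sketch in Python. The prices and the per-question serving cost are the figures quoted above; the tokens-per-question value is a hypothetical assumption, so the output only illustrates the direction of the argument (per-token fees alone fall well short of serving costs), not anyone's actual unit economics.

```python
# Back-of-the-envelope token economics using the figures quoted in the article.
# ASSUMED_TOKENS_PER_QUESTION is a hypothetical illustration value, not a reported number.

def revenue_per_question(price_per_1k_tokens: float, tokens_per_question: int) -> float:
    """Fee earned from one question at a given per-1,000-token price."""
    return price_per_1k_tokens * tokens_per_question / 1000

ASSUMED_TOKENS_PER_QUESTION = 1500      # hypothetical prompt + answer length
OPENAI_PRICE_USD_PER_1K = 0.002         # price quoted in the article
REPORTED_COST_USD_PER_QUESTION = 0.03   # per-question serving cost quoted in the article

revenue = revenue_per_question(OPENAI_PRICE_USD_PER_1K, ASSUMED_TOKENS_PER_QUESTION)
print(f"API fee per question:   ${revenue:.4f}")                                    # ~ $0.003
print(f"Reported serving cost:  ${REPORTED_COST_USD_PER_QUESTION:.4f}")             # $0.03
print(f"Margin per question:    ${revenue - REPORTED_COST_USD_PER_QUESTION:.4f}")   # negative
```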

Li Di, CEO of Xiaoice (formerly Microsoft Xiaobing), recently described a similar case. In 2018, Xiaoice provided a sales-assistant service for Lawson, pushing information to 20 million Lawson users. "We drove (called) the service tens of millions of times for Lawson, but Lawson paid us only 100,000 RMB, billed by call volume."

API revenue is therefore hard to scale; set against R&D costs such as training, it is a drop in the bucket. What the API model can do is drive deep integration between the platform and ecosystem companies. On that basis, both Baidu and Alibaba can sell cloud services to these customers, and that part of the business has far better prospects.

At the same time, more commercial routes simply benchmark OpenAI. Many domestic companies train models on ChatGPT-like frameworks and study how ChatGPT monetizes consumer (C-end) users.

Among them, Kai-Fu Lee's 01.AI (Zero One) has been controversial. After a former Alibaba Cloud chief scientist pointed out that its Yi-34B model had been "developed by renaming" parts of an existing architecture, many people concluded the model was a shell. Executives at 01.AI, backed by overseas open-source contributors, countered that the architecture is a product of academic research and the dataset was trained from scratch, which is common practice in the open-source community.

ByteDance was earlier reported to have used OpenAI's output, against OpenAI's terms, to train its own large models. ByteDance responded that since it introduced GPT API usage-compliance checks in April, the practice has stopped.

In fact, only three companies in China have developed large models from scratch: Baidu, Alibaba, and Zhipu. The rest benchmark others to varying degrees, and the benchmarking extends beyond R&D into commercialization.

The Wenxin Yiyan app also offers a paid membership, a source of C-end revenue. Wenxin 4.0 currently costs 49.9 yuan per month on a recurring subscription, roughly one-third of the $20 monthly price of ChatGPT Plus (GPT-4). However, the number of paying members is unclear, as is how willing the market really is to pay.

Charging C-end users is also ByteDance's main commercialization model in the large-model field. On the foundation of its Yunque (Skylark) large model, ByteDance has pushed out multiple consumer chat products: Doubao in China and the AI product ChitChop overseas, with Cici as Doubao's overseas counterpart. Different ByteDance business units are launching their own AI chat products; the Doubao app comes from the Douyin group, so it is heavily promoted on Douyin.

Whether purely self-developed or benchmarked, either route can still win the market. The key is who can build a commercial ecosystem, which is the basis for surviving the next round of elimination.

Second faction: Open source vs. closed source, how to break through the encirclement?

In mobile operating systems, only two camps remain: open-source Android and closed-source Apple (iOS).

The large-model market has the same open-versus-closed contest. Compared with ChatGPT's closed-source model, more and more players are betting on open-source large models, building a closed loop from R&D to commercialization, and their voices are growing louder. When Llama 2 was open-sourced mid-year, it pushed open-source large models to the forefront of public attention.

However, in the overseas open-source ecosystem, benchmark models such as Llama2-70B and Falcon-180B are only "conditionally" open source: Llama 2, for instance, requires a separate license once a customer's products exceed 700 million monthly active users. These models also have obvious weaknesses in Chinese, because Chinese data is scarce in their training sets.

This has created an opening for domestic open-source large models. Zhipu AI and Baichuan Intelligence are the representative start-ups on this route, while Alibaba Cloud is the representative large company.

Zhipu fired the first shot in the open-source, free-use competition. On July 14, Zhipu AI and Tsinghua's KEG lab announced that ChatGLM-6B and ChatGLM2-6B were fully opened for free commercial use by enterprises. According to Zhipu AI's website, a private license for ChatGLM2-6B with no instance limit and unrestricted inference and fine-tuning toolkits is priced at 300,000 yuan per year.

Wang Xiaochuan quickly followed Zhipu into the open-source, free-use war: Baichuan Intelligence successively announced free commercial use of the open-source Baichuan-7B and Baichuan-13B. Although the market rumored that Baichuan was built on the open-source LLaMA models, its data training was solid and the models were well reviewed on Chinese-language tasks.

Overall, free open source is the wiser choice for start-up large-model companies. Smaller-parameter open-source models let start-ups open the door to enterprise customers and market recognition.

After serving a certain number of enterprises, Baichuan later released closed-source models as well. Wang Xiaochuan explained, "Once the model got larger we stopped open-sourcing it, because deployment would be very costly, so we use a closed-source model and let everyone call the API online."

Zhipu's commercialization thinking, on the other hand, is quite distinctive. "I have recently been talking to investors, and the clearest demand and scenario in real AI deployments abroad is code assistance, which probably accounts for more than 50% of willingness to pay," Zhang Peng said. "So we are working on a similar application. We have our own code-generation model, and it currently generates tens of millions of lines of code a day for more than 100,000 users worldwide."

This amounts to pushing one capability of the large model to the extreme in exactly the place where market demand exists, which is what gives the model the ability to make money. "Zhipu AI is working very hard on commercialization and may prepare to list on the Science and Technology Innovation Board (STAR Market) in 2025," a source told AI Whale Selection Society.

Among the giants, Alibaba has taken up the open-source banner. On December 1, Alibaba Cloud open-sourced Tongyi Qianwen's 72-billion-parameter Qwen-72B. So far, Tongyi Qianwen has open-sourced four large language models at 1.8 billion, 7 billion, 14 billion, and 72 billion parameters, plus two multimodal models for visual and audio understanding, claiming "full-size, full-modality" open source.

Public data shows that Qwen-7B was downloaded more than a million times in just over a month, and on September 25 it was followed by the upgraded Qwen-14B. Baichuan Intelligence's open-source Baichuan-7B and 13B models have together passed 5 million downloads, and more than 200 companies have applied to deploy them.

These numbers show that open-source large models have real momentum and broad market recognition. Of course, each company keeps its flagship large-model products closed-source, which is standard industry practice.

What could change this reality is Meta, which is reportedly hard at work on a brand-new open-source large model, free for commercial use, with capabilities targeted at GPT-4 and a parameter count several times that of Llama 2; training is planned to begin in early 2024.

Meta's firm commitment to open source is not just a continuation of the Android playbook; as a latecomer, its large-model ecosystem has to lean on open source to win back position.

In the world of large models, the closed-source "Apple" is undoubtedly OpenAI; whether the open-source leader turns out to be Meta or someone else remains to be seen.

Third faction: general vs. vertical, which is the future?

At present the threshold for general large models is high: without tens of billions in funding it is hard to do both self-development and ecosystem building. Vertical large models can be as small as roughly 10 billion parameters, and building on open-source frameworks lowers the cost further, which gives start-ups an opening.

So, after half a year of large-model fever, the industry has been debating whether the future belongs to general or vertical models.

Recent media statistics show that of 188 domestic large models, 27 are general-purpose, 16 are in finance, 13 in industry, 12 in media, 11 in customer service, 10 in medicine, 10 in government affairs, 9 in education, and 9 in scientific research; 28 are in universities and 21 in the internet sector.

In other words, the general large models, which are the hardest to develop, are the minority; the majority are vertical models. Well-known vertical large models have already been released, such as JD.com's Yanxi and Jingyi, Ctrip's Xiecheng Wenda, and NetEase Youdao's education-focused Ziyue.

Surprisingly, the most well-known company in the vertical large model route is Huawei.

At the World Artificial Intelligence Conference, Huawei's rotating chairman Hu Houkun explained, "Through the '5+N+X' three-layer architecture, Huawei Cloud has built its own large-model foundation. Pangu 3.0 provides customers with a series of foundation models at the 10-billion, 38-billion, 71-billion, and 100-billion parameter scales."

Because outsiders have had little access to the Pangu model, there is plenty of debate about its real capability. Huawei has not gone all-in on general large models; it has chosen instead to build vertical industry models jointly with each industry. So far, Huawei has released seven vertical industry large models on top of Pangu.

Some investors argue that a general large model paired with a vector database of domain documents can already perform well in vertical fields; by contrast, the real barrier for vertical large models is training on proprietary industry data.
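
As a rough illustration of that argument, the sketch below shows the retrieve-then-prompt pattern the investors are describing: domain documents are turned into vectors, the most relevant ones are pulled out for a query, and a general model answers with them in its context. Everything here is simplified and hypothetical; the bag-of-words "embedding" stands in for a real embedding model, and call_general_llm is a placeholder rather than any vendor's actual API.

```python
# Minimal sketch of the "general model + vector database" pattern discussed above.
# The embedding and the LLM call are placeholders, not any real vendor's API.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "Vector database": hypothetical industry documents stored with their vectors.
docs = [
    "Claims for outpatient treatment must be filed within 30 days.",
    "Premiums for commercial vehicle insurance are adjusted annually.",
    "Assisted-driving data is logged per trip and retained for one year.",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k stored documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def call_general_llm(prompt: str) -> str:
    """Placeholder for a call to a general-purpose large model."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"

query = "How long do I have to file an outpatient claim?"
context = "\n".join(retrieve(query))
answer = call_general_llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
print(answer)
```

The point of the pattern is that the general model never needs to be retrained on industry data; the retrieval layer supplies that data at query time, which is exactly why these investors see proprietary training data as the vertical players' only durable moat.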

For example, Haomo Zhixing has launched its own autonomous-driving large model, DriveGPT Xuehu Hairuo. The model has accumulated 1.03 million hours of learning, its users have logged 87 million kilometers of assisted driving, and it is heading toward a data scale of 100 million kilometers. The platform takes real road-driving data, "translates" it into a unified language for DriveGPT, and uses it to train the autonomous-driving model; this is data OpenAI certainly could not feed into ChatGPT.

Of course, building a framework from scratch is hard for vertical players, which gives platforms like Wenxin another important revenue stream: helping enterprises deploy private large models. "A private large-model deployment costs around four to five million yuan; after three months of testing it can go into operation within three to six months," an insider said of the commercialization of Baidu's Wenxin models.

Zhu Xiaohu has noted that at the start of this year a private large-model deployment in China cost 10 million yuan, 5 million by mid-year, and now less than 1 million. Private large models, like private clouds, are a straightforward business transaction, and the vertical models born from them can land in businesses such as healthcare, transportation, and education.

Some people used to believe that while general large models could show emergent intelligence, vertical models would struggle to do so. Industry voices now counter that vertical models trained on unique data occupy a blank space that even the smartest general model would find hard to enter.

In fact, the general-versus-vertical debate hinges on how fast top players like ChatGPT evolve. With the emergence of Q*, super artificial intelligence may not be far off, and such a system could quickly subsume the capabilities of vertical models; but if frontier models do not evolve fast enough, proprietary data will keep vertical models ahead in their domains.

Fourth faction: large models as the foundation supporting AIGC businesses

Applications and products had begun to incorporate AI capabilities before, but the capabilities that large models bring are on another level. In the AI wave that began in 2023, companies upgrading their products with AI fall mainly into two categories.

The first category is giants that own internet products with hundreds of millions of users each. Upgraded with large models, these "traditional" internet products become more attractive. Representative companies include Microsoft, Adobe, and Baidu.

Source: "AI One Year, Ten Years in the World" PPT

Microsoft, thanks to its sustained investment in OpenAI, has stood at the forefront of large-model applications. It was the first to upgrade its search product Bing into the large-model-powered New Bing, and it has brought large models to the Edge browser and the Office suite, positioning Copilot as a future system-level AI assistant.

Adobe, whose business has been squarely hit by AI image generation, released three new-generation Firefly models at the Adobe MAX conference in October: the Adobe Firefly Image 2 Model, Firefly Vector Model, and Firefly Design Model, covering images, vector graphics, and design.

Since the first Firefly image model was released in March this year, the Adobe creative community has used it to generate more than 3 billion images, over 1 billion of them in the past month alone. Adobe says that not only the diffusion model for image generation but even the language model for text is self-developed.

Unlike Microsoft's focus on dialogue and Adobe's on images, Baidu is rebuilding almost all of its businesses with AI, including Baidu Search, Baidu Wenku, Baidu Maps, and Baidu Wangpan. The products differ, but at heart they are all intelligent search, Baidu's original field.

Comparing the three, Microsoft has benefited most from large models: in the year since ChatGPT's release its market value has grown by roughly $800 billion. Adobe, racing to catch the AI wave, has added about $170 billion in market value over the past year. Baidu's market value has fallen by about $10 billion.

The second category is start-ups that already had an AIGC business with an industry advantage and have since launched large-model work to make that original AIGC business more intelligent.

For example, the unicorn Silicon Intelligence released its Emperor large model in May, using LLM technology to train on private-domain knowledge and combining it with Silicon's AIGC digital-human technology.

Recently, Silicon also completed the first full AI re-creation of a well-known popular-science writer, Yan Bojun. Unlike common 2D image cloning, this time more than 1,400 of Yan Bojun's popular-science videos and over one million words of his text were fed into an AI model for training, reproducing both his "body" (image, voice, movements) and his "soul" (knowledge and style of expression), in effect creating a digital life that can keep evolving through AI. That is the power of large models.

Today, the AI Yan Bojun can push popular-science content to his 12 million followers across the internet every day, and Silicon's AIGC digital humans, which the company says have passed a Turing test, produce short videos and livestreams across different fields.

Meitu's core businesses, such as photo editing and design, are closely tied to AIGC. Meitu therefore released version 4.0 of its AI visual large model MiracleVision in mid-year and plans to roll it into Meitu products such as Meitu Xiuxiu, Meitu Beauty Camera, Wink, Meitu Design Studio, and WHEE in January 2024.

Meitu's visual model is updated roughly every two to three months, adding capabilities such as photo-to-cartoon conversion and basic AI poster design. But judging from the incident in which Google's multimodal model Gemini was found to have staged its promotional demo, and from how hard GPT-4's leap from language model to multimodal model has proved, the evolution of visual and other multimodal models is clearly not easy.

Large models are not the direct key to AIGC companies' revenue, but they are a necessity for staying in the game. The plunge in overseas company Jasper's valuation showed that AIGC without its own large model has no moat, a view that has become industry consensus.

On the road to commercializing large models, each faction is playing to its strengths. 360 previously disclosed that its large models had generated roughly 20 million yuan of revenue in half a year; set against R&D spending, that is still modest. No faction has yet shown a decisive advantage.

Of course, a commercialization model is not the ultimate moat: business models are quickly copied, and only core capabilities cannot be.

