Big models are not the pets of giants.

巴比特
1 year ago

Article Source: Huxiu


Author: Xiaoxi



Image Source: Generated by Wujie AI


The battle of large models is intensifying, but many practitioners have more and more questions about large models.

In the field of basic large models, Internet giants such as Tencent, Alibaba, and Baidu have all entered the race. Alibaba, Baidu, iFlytek, and other major companies have each released new versions of their large model products in the past month, with markedly improved technical capabilities. Startups, while iterating on new versions, are also raising money at a frantic pace: Zhipu AI recently announced cumulative financing of 2.5 billion yuan, and Baichuan Intelligence, founded only about half a year ago, has received 350 million US dollars in financing. The investors in these startups include Internet giants such as Tencent, Alibaba, and Meituan.


Does China really need so many basic large models? The companies building them are locked in a frantic race over technical parameters, but what kind of large model does the market actually need? Amid the lively and chaotic scene, more and more people are raising these questions.


To answer these questions, we first need to understand how basic large models make money in the Chinese market. Most people think of large models as chatbots, and many users have started using these products to search for information and organize documents. However, it is difficult for companies to make money from these consumer-facing products; in fact, the larger the user base, the greater the losses. For now, the most practical route to commercializing basic large models is the enterprise (B-end) market: serving companies in retail, finance, manufacturing, and other fields to help them reduce costs and improve efficiency, in exchange for stable commercial returns.


Enterprise demand for basic large models falls into three types: calling large model APIs directly to obtain their capabilities; doing secondary development on top of a large model, tailored to actual business needs; and building AI applications based on large models. These demands test the technical capabilities of basic large model platforms and, even more so, their enterprise service capabilities.


In terms of service capabilities, both basic large model startups and Internet giants have to start from scratch, with no inherent advantages. Platforms that can quickly understand customer needs and provide stable and reliable services can stand out.




Large models cannot be built on hype




With the rise of ChatGPT as the dividing line, the development of domestic large models falls into two starkly different phases.


Before the popularity of ChatGPT, only a small number of companies in China were engaged in the development of basic large models because the technology and service capabilities of large models had not been widely accepted by the market. These companies mainly focused on technical research and service capability accumulation. After the popularity of ChatGPT, a large number of investors and practitioners entered the market, and large models became a new trend.


A hot market easily breeds speculators. They do not dig into the technology; they tell stories and hype concepts. Whatever their actual technology and service capabilities, they lead with a fantastic story to win over the capital market and customers. One large model practitioner joked to Huxiu that many domestic large model companies claim to be only a few months behind GPT-4 when all they have done is train on top of the open-source GPT-2 and then go out to tell stories.


In fact, the capabilities of a large model cannot be developed in a few months of training. A large model is a complex system in which scale matters a great deal: below a certain scale, more intelligent behavior does not emerge. Scaling up training takes a lot of time and repeated debugging. Engineers who have tuned large model training parameters understand the difficulty: no one tells you what to do, you can only explore on your own, and all kinds of unexpected situations arise along the way, each of which takes time to resolve.


In China, the basic large model companies that are truly confident in their technical capabilities started training before ChatGPT became popular. At that time, large models were not well known, and many people neither understood nor believed in them, but the companies that kept investing were very confident in the new technology.


For example, the Beijing Academy of Artificial Intelligence (BAAI, also known as the Zhiyuan Research Institute) launched "Wudao," the first ultra-large-scale pre-training model research project, in 2020, and its 2.0 version was at one point the largest trillion-parameter model in the world. After this year's upgrade, "Wudao" covers basic large models for language, vision, and multimodal tasks, and has entered a stage of comprehensive open-sourcing.


Zhipu AI also developed the GLM pre-training architecture in 2020 and trained GLM-10B, a model with 10 billion parameters. On October 27, Zhipu AI released its self-developed third-generation conversational large model, ChatGLM3, which significantly improves on the previous generation in performance, reasoning ability, and context capacity. ChatGLM3 ranked first across 44 public Chinese and English datasets. Compared with ChatGLM2, its MMLU score increased by 36%, CEval by 33%, GSM8K by 179%, and BBH by 126%.


In addition, in terms of functionality, the domestically developed large models released by Zhipu AI (ChatGLM, CodeGeeX, WebGLM, CogVLM, etc.) are currently the most complete counterpart in China to OpenAI's family of large models, and they power the generative AI assistant "Zhipu Qingyan."



These companies that were among the first to develop large models have a fundamental difference from those that speculate and chase trends. When large model technology had not yet become popular and the market was not so active, they got involved because they understood the value of basic large model technology and the business logic. This difference became very apparent after the popularity of large models. Many companies got involved in consumer-end products for traffic and visibility, while companies like Zhipu AI that were among the first to get involved in large models focused more on the enterprise service field. All their R&D and service capabilities are also based on this strategy, steadily accumulating and developing towards creating value for customers.


The complexity of large models determines that companies that have accumulated technical and service capabilities for a longer time have stronger advantages. As more and more people in the market realize the complexity of large models and the time required for their evolution, large model companies that rely on storytelling and hype will have less and less room to survive, and only companies that seriously accumulate technology and service capabilities can withstand the first wave of competition.




Without a thriving ecosystem, large models have no future




In commercializing large models, whoever is first to deploy in application scenarios that meet real social needs will be able to form a self-sustaining virtuous cycle.


General large models have a wider range of applications, but they are not professional enough to solve specific problems in vertical fields. Vertical large models have stronger capabilities to solve specific problems in specific fields, but their service scope is very limited, which makes it difficult for many vertical large models to achieve a balance between cost and commercial returns, and their development space is limited.


The ultimate goal of large model applications is to be used for life and production, to solve practical problems in work and life, and to improve work efficiency and productivity. Considering the advantages and disadvantages of current general large models and vertical large models, a more suitable approach in the current commercialization process of large models is to use general large models as the foundation, open up technology and service capabilities to retail, financial, manufacturing, and other fields, and cooperate with general large models and related enterprises to jointly build application scenarios.


Constrained by data, computing power, and scenarios, not many large models will manage to break through. At the same time, as a technological foundation, large models play a role very similar to PC and mobile operating systems, and the competition will follow the pattern of "under the big tree, no grass grows": one or two technological foundations dominate the industry, and all application developers have to build on them. A basic large model that cannot form a thriving ecosystem will not be able to sustain its own development.


Looking at the development history of PC and mobile operating systems, the first-mover advantage is very important. When Windows dominated the PC market and iOS and Android divided the mobile field, it became difficult for other operating systems to turn the tables.


The same trend is evident in the large model field. Large models will open up a thriving AI application ecosystem, where personal and enterprise data, capabilities, or applications can quickly become AI plugins, enhancing the capabilities of large models and making them more practical and user-friendly.


Currently, giants such as Baidu and iFlytek are committed to ecosystem construction. Baidu's Intelligent Cloud Qianfan Large Model Platform 2.0 has nearly 10,000 active enterprises, covering more than 400 scenarios in industries such as finance, education, manufacturing, energy, government affairs, and transportation, and iFlytek's Xinghuo Large Model Platform has a developer base of over 700,000.


Some startups with long experience in the large model field were among the first to reap the benefits. Zhipu AI currently has over 1,000 customers and over 100 ecosystem partners, covering scenarios such as media, SaaS, education, and office work. For example, Zhipu AI provides the technical support behind WPS's intelligent document features, such as generating presentations and writing news articles.



In the ecosystem competition among large model platforms, a platform's value to its partners and its ability to grow together with them directly determine how the competition plays out. In the office scenario, generating presentation content, writing articles, and rewriting text in a different style place very high demands on a platform's accuracy and reasoning capabilities. Only large models that reach a certain technical level can support these applications, and the platforms also need to iterate based on user feedback from actual use.


Whether it is a giant company or a startup, even with strong financial and resource capabilities, they still need to accumulate and iterate step by step from scratch. Therefore, in the process of building the ecosystem, the advantage of time is very important. This is also the reason why startups with first-mover advantages and internet giants with stronger financial and resource capabilities can compete on an equal footing.




The battle of large models: Who is more suitable for the Chinese market?




Although the battle of large models is lively and chaotic, the axes of competition behind it are very clear. A platform's technology and service capabilities, together with its ability to build an ecosystem, directly determine the outcome.


Building these capabilities requires time to accumulate and is difficult to achieve overnight, but time accumulation alone is not enough. In addition to the time difference brought about by early action, the first-mover advantage also includes the ability to accurately perceive market demand, that is, to act firmly and quickly along a correct strategy. Strategic swings and taking detours can easily consume the time advantage accumulated by early action.


As more and more large model platforms shift their focus to ecosystem construction, the platform's strategic determination and execution in ecosystem competition will become increasingly important. Whoever can efficiently complete the accumulation from 0 to 1 in various fields and scenarios will have a more obvious advantage. And after a few platforms complete the qualitative change to become super platforms, the competitive landscape will be basically determined.


In the large and complex domestic market, B-end service companies are prone to strategic swings and taking detours. On the one hand, companies in the domestic market are in different regions and have different operating scales, resulting in a significant difference in their understanding of the value of large models for enterprise intelligence. They are willing to invest different resources and costs, making it difficult to find a standardized solution. On the other hand, companies in different fields have different demands for large model capabilities, and even different companies in the same field have varied demands for large models. Under the tug of different demands, large model companies are prone to transform from basic technological foundations into project outsourcing companies, and it is difficult for them to become true super platforms.


In such an environment, domestic large model platforms need to pay more attention to detail in commercial deployment than OpenAI does with its commercialization approach. We can see this trend in the commercialization strategies of some platforms.


For example, in addition to the usual open platform API services, Zhipu AI also offers two other options: private cloud deployment and on-premises (local) private deployment. Private cloud deployment helps enterprises build exclusive large models on their own private data, with stronger security. On-premises private deployment is a solution specific to the Chinese market, providing not only more powerful models but also a complete model matrix to meet various scenarios and demands.



For different customer demands such as text generation, intelligent customer service, and data annotation, as well as the demand scale of enterprises of all sizes, Zhipu AI provides different solutions that customers can freely combine according to their needs. This more detailed and flexible service model is also based on long-term accurate insights into the Chinese market.


Facing uncertainty in the external environment, Zhipu AI has also launched a domestic chip adaptation plan, working with domestic hardware and chip manufacturers to provide different levels of certification and testing for different types of users and chips, making large model services more secure and reliable. Currently, the ChatGLM series supports over 10 domestic hardware ecosystems, including Ascend, Sunway Supercomputing, Hygon (Haiguang) DCU, Haifike, Muxi Xiyun, Suanneng Technology, Tianshu Zhixin, Cambricon, Moore Threads, Baidu Kunlun, Lingxi Technology, and Great Wall Super Cloud. The ChatGLM3-1.5B and 3B models released at the same time can be deployed on mobile devices and support a variety of phones and in-vehicle platforms, including Xiaomi, Vivo, and Samsung.


In the increasingly intense battle of large models, these seemingly inconspicuous details become more important, because they determine how much recognition a platform earns from external partners and how quickly its large models can be deployed in different scenarios. The market expects less from the mere release of a large model than from access to high-quality data scenarios, which sustain iteration and form competitive barriers. The key to high-quality data scenarios lies in external partners: the easier it is for more partners to choose the platform, the sooner this business cycle starts running.


In this competition, many practitioners believe that the winners must be the giants with greater resources and deeper pockets, but that is not the case. Startups and giants alike need to work hard and focus on details; there are no shortcuts. Funding is not the fundamental factor in deciding the outcome either, because startups with core competitiveness are not short of money. Zhipu AI has already received the most financing of any large model startup, yet new investors still want to enter the game.


Seen from another perspective, the capital market has already cast its vote on which basic large model is better suited to Chinese enterprises.


