8.23 Chinese Big Model "Top Stream Group Chat" Notes

巴比特

What was discussed in the closed-door meeting between Alibaba Cloud and the companies that make up "half the landscape" of China's large-scale model field?

By Zhang Peng

In the history of China's technological innovation, no "consensus in the tech community" has ever formed as rapidly as the one around large-scale models, which took shape in just a few months.

I entered the tech industry in 1998 and witnessed the changes of the PC era, the internet era, and the mobile internet era, but I have never seen consensus build this fast. Take Founder Park, Geek Park's community of entrepreneurs, as an example: because it paid early attention to the technological shift around large-scale models, it gained 150,000 new followers in just four months, and the community has grown to several thousand members.

Just yesterday, the first batch of domestic large-scale models passed the regulatory filing process, igniting people's enthusiasm once again. Regulation by filing signals a relatively permissive policy toward large-scale model development, and it also means that the commercialization and industrialization of large-scale models in China will now truly begin.

However, the rapid formation of "consensus" can also be a cause for concern, because this technology is still in the early stages of development and cannot yet be widely applied.

Objectively speaking, if one believes that large-scale model technology has brought the dawn of AGI, then one must honestly acknowledge that its true productization and becoming a productive force is only just beginning. The know-how and problems experienced by entrepreneurial companies on the front lines are precisely the most valuable sparks that need to be gathered.

Based on this idea, Alibaba Cloud, in collaboration with Founder Park, invited more than 20 outstanding entrepreneurs in the fields of model layer, tool layer, and application layer of large-scale models to a face-to-face closed-door communication at the West Lake Wetland in Hangzhou.

Alibaba Cloud Chairman Zhang Yong gave this closed-door meeting a fitting name: the "West Lake Forum." Throughout the five-hour session, Zhang Yong sat next to me and participated fully in the entrepreneurs' group discussion; I saw that his notebook was filled with several pages of notes.


On August 23, participants of the West Lake Forum took a group photo.

It is evident that, as the computing-power infrastructure layer, Alibaba Cloud should connect and co-create with these layers, and how to support entrepreneurs at each level in making good use of large-scale models is the topic Zhang Yong cares about most. This reflects an attitude quite different from other domestic companies: what Alibaba Cloud cares about most is promoting the prosperity of the large-scale model ecosystem.

These are among the most active and constructive forces in China's large-scale model field. From two in the afternoon until nine in the evening, they discussed and debated many aspects of the industry in depth, and many insightful viewpoints emerged from their latest practice. By their own account, they heard a great deal of frank talk and "real feelings" during the discussion, which is worth sharing in this article.

01 Pay attention to large-scale models, and pay more attention to infra

Nowadays, anywhere in the world, the scarcest resources for large-scale models, apart from talent, are GPUs.

Wang Xiaochuan, founder and CEO of Baichuan Intelligence, shared that, according to friends he has spoken with in Silicon Valley, Nvidia ships about one million GPUs a year, yet OpenAI has said it wants to design a supercomputer that connects ten million GPUs.

So how many GPUs are enough, and is there a solution to the limited computing power?

Li Kaifu, chairman of Innovation Works and founder of Zero One Technology, pointed out that although ten million GPUs is a pipe dream, the "brute force" aesthetic of "great compute brings great results" has a basis. Richard Sutton, the father of reinforcement learning, argued in "The Bitter Lesson" that over the past seventy years, attempts to build a bit of human knowledge into AI or to hand-tune model architectures have proved of little value; the only force that has consistently driven progress in AI is general-purpose, scalable computation. As computing power grows, algorithms and data advance with it, and that is the basis of the "brute force" approach.

Therefore, companies that emerge in this wave of large-scale models must first have computing power. For a team with only a few people or a few dozen GPUs, it may be more practical to call on centralized large-scale models instead.

"When computing power is relatively sufficient, making good use of it can produce many things that cannot be achieved today using only open source or Llama 2 (Meta's large language model)." OpenAI has set a new benchmark for models without regard to cost, and Meta has paved the way for everyone. In the unpredictable, highly uncertain environment of large-scale model entrepreneurship, this is Li Kaifu's thinking on the new goals and practices of large-scale model companies.

What is this approach? How can one GPU do the work of two or even three? The answer may come down to team composition. Li Kaifu believes that the Infra (infrastructure) team must be stronger than the modeling team. He said that everyone will soon find that people who work on large-scale model Infra are more expensive and scarcer than those who work on the models themselves, and those who can work on scaling laws (the regularity by which model capability increases with training compute) are scarcer still.

That is because an excellent scaling team can avoid futile training runs: when it launches a run, there is a high probability of success, and if a run is failing, the team can stop it immediately, with enough mathematical ability to back that call. Beyond this there are many subtle details and lessons of experience. For example, reading academic papers can avoid many detours, because some papers deliberately describe approaches that do not work, and skipping them makes it easy to be led astray.

In fact, objectively speaking, the shortage of GPUs is not just a problem for Chinese entrepreneurs; it is a problem that all global entrepreneurs have to face. Therefore, how to make good use of limited computing power will be the key to the competition among large-scale model companies.

Li Kaifu made a further point clearly: every position on a large-scale model team must be staffed with talent, including pre-training, post-training, multimodality, scaling up, inference, and so on; all of them matter. Among them, Infra talent is the scarcest and deserves the most attention.

Beyond entrepreneurs deepening their understanding of large-scale models, technological innovation along other dimensions is also needed. For example, at the event Wang Wei, founder and CEO of Moxin, shared a computing solution: sparse computation. It showed me the possibility of optimizing the computing paradigm through cloud-side and device-side AI chip acceleration, fully sparsifying neural networks to provide a general-purpose AI computing platform with very high compute and very low power consumption.

02 ChatGPT ignites enthusiasm, Llama2 keeps people grounded

If ChatGPT ignited the enthusiasm of many entrepreneurs, then Meta's open-source LLaMA and Llama 2 have put the vast majority of entrepreneurs on an equal footing at the starting line of base models. Depending on their resource endowments and capability structures, however, entrepreneurs will clearly carry different missions and visions into the future.

For entrepreneurs who still choose to develop base large-scale models, the open-source foundation is just a starting point. Li Kaifu pointed out that although in various benchmark comparisons the gap between Llama 2 and SOTA models such as GPT-3 and GPT-3.5 is not significant, in practice the capabilities of today's Llama 2 remain far from those of GPT-4 and the next version of Bard (Google's large language model).

This seems to give companies that develop large-scale models some room to maneuver. In the future, "truly wealthy" and "truly capable" large-scale model entrepreneurs have the opportunity to switch to a New Bard or New GPT-4 approach.

On the other hand, many entrepreneurs noted that Meta's open source has had a significant impact on the industry. "Today, xxx may still be the best model in China, but tomorrow it may be surpassed. One day you may suddenly find that the models you have been training are basically useless. When a technological shift arrives or a stronger open-source model emerges, past investments may be completely 'wasted.' For example, if an open-source model has already been trained on ten trillion English tokens, training your own model over again may be meaningless." Li Zhifei, founder and CEO of Mobvoi, believes the far-reaching impact of open source should be fully recognized.

"Although everyone has great ideals and ambitions, it depends on whether there is enough funding to last until that day. So it is important to recognize that staying alive may matter more than anything else." Zhou Ming, CEO of Lanzhou Technology, also believes that many companies that originally wanted to build the "best large-scale model" actually need to rethink their entrepreneurial position and choose to embrace open source, building something "for my own use" on an open-source foundation. For example, English-centric open-source models are weaker in Chinese and have not been polished with industry scenarios and data, which is an opportunity for entrepreneurial teams.

In this regard, Lanzhou Technology uses the open-source model as the L0 foundation, and on top of this, develops L1 language models, L2 industry models, and L3 scenario models. Zhou Ming believes that by doing this layer by layer, interacting with customers through AI Agents to get feedback, and gradually iterating the model, barriers will be gradually established. Even if there are better open-source models in the future, there is a way to retrain or continue to iterate on its basis. "Open-source models 'raise the tide,' and you grow along with those who are better than you."

Using open-source models well is itself a barrier and a threshold. This may differ from what many people imagine; some even ask whether building on open-source models still counts as developing large-scale models. Meanwhile, many companies avoid discussing their own use of open source at all.

In fact, developing on top of open-source models demands high subsequent investment and high skill, and using open source well does not diminish an entrepreneur. Li Zhifei's analysis, for example, suggests that an open-source model may have seen some ten trillion tokens of data, saving you several million dollars, but the model maker must keep training the model and ultimately bring it to the state of the art (SOTA). No step can be skipped: data cleaning, pre-training, fine-tuning, reinforcement learning. Annual computing costs may start in the millions of dollars. The threshold is not cleared in one stroke, nor does using open-source models mean no further investment is needed.

From this perspective, open-source models are a more practical choice, and optimizing and training practical models is the real skill. Based on open source, there is an opportunity to create excellent large-scale models, and the key is to have relatively advanced cognition and the ability to continuously iterate the model.

03 Current Status and Practice of Large-scale Models in ToB

Improving model capabilities is one thing, but applying them to customer scenarios is another matter.

From the customer's perspective, "large" is not the only pursuit of large-scale models, and it may not even be what the customer wants.

Some entrepreneurs shared particularly realistic customer scenarios: when talking to B-side customers, the customers only need language understanding, multi-turn dialogue, and some reasoning ability, and they do not want any other AGI (Artificial General Intelligence) capabilities.

Customers have said that the other features bring trouble, and the "hallucination" problem remains unsolved. Moreover, customers already have many AI 1.0 models that work well, so why discard them? AI 2.0 does not need to replicate 1.0's capabilities; being able to call on them sensibly is good enough. This also explains why the adoption of large-scale models is most active in the RPA field, both at home and abroad. Wang Guanchun, co-founder and CEO of Laiye Technology, has verified this specific customer demand in the domestic market this year.

In such cases, as long as the natural language understanding is accurate, the system can pass parameters to call AI 1.0 models and external databases; the results are reliable and the cost is relatively low. Finally, the large-scale model assembles the results into a report. Its role is task distribution: splitting a request into sub-tasks and deciding what each sub-task calls. Some sub-tasks are handled by large-scale models, some by the original statistical models, and some not even by one's own models but by third-party ones. What the customer ultimately wants is simply for the task to get done.

After finding this product-market fit (PMF), if one focuses only on this type of ToB work, where the required capabilities are language understanding, multi-turn dialogue, and a small amount of reasoning, a relatively small model of 10 to 100 billion parameters is sufficient. Correspondingly, on a foundation of a few hundred GPUs, doing language understanding, multi-turn dialogue, and a certain amount of reasoning well, together with AI Agents, can basically meet customer needs in many scenarios.
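The task-distribution pattern described above, in which a large model parses a request into sub-tasks and routes each one to whichever backend can handle it, can be sketched roughly as follows. Every handler, the toy planner, and the stubbed model call are hypothetical stand-ins for illustration, not anything the companies in this article actually ship:

```python
from typing import Callable, Dict, List, Tuple

# Registry mapping sub-task types to backends. Each handler here is a
# hypothetical stub: in a real system these would be an AI 1.0 statistical
# model, an external database lookup, and a large-model endpoint.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "sentiment": lambda text: "positive" if "good" in text else "neutral",
    "lookup": lambda key: f"record for {key}",
    "summary": lambda text: text[:60] + ("..." if len(text) > 60 else ""),
}

def plan(request: str) -> List[Tuple[str, str]]:
    """Stand-in for the large model's language-understanding step: turn a
    natural-language request into (sub_task_type, argument) pairs."""
    return [("sentiment", request), ("lookup", "customer-42"), ("summary", request)]

def run(request: str) -> str:
    """Dispatch each sub-task to its backend and assemble the report."""
    parts = [f"{task}: {HANDLERS[task](arg)}" for task, arg in plan(request)]
    return "\n".join(parts)

report = run("the customer left a good review of the product")
print(report)
```

In a production system, the `plan` step would itself be a large-model call emitting structured sub-tasks, and the registry would mix LLM endpoints, legacy statistical models, and third-party APIs; the point is that the customer only cares that the assembled result completes the task.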

A general large-scale model does not mean it can solve all problems. Many scenarios for B-side customers do not work well with general large-scale models. This means that more and more models are needed, with more convergent scenarios, and it also means that more effort is needed to help align technology and scenarios, rather than a universal technology to adapt to all scenarios.

Zhou Ming, CEO of Lanzhou Technology, believes that user data, industry data, and even graphs or rules need to be put into the model for further training; this is what justifies the existence of industry-specific large-scale models. In local industries that general large-scale models cannot cover, adding such data can solve industry problems very well and overcome many hallucination problems.

I remember that Li Zhifei also added this perspective. He believes that general large-scale models and vertical large-scale models each have their own uses, and you can't have the best of both worlds. A particularly large model means that the cost of reasoning is very high. Moreover, it doesn't make sense for a large-scale model designed for chip design to answer questions about entertainment content such as movies and celebrities. He believes that ToB is more about being vertical and reliable, while general models are about intelligence, having strong reasoning and logical abilities, and rich knowledge. This may not be what ToB needs at the current stage.

At the same time, industries across China have a strong demand for integrating large-scale models into their businesses. Ren Yanghui and Li Guoxing, the founders and CEOs of Lanhoo and Moka respectively, have won customer recognition and actual revenue after integrating large-scale models into their SaaS products.

Observing how the state of mind of these two entrepreneurs changed from March to August, I found that the change large-scale models bring to the SaaS field is at the level of "redefining software," and that those who dare to practice this "redefinition" with a "fight to the death" mindset will largely shed their anxiety within a few months and find hope.

Therefore, entrepreneurs who already have customers and scenarios in hand may be, among large-scale model entrepreneurs, the earliest beneficiaries of the technology dividend.

Because in specific scenarios, large-scale models actually have different pursuits. For example, Peng Jian, founder and CEO of Huashen Pharmaceutical Intelligence, believes that the hallucinations brought by large-scale models may be beneficial in the AI for Science field such as drug design. To some extent, the so-called hallucinations are the essence of intelligence in certain fields because they can help design protein combinations that people cannot think of.

Zhang Kuo, CSO of Zhijian AI, the company deploying large-scale models most frequently and fastest in China, believes that of large models' future value, "20% may be centralized, and 80% will be decentralized." That is, value will come from applying diverse, varied large-scale models to specific customer scenarios, rather than relying on the infinite generalization ability of a single large model to solve all problems. He sees this as an inevitable trend, and many of the entrepreneurs in the discussion agreed.

04 AGI is worth dedicating to, but don't "play to death"

Large-scale models are a watershed for AI. In the past, artificial intelligence pursued specific goals in closed systems, such as facial recognition systems pursuing 100% accuracy. But now, the "emergence" brought about by large-scale models is an open intelligence that generates various possibilities, exceeding the expectations of the designers. This is the true characteristic of intelligence and the biggest change in artificial intelligence in the past sixty or seventy years.

After the emergence of this new intelligent system, everyone will be able to obtain intelligence conveniently and at low cost, just like the electricity revolution.

Huang Tiejun, president of the Beijing Academy of Artificial Intelligence (BAAI), believes this technological change is spreading rapidly downward, from large companies to startups, and a consensus has formed quickly: this is the beginning of a new era. In such an era, doing nothing feels like a disservice to the era and to the development of the technology.

Baichuan Intelligence, founded in April and currently the most active large-scale model developer in China, has maintained an average pace of releasing a model every 28 days. Although its founder and CEO Wang Xiaochuan does not admit to being "active," he shared the secret to shipping fast: for example, a team with accumulated experience in search technology is very helpful for data-processing problems. In addition, introducing search augmentation, reinforcement learning, and other complementary full-stack technologies can genuinely improve the model. "If you look at the backgrounds of senior management at companies in this industry, you will find that many of those who have done well technically come from search, which reflects how certain technologies are gradually being understood."

However, Huang Tiejun believes that from a scientific research perspective, we are still only in the early stages of entering a great era. If we compare it to the era of electricity, today's era of intelligence is similar to when Faraday made a generator spin and generated electricity. Now, we are training intelligence from big data models, which is a stage. Later, we will need someone like Maxwell, because the establishment of electromagnetics is what made electricity reliable and usable in human society, and it drove the industrial revolution.

Today's large-scale models still have many black boxes. On one hand, the "ceiling" of large-scale models still has a huge potential for improvement, and AIGC often brings huge surprises. But on the other hand, the "floor" of large-scale models still cannot remain stable enough. At this time, understanding the technological boundaries, setting reasonable goals, and solving the problems that need to be addressed is necessary. Some people need to explore the upper limit, and some need to stabilize the lower limit.

For entrepreneurs, the dawn of AGI (Artificial General Intelligence) has appeared, and this is a career worth dedicating to, but don't "play to death."

On the other hand, while waiting for further advancement of large-scale model technology, many middle-layer entrepreneurs are improving the environment for large-scale models to be deployed in applications.

Liu Cong, BentoML's Asia-Pacific lead, said that compared with traditional machine learning, overseas customers can generally secure some budget to build prototypes or demos of large-scale model products. These, however, have not yet entered production environments where they generate commercial value for the company, and many middle-layer entrepreneurs see that gap as an opportunity.

The entrepreneurial insight of Zhang Luyu, founder and CEO of Dify.ai, stems from the same gap. From the developer's perspective, he said, having the model is not enough. He shared some data: after analyzing a sample of over 60,000 applications, he found that only about 5% are in production or close to it. Some developers are not satisfied with the model technology; some teams' workflows have not adapted to AI application development. Accordingly, Zhang Luyu's team is building specific capabilities for applications with a higher likelihood of reaching production. For example, they track an indicator called "consumer friction improvement" to see how much value AI can provide there and supply the corresponding capabilities.

Xing Jue, founder and CEO of Zilliz, added to this perspective. He believes an extremely simple development stack is a prerequisite for democratizing AI, and based on that judgment he proposed the CVP development stack (large model + vector database + prompt engineering).
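The CVP idea can be sketched in miniature: embed documents, retrieve the nearest one for a query from a vector store, and splice it into a prompt for the large model. Everything below is a toy stand-in: the bag-of-words "embedding" substitutes for a real embedding model, the in-memory list substitutes for a vector database such as Milvus (Zilliz's project), and the final model call is omitted entirely:

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

def embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words 'embedding', a stand-in for a real embedding model."""
    return dict(Counter(re.findall(r"[a-z]+", text.lower())))

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The "V": an in-memory stand-in for a vector database.
docs: List[str] = [
    "Milvus is an open-source vector database.",
    "Hangzhou is home to the West Lake wetlands.",
]
index: List[Tuple[str, Dict[str, float]]] = [(d, embed(d)) for d in docs]

def retrieve(query: str) -> str:
    """Return the stored document nearest to the query."""
    qv = embed(query)
    return max(index, key=lambda pair: cosine(qv, pair[1]))[0]

def build_prompt(query: str) -> str:
    """The 'P': splice the retrieved context into a prompt for the model."""
    context = retrieve(query)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is a vector database?"))
```

The appeal of the stack is exactly its simplicity: each of the three pieces (the "C" model, the "V" store, the "P" template) can be swapped independently, which is what makes it approachable for developers who are not model specialists.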

05 How to move towards AI native?

What is the Killer App of the AI era? When Microsoft released Copilot in March this year, many people's curiosity was instantly ignited. However, at this closed-door meeting, Li Kaifu presented a different perspective: Copilot is not considered an all-in product for large-scale models.

He believes that if you look at WeChat, one of the most successful products of the mobile internet, giving up compatibility was important. MSN and QQ came first, but WeChat won because Zhang Xiaolong made a decision: since this was the mobile internet era, there was no need to support the PC. WeChat focused on the characteristics of the mobile internet from the start and bet 100% on the new technology platform.

From this perspective, an AI-native application may have the following characteristic: remove the large model and the application collapses, because it relies entirely on large-scale model capabilities. Remove Copilot, however, and Office is still Office; the AI is just icing on the cake.

This viewpoint received the most agreement from the entrepreneurs present and sparked discussions on AI native applications with this definition.

Zhang Yueliang, the product manager of the recently popular product Miaoya, believes that without large-scale models, there would be no Miaoya, which is consistent with Li Kaifu's thinking about AI first and AI native.

He believes that for Miaoya, as an early breakout application, the most important thing was solving controllability. The Miaoya team initially had no intention of working on underlying models, focusing instead on achieving controllability with the various plugins and small models developed by open-source enthusiasts in the existing ecosystem. By anchoring controllability as the most important goal, Miaoya achieved an average photo-quality score above 90 points, which drove its rapid success.

"In the application layer, we focus especially on how to make the model more controllable, and we found that some relatively controllable technologies already exist in the image track. If something similar appears in the language track, it will be a qualitative change for upper-layer application entrepreneurs," said Zhang Yueliang. His practice offers some inspiration for companies building large-scale model applications: controllability may be a prerequisite for the birth of AI-native applications. Zheng Yizhou, Stability AI's China lead, has observed the same trend: once open-source community contributors solved controllability, a large number of applications emerged.

In exploring the next generation of applications, Li Yan, the founder of Yuanshi Technology, pointed out that the reasoning ability brought by large-scale models is the essential difference in the new generation of products.

The combination of social media and agents is seen as a promising opportunity and is likely to be among the first batch of AI native products. However, this may require entrepreneurs to have the "end-to-end" construction capability from large-scale models to products. For example, when Li Zhifei discussed with Character.ai why the latter wanted to develop its own large-scale model, the response was that using centralized large models like OpenAI or Google would not answer "flirting" questions. This unique space found by Character.ai is also a gradually accumulating barrier.

In the same field, Lingxin Intelligence has discovered unique scenarios while building social applications on large-scale models. Its CEO Zhang Yijia shared an observation that differed from expectations: the social scenario where large-scale models currently apply is not companionship, because people need time to accept virtual companions. It is role-playing, whose users profile as fans of online novels; role-playing is, in effect, a new form of the online novel.

As for the latest AI Agent direction, whether large-scale models are the "hope of the whole village," and whether they will ultimately bring about a revolution in interaction, terminal, and business models, will likely depend on the development of multimodal capabilities.

Tao Fangbo, founder and CEO of Mindful Universe, explained that expectations for Agents were initially high, but under current technological conditions it is hard to show how Agents solve more problems than ChatGPT does. If Agents are to work, he believes, it cannot just be a matter of integrating many software APIs; that is essentially compatibility work, new wine in old bottles.

Are there more native forms for Agents to complete the last mile? There is much to do, such as the spatial perception and multimodal capabilities mentioned by Song Zhen, founder and CEO of Digital Xu Sheng. Once these conditions mature, a killer case may emerge.

Li Zhifei firmly believes that currently, multimodality is in the spotlight, not just a decorative feature. Because Agent input and output both depend on multimodal capabilities, without multimodality, there is no Agent. Today's Agents are more focused on feedback through language models and text, but ultimately, an Agent will be a multimodal observer, perceiver, and actor. He predicts that in two or three years, the transfer of cross-modal knowledge will be the biggest contribution of large language models.

06 The Era of Large Models: Serving Big B or Small B

A few months ago, I happened to attend the developer conference of the data company Databricks in San Francisco. Starting as a data platform company specializing in "data lakes," it has grown into a "middle layer" company on top of cloud computing platforms. In just a few years its valuation has reached tens of billions of dollars and continues to grow, and it serves businesses of every size, from large enterprises to small startups.

This year, the company moved quickly to integrate large-scale models, even acquiring the large-scale model company MosaicML, and began helping customers deploy large-scale models into their businesses. This momentum has propelled the company toward a valuation of tens of billions of dollars.

I was very curious about why there doesn't seem to be a similar "middle layer" company based on cloud computing in China, and whether the variables of this wave of AI technological advancement could give rise to a batch of excellent companies that turn cloud computing power into business competitiveness, bringing digital progress to more industries.

Alibaba Cloud Chairman Zhang Yong believes that the emergence of "middle layer" companies is definitely possible and is something that cloud computing companies would welcome. However, these companies still need to solve a core problem—clearly defining whose problems they are solving and for what purpose. The clearer the definition, the more capable they are, and the more they can truly "converge" and have real business "penetration."

This also sparked discussion among the attending entrepreneurs. Even though large-scale model technology has only just begun to enter industry, the problems of non-convergent, project-based service to enterprise clients have already appeared. For example, when training large-scale models for B-side customers, the data belongs to the customer, so it is hard for one's own team to "close the loop." Without a data flywheel, gross margins are low, and it is easy to slide into being a "high-tech construction team," a common trap for technology companies serving B-side clients. Some entrepreneurs have even begun to doubt whether large-scale models for B-side clients have a suitable environment at all.

But Zhang Yong, who has been taking notes in the group chat, systematically expressed a different perspective: "In fact, there is another possibility for To B, which is 'Small B,' referring to small and medium-sized enterprises. They may seem inconspicuous, but in large numbers, serving them alone can create today's internet giants."

For example, Alibaba's early "Yellow Pages" allowed small sellers to be seen by foreign buyers, leading to the prosperity of cross-border trade. Taobao solved the problem of information and logistics circulation, creating a major category in e-commerce.

Moreover, unlike large companies, these Small B customers do not care about your technology or your vision; they will pay whoever helps them solve their growth problems.

The current data-driven approach of large companies mainly aims to "reduce costs and increase efficiency," which essentially means cutting costs. But the room for efficiency optimization always has a limit, while the room for growing revenue (the "opening the source" half of the Chinese idiom "open the source, reduce the outflow") is comparatively unlimited. Zhang Yong believes that in enterprise services, growing revenue matters far more than cutting costs, because people are always willing to pay for growth.

He even believes that the past emphasis on "cost reduction and efficiency improvement" in enterprise digital services may have been a misconception: it is mostly large companies that will pay for a few percentage points of efficiency, because at their scale the input-output ratio works out. This steered everyone toward projects for large companies. Small companies, however, can rarely justify demand through "cost reduction and efficiency improvement"; what they need is the ability to grow.

In fact, Small B customers have a dual nature: if they subscribe to a product, they can effectively be treated as "C-end users."

On this point, Zhang Yong's view was echoed by the attending entrepreneurs. Li Zhifei, for example, once ran a To B business in speech recognition and found the cutthroat competition with peers painful. Later, with the AI voice-dubbing tool "Magic Sound Workshop," he served individual content creators, converging on a product that solves a common problem for Small B customers, and finally turned AI technology into a healthy, growing business.

Zhang Yong also suggested that startups need to define clearly, from the beginning, which customers they want to serve: C or B, Small B or Big B. He even believes that an AI company cannot feasibly serve Big B and Small B at the same time, let alone C as well.

Although AI technology is bringing many changes and will increasingly offer general-purpose capabilities, there are also organizational "DNA issues" beyond the technical level. "The way people dress and talk at work differs between a company serving large clients and a team serving internet users," Zhang Yong said. What matters is defining clearly whom you serve and what problem you solve, rather than simply chasing the money.

07 What Do Large Models Mean for the Cloud?

In the previous wave of AI, many startups raised large amounts of funding, and many well-known companies and entrepreneurs emerged. Yet several years on, they still found the going very hard. I have kept in touch with many entrepreneurs from that wave; more than once I met one looking exhausted and hoarse, often because they had spent the previous evening drinking heavily with a major client and had not yet recovered.

The entrepreneurs at this event also witnessed that era, in which the technology never formed standardized products and instead became "high-end human outsourcing," a "high-tech construction team" that could only take on projects. They all felt they must not repeat those mistakes.

At the same time, everyone was also very concerned about how cloud computing platforms like Alibaba Cloud would face changes in the era of large models. They also asked Zhang Yong what he thought about the cloud itself in the era of large models—whether it is technology or a product.

Zhang Yong's response was very direct: "The cloud itself should be a product, and not just one, but a series of products." Driven by the wave of large-scale models and AI, one thing is certain: the industry and customers have placed entirely new demands on computing power. How to meet the further demand for computing power from customers has become the basic starting point for Alibaba Cloud. Zhang Yong believes that there are definitely technical problems to be solved, but Alibaba Cloud also needs to consider how to "converge" into products that truly solve industrial ecosystem problems, rather than just outputting computing power itself.

Interestingly, although this exchange was jointly organized by the Founder Park community and Alibaba Cloud, no session was arranged to present "Tongyi Qianwen," Alibaba Cloud's own large model. Of course, the entrepreneurs were still very interested in the cloud platform's own large-model work. Zhang Yong's view is that in such a transformative era, Alibaba Cloud must hold to a more fundamental role: that of a cloud service provider.

"To fulfill this role, not understanding large-scale models is definitely not enough," Zhang Yong said. "If we don't understand Thousand Questions of Tongyi, we may not be able to figure out how to help the entrepreneurs attending today's event."

What excites Zhang Yong most is his firm belief that human society's future demand for computing power is unlimited, and that the bar for its efficiency will only rise. Alibaba Cloud therefore hopes that "the more models and scenarios, the better": the more there are, the greater the demands on computing power and technology, which means the cloud has new problems to face and solve. Only by continually tackling hard problems can the cloud's value keep growing.

"Cloud computing platforms need an ecosystem like never before, rather than trying to do everything on their own. Currently, no company can use its own chips, cloud computing, data platforms, machine learning frameworks, and large models to form a so-called 'closed loop,' which is almost physically impossible."

Zhang Yong believes the development of AI technology has opened new possibilities for the ecosystem. One of his regrets is that while the past decade was a period of rapid growth for cloud computing in China, the country's SaaS industry did not fundamentally improve despite the rapid build-out of infrastructure. Meanwhile, SaaS companies in the United States are now exploring how to integrate AI into their platforms, taking a different path from Chinese companies.

He believes that in the AI era, China may see a new generation of SaaS, a completely new kind of intelligent service. Unlike the process-driven SaaS of the past, this new service will be driven by data and intelligence, and may not even be called SaaS.

Li Dahai, Director and CEO of Mianbi Intelligence, pointed out that the B-side market in China is very fragmented, which is the reason why SaaS services have not taken off. However, with the emergence of large-scale models as a technological variable, there is hope for some changes, which is something worth looking forward to. At the same time, he also hopes that cloud providers like Alibaba Cloud can provide good solutions and support for this, helping everyone move forward together.

In Zhang Yong's view, many Chinese SaaS companies cannot honestly be called cloud-native, and a new type of service that grows up natively in the cloud, or is "intelligence-native," has the opportunity to replace the products of the earlier, non-native era.

Many times, we lament that the growth of China's SaaS industry in the past decade has been unsatisfactory, but large-scale models now provide new opportunities for startup companies to shape new patterns in a completely new digital ecosystem. Zhang Yong's conclusion is that such opportunities and challenges are common to Alibaba Cloud and all entrepreneurs, and everyone needs to find their position for the future, forming ecosystem partnerships and creating value together.

Well, the above are some excerpts from my notes during the 7-hour discussion. My strongest feeling is that the era of change brought about by large-scale model technology has just begun. After the extreme excitement and "excessive imagination" of the first half of the year, a technological revolution that may last for 10 years has only just begun the "Long March." After the period of enthusiasm, we are truly entering a pioneering period, and the "consensus" that can only be achieved through sufficient time and solid effort is the real consensus.

I hope entrepreneurs and the industrial ecosystem can have more of this kind of open, honest communication and collision of ideas. The name Zhang Yong gave this exchange, the "Xixi Forum," is quite fitting. Sitting down to discuss the way is important, but setting out to walk it matters even more.

I think this "path" should be the "innovation path" from technology to products, from vision to value in the AGI era.
