Reviewing AI Agent: Pursuit, Implementation, and the Turning Point for To B in 2024

巴比特

Source: Industrialist

Author: Dou Dou

Editor: Pi Ye

Image

Image Source: Generated by Wujie AI

"Among ten AI applications, five are office agents, three are AIGC, and the remaining two are rejuvenating digital humans." So, is the Agent really the ultimate product form for large models and AGI?

In April of this year, researchers from Stanford and Google jointly built a "Westworld"-style simulation: a virtual town in which 25 AI agents carry out human-like daily activities, such as getting up in the morning, making breakfast, and going to work, with artists painting and writers writing.

This is the kind of "AI Agent experiment" people are talking about today. In the second half of this year, players in China's large model market seem to have turned in unison to AI Agents, which they see as a clear and visible end product on the road to AGI.

According to one set of data, as of mid-November there had been 13 financing events in the AI Agent track, totaling approximately 73.5 billion RMB, or an average of about 5.654 billion RMB per event.
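
As a quick sanity check on the figures above, the stated average follows directly from the stated total and count. The variable names below are illustrative, and the numbers are simply the ones quoted in the text.

```python
# Figures as quoted in the article (billions of RMB, number of financing events).
total_financing_billion_rmb = 73.5
financing_events = 13

# Average financing amount per event.
average_billion_rmb = total_financing_billion_rmb / financing_events

print(f"{average_billion_rmb:.3f} billion RMB per event")
```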

The field is booming overseas as well. "At least 100 projects are dedicated to commercializing AI agents, and nearly 100,000 developers are building autonomous agents," said Matt Schlicht, as quoted in overseas media.

Why is AI Agent so popular?

One imaginative answer is: "A large language model can only code a game of Snake, while an AI Agent can create a game like Honor of Kings."

Mature AI Agents can significantly reduce the cost of software production. In the future, coding workflows will include a great deal of temporary software and test code written by agents, built without regard for long-term reuse and discarded when no longer needed. Today a software giant often employs tens or even hundreds of thousands of people; with AI Agents, the manpower and funding required for research, development, and delivery will fall sharply. This will also let software flexibly address more niche demands.

In addition, AI Agents may give LLMs a framework for deeper thinking and analysis, so that they can make more complex and reliable decisions.

In short, as Microsoft co-founder Bill Gates put it, "Whoever wins the personal agent, that's the big thing, because you will never go to a search site again, you will never go to a productivity site, you'll never go to Amazon again."

It is worth noting that, for all the scale of this technological shift, we have not yet felt the dividends and changes AI Agents are supposed to bring. Clearly, their development still faces challenges.

Some questions worth exploring: What is the current state of AI Agent development in China and overseas? What is the key to deploying AI Agents in practice? And what does their future hold?

Current Status of AI Agents, Overseas vs. Domestic

Currently, several Chinese technology companies have produced well-known large models, and the Agent applications built on them are gradually entering public view.

For example, Baidu has applied its ERNIE (Wenxin) large model to intelligent search and autonomous driving; Alibaba has applied its Tongyi Qianwen model to products such as Amap (Gaode Map) and Youku; Huawei has applied its Pangu model to weather forecasting and speech recognition.

A startup called Mianbi Intelligence (ModelBest) has also launched an AI Agent product, ChatDev, which can develop a piece of software or a small game in a short time. All the user needs to do is state a requirement.

It is worth noting that collaborative office software seems to be a "must-take route" for tech giants developing AI Agents.

For example, DingTalk's Magic Wand suite integrates various AI capabilities, such as chat AI, document AI, meeting AI, Yida AI, and Teambition AI. The "meeting assistant" in Tencent Meeting offers intelligent support such as automatic meeting minutes, transcription, and translation. Baidu's intelligent work platform Ruliu is built on the ERNIE (Wenxin) large model and offers intelligent content creation and recommendation. ByteDance's office software Feishu has announced an AI assistant, "My AI," to improve team collaboration efficiency.

An investor once joked to the media, "Among ten AI applications, five are office agents, three are AIGC, and the remaining two are rejuvenating digital humans." This is not unique to China: overseas companies such as Google and Microsoft are also deploying AI Agents in collaborative office scenarios.

Overseas, the concept of AI Agents has gone through several stages on its way from emergence to explosion.

In the single-Agent stage, dedicated agents are developed and deployed for specific tasks in particular fields and scenarios. GPT Engineer, for example, can produce a rough codebase from a stated requirement.

In the multi-Agent cooperation stage, Agents in different roles work together to accomplish complex tasks. In MetaGPT, for example, a request to build a stock analysis tool is broken down and handed to five roles, such as product manager, architect, and project manager, simulating the entire decision-making workflow of software development.
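
The role hand-off described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not MetaGPT's actual API: the `Role` class, the `run_pipeline` function, and the lambda "workers" are hypothetical stand-ins for LLM-backed roles, and only the three roles named in the text are included.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Role:
    """A hypothetical role: takes the upstream artifact and returns its own."""
    name: str
    act: Callable[[str], str]


def run_pipeline(requirement: str, roles: List[Role]) -> List[str]:
    """Pass the requirement through each role in order, collecting artifacts."""
    artifacts = []
    current = requirement
    for role in roles:
        current = role.act(current)  # each role builds on the previous output
        artifacts.append(f"{role.name}: {current}")
    return artifacts


# Stand-ins for LLM-backed roles (assumption: real roles would call a model).
roles = [
    Role("product manager", lambda req: f"PRD for '{req}'"),
    Role("architect", lambda prd: f"system design based on [{prd}]"),
    Role("project manager", lambda design: f"task list derived from [{design}]"),
]

artifacts = run_pipeline("stock analysis tool", roles)
for line in artifacts:
    print(line)
```

The point of the sketch is the fixed ordering: every role is defined in advance and consumes the previous role's output, much like a waterfall workflow.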

However, when Microsoft released its new tool AutoGen, AI Agents quickly opened a new chapter.

AutoGen lets multiple LLM agents solve a task by conversing with one another. Each agent can play a role, such as programmer or designer, or a combination of roles, and the task is solved in the course of the conversation. This differs from MetaGPT, where the roles are predefined: AutoGen lets developers define their own Agents and have them converse with each other.
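
To make the conversational pattern concrete, here is a minimal sketch in plain Python. It deliberately does not use AutoGen's real classes: `ChatAgent`, `converse`, and the two rule-based reply functions are hypothetical, and in a real system each reply would come from an LLM rather than from hand-written rules.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (speaker name, text)


class ChatAgent:
    """A developer-defined agent: a name plus a function that produces replies."""

    def __init__(self, name: str, reply_fn: Callable[[List[Message]], str]):
        self.name = name
        self.reply_fn = reply_fn

    def reply(self, history: List[Message]) -> str:
        return self.reply_fn(history)


def converse(first: ChatAgent, second: ChatAgent, task: str,
             max_turns: int = 6) -> List[Message]:
    """Alternate between two agents until one says DONE or turns run out."""
    history: List[Message] = [("user", task)]
    speakers = [first, second]
    for turn in range(max_turns):
        speaker = speakers[turn % 2]
        message = speaker.reply(history)
        history.append((speaker.name, message))
        if "DONE" in message:
            break
    return history


def programmer_reply(history: List[Message]) -> str:
    # Rule-based stand-in for an LLM: revise the code if the reviewer asked.
    last_speaker, last_message = history[-1]
    if last_speaker == "reviewer" and "add a docstring" in last_message:
        return 'def add(a, b):\n    """Return a + b."""\n    return a + b'
    return "def add(a, b):\n    return a + b"


def reviewer_reply(history: List[Message]) -> str:
    # Rule-based stand-in for an LLM: approve only code that has a docstring.
    _, last_message = history[-1]
    if '"""' in last_message:
        return "Looks good. DONE"
    return "Please add a docstring."


programmer = ChatAgent("programmer", programmer_reply)
reviewer = ChatAgent("reviewer", reviewer_reply)
history = converse(programmer, reviewer, "Write an add function.")
```

Unlike a fixed role pipeline, nothing here prescribes how many rounds are needed: the agents keep exchanging messages until the reviewer is satisfied or the turn limit is reached.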

This is a new and creative Agent framework. Within two weeks of AutoGen's release, its GitHub star count surged from 390 to 10,000, and its Discord attracted over 5,000 members.

Microsoft moved early on AI Agents. Microsoft 365 Copilot, released in March 2023, already hinted at an LLM-based paradigm for application development, namely the Agent. Today, Microsoft Copilot Studio supports integrating custom ChatGPT assistants into CRM, ERP, OA, and other everyday office systems.

Microsoft's AI Agent capabilities thus derive mainly from its own business, and AutoGen looks more like an externalization of capabilities built on that business. This sets it apart from OpenAI.

OpenAI's GPTs, together with GPT-4 Turbo and customizable AI Agents, allow anyone to build their own large model application. Many industry insiders believe that the extremely low barrier to creation and the App Store-like business model will enable OpenAI to build a GPTs ecosystem quickly.

OpenAI provides the building blocks for basic Agents, such as tool invocation and memory backed by knowledge base files. The release of this product has brought AI Agents into yet another stage, making it possible for everyone to create their own Agent.
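
The two building blocks mentioned here, tool invocation and a knowledge base, can be illustrated with a small self-contained sketch. It is not OpenAI's API: `BasicAgent`, the `"tool_name: argument"` request convention, and the keyword-overlap retrieval are all simplifying assumptions made for the example.

```python
from typing import Callable, Dict, List


class BasicAgent:
    """A hypothetical agent with a tool registry and a tiny knowledge base."""

    def __init__(self, tools: Dict[str, Callable[[str], str]],
                 knowledge: List[str]):
        self.tools = tools
        self.knowledge = knowledge

    def retrieve(self, query: str) -> List[str]:
        """Return documents sharing at least one lowercase word with the query."""
        words = set(query.lower().split())
        return [doc for doc in self.knowledge
                if words & set(doc.lower().split())]

    def handle(self, request: str) -> str:
        # Assumed convention: "tool_name: argument" triggers a tool call.
        name, sep, argument = request.partition(":")
        if sep and name.strip() in self.tools:
            return self.tools[name.strip()](argument.strip())
        # Otherwise fall back to the knowledge base.
        hits = self.retrieve(request)
        return hits[0] if hits else "No relevant knowledge found."


agent = BasicAgent(
    tools={"word_count": lambda text: str(len(text.split()))},
    knowledge=[
        "Refund policy: refunds are issued within 7 days.",
        "Shipping takes 3 to 5 business days.",
    ],
)

tool_answer = agent.handle("word_count: how many words here")
kb_answer = agent.handle("refund timeline")
miss_answer = agent.handle("weather tomorrow")
```

In a production agent the model itself would decide when to call a tool and retrieval would typically use embeddings; the sketch only shows how the two capabilities sit side by side.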

It is worth noting that AI Agent architectures and products have already appeared in various fields overseas, including retail, real estate, tourism, customer service, human resources, finance, and manufacturing.

For example, in the retail sector, there are Amazon Alexa, Aktify, and Regie.ai; in the real estate sector, there are Epique, propertypen, and Listingcopy; in the customer service sector, there are Agent4, Ebi.Ai, JasonAI, and Aide; and in the human resources sector, there are AutonomousHRChatbot, AIInterviewCoach, and CareersAI.

Overall, AI Agents overseas are relatively mature in terms of underlying technology, architecture, and specific product applications, and tech giants such as OpenAI, Microsoft, and Google have a first-mover advantage. By comparison, AI Agent development in China still lags in both depth and breadth.

One question worth considering: what is the key to deploying Agents in practice?

The Key to Deploying Agents: Model, Industry Experience, or Carrier?

Most Agents currently on the market, including OpenAI's GPTs, are essentially chatbots built on a specific knowledge base or professional dataset. They are mainly used for question-and-answer interactions, such as retrieving industry information and reports.

However, there is still a lot of room for improvement when it comes to linking with and operating other programs. Currently, we cannot use GPTs to directly operate ERP systems such as SAP or Kingdee, because doing so involves applying for, authorizing, and maintaining APIs, as well as connecting to API management software.
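
The authorization problem can be pictured as a gateway that sits between agents and business systems. The sketch below is purely illustrative: `ApiGateway`, the endpoint name `erp.create_invoice`, and the agent identifiers are hypothetical and do not correspond to any real SAP or Kingdee interface.

```python
from typing import Callable, Dict, Set


class ApiGateway:
    """A hypothetical gateway that checks grants before an agent calls an API."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Callable[[dict], dict]] = {}
        self._grants: Dict[str, Set[str]] = {}

    def register(self, endpoint: str, handler: Callable[[dict], dict]) -> None:
        """Expose a business-system operation under a name."""
        self._endpoints[endpoint] = handler

    def grant(self, agent_id: str, endpoint: str) -> None:
        """Authorize one agent to call one endpoint."""
        self._grants.setdefault(agent_id, set()).add(endpoint)

    def call(self, agent_id: str, endpoint: str, payload: dict) -> dict:
        if endpoint not in self._endpoints:
            raise KeyError(f"unknown endpoint: {endpoint}")
        if endpoint not in self._grants.get(agent_id, set()):
            raise PermissionError(f"{agent_id} is not authorized for {endpoint}")
        return self._endpoints[endpoint](payload)


gateway = ApiGateway()
# Stand-in for a real ERP operation (assumption: it just echoes the amount).
gateway.register("erp.create_invoice",
                 lambda p: {"status": "created", "amount": p["amount"]})
gateway.grant("finance_agent", "erp.create_invoice")

result = gateway.call("finance_agent", "erp.create_invoice", {"amount": 1200})
```

Every endpoint has to be registered, granted, and kept up to date for each agent, which is the kind of ongoing integration work the paragraph above refers to.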

For enterprises, if GPTs and other AI agents are used only for knowledge Q&A, their role will be very limited, little more than a toy, because they cannot yet penetrate deeply into business processes.

There are many reasons for this, including model capability, industry experience, and scenario fit, all of which affect what an Agent can do.

AI Agents need to be able to perceive the environment, make decisions, and execute appropriate actions. Among these steps, the most important is to understand the input given to the Agent, reason about it, plan, make accurate decisions, and convert those decisions into an executable sequence of atomic actions that achieves the ultimate goal.
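
The perceive, decide, and act cycle can be sketched as a simple loop. Everything in this example is an assumption made for illustration: the `plan` function is a hard-coded stand-in for an LLM planner, and the travel-booking environment is a plain dictionary.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # an atomic action as (verb, target)


def plan(goal: str, observation: Dict[str, bool]) -> List[Action]:
    """Stand-in for an LLM planner: turn a goal into atomic actions."""
    steps: List[Action] = []
    if goal == "book_trip":
        if not observation.get("flight_booked"):
            steps.append(("book", "flight"))
        if not observation.get("hotel_booked"):
            steps.append(("book", "hotel"))
    return steps


def execute(action: Action, environment: Dict[str, bool]) -> None:
    """Apply one atomic action to the environment."""
    verb, target = action
    if verb == "book":
        environment[f"{target}_booked"] = True


def run_agent(goal: str, environment: Dict[str, bool],
              max_cycles: int = 3) -> List[Action]:
    """Repeat perceive -> decide -> act until the planner has nothing left."""
    log: List[Action] = []
    for _ in range(max_cycles):
        observation = dict(environment)    # perceive: snapshot the state
        actions = plan(goal, observation)  # decide: produce atomic actions
        if not actions:
            break                          # goal reached, nothing to do
        for action in actions:             # act: execute each step
            execute(action, environment)
            log.append(action)
    return log


environment = {"flight_booked": True, "hotel_booked": False}
log = run_agent("book_trip", environment)
```

Because the agent re-observes the environment on every cycle, it only plans the steps that are still missing, here the hotel booking.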

Many current studies use an LLM as the cognitive core of the AI Agent, relying on it to carry out this step well. This is why Agents based on GPT-4 behave more intelligently.

But as things stand, all large models, GPT-4 included, still need to improve their capabilities.

"The problems with base models are still significant, and we need better models before AI Agents can truly be deployed," an insider working at the forefront of large model technology told Industrialist.

In response to insufficient model capabilities, Zhipu AI and Tsinghua KEG have proposed AgentTuning, a fine-tuning method for strengthening the Agent capabilities of a model. It fine-tunes existing models on a small amount of data, significantly enhancing their Agent capabilities while preserving their original general capabilities.

Industry experience is also crucial to deploying AI Agents.

"If a paper proposes a different training method, it gets sneered at on OpenAI's internal Slack, because those are all things we have already played with. But when a new AI Agents paper comes out, we discuss it seriously and with excitement," OpenAI co-founder Andrej Karpathy said in a recent speech.

In short, what we can do based on large models ultimately depends on industry experience, which is exactly what many large model giants like OpenAI lack.

It should be noted that for enterprises to introduce AI Agents for process optimization, strict and meticulous evaluations are required in various aspects such as cost control, budget allocation, efficiency realization, and security control. This requires technology suppliers to provide platform-level solutions, rather than just providing AI Agent automation solutions for single or specific scenarios.

Large enterprises have little tolerance for trial-and-error costs when introducing new AI technologies, so suppliers' solutions must be ready to use out of the box: genuine intelligent digital employees that already carry industry know-how, terminology, and business rules. Only standardized AI Agents of this kind can be integrated into an enterprise's internal organization for unified management and scheduling.

For example, an AI Agent in the medical industry needs to have medical knowledge and be able to understand and process medical data. An AI Agent in the financial industry needs to have financial knowledge and be able to understand and process financial data.

How well AI Agents work in practice also depends on the application scenario. In areas such as travel booking, AI Agents perform well thanks to rich APIs. In scenarios such as legal assistance, however, the constant emergence of new knowledge and the lack of mature APIs make practical application harder.

This helps explain why AI Agents in China are growing up on collaborative office platforms.

Collaborative office platforms already have good APIs and plugin systems, which makes it easier to integrate large models into existing tools.

In addition, many enterprises and organizations are using collaborative office software, which means that large models can quickly cover a large number of potential users. A wide user base can accelerate the iteration and optimization of large models, making them better meet user needs.

These platforms also hold abundant data with which to improve model performance, and their rich scenarios can drive continuous improvement in large model technology.

DingTalk, Feishu, and WeChat Work each have their own advantages as carriers for Agents. DingTalk provides comprehensive organizational structure management, making it easy to create, manage, and adjust team structures so that enterprises can quickly set up the organization they need.

Feishu emphasizes real-time collaboration and communication, supporting functions such as multi-person online document editing and joint discussions, helping teams efficiently complete collaborative tasks. Its unique integration makes the entire office process more standardized.

WeChat Work is interoperable with WeChat, which allows its AI Agents to leverage the vast user data and application scenarios of WeChat to provide more personalized and scenario-based services.

From this perspective, it is natural for AI Agents in China to cluster in the collaborative office field. What matters more is finding a suitable deployment scenario or carrier for AI Agents.

Beyond collaborative office software, however, there are many other carriers that may be even better suited to deploying AI Agents.

Examples include intelligent customer service, intelligent assistants, RPA, and CRM. In intelligent customer service, AI Agents can automatically answer user questions and handle complaints and suggestions, improving customer satisfaction and efficiency. Among intelligent assistants, Apple's Siri, Google Assistant, and Amazon's Alexa are representative.

In intelligent process automation, many enterprises use tools such as UiPath and Blue Prism to automate specific business processes.

In intelligent marketing, many marketing platforms, such as HubSpot and Salesforce, have integrated AI Agents. The AI Agents on these platforms can provide precise marketing advice and predictions through data analysis and machine learning, helping enterprises better understand customer needs and improve sales performance.

In summary, model capability is the core, industry experience is the key, and the carrier is the guarantee; all three are essential to deploying AI Agents. It is worth noting that the nature of China's software industry has forced Chinese vendors to develop strong customization and personalization capabilities, which indirectly demonstrates the potential of Chinese enterprises in technology implementation and should further promote the deployment of Agents.

What is the ultimate goal of AI Agents?

In the "Westworld" simulation described at the beginning of this article, the agents can interact with one another and with the environment (noticing each other's actions, starting conversations, or exchanging greetings), reflect on what they observe (forming their own viewpoints), and make daily plans. They have their own memories and goals, and they produce believable individual and emergent social behavior that was not scripted in advance.

For example, starting from a single user-specified task, such as organizing a Valentine's Day party, the agents spontaneously spread invitations, make new acquaintances, ask each other out on dates to the party, and coordinate to show up at the right time.

This is a representative Agent project. It surprised people because the agents' interactions produced phenomena beyond what humans expected. As AI Agents take off, the prevailing view is that Agents, by compensating for the shortcomings of large models, are more practical and will be an important direction for deploying large models.

As building Agents becomes simpler and the Agent ecosystem matures, consumer-facing (C-end) Agents will flourish; in facing users directly, Agents will become more down-to-earth, sparking a new round of growth.

However, as things stand, this path has many commercialization problems. In gaming, revenue currently comes mainly from selling in-game equipment, skins, and the like, and the value of AI Agents cannot be realized through these traditional monetization methods. Moreover, judging by how Agents perform today, they offer no disruptive capability, and it is unclear whether C-end users will pay for them.

More importantly, as C-end Agents proliferate, the value of each additional application will shrink toward zero because of diminishing marginal returns. In other words, it will take time to verify whether AI Agents can become the core application driving the commercialization of large models. And even if they do become that core application on the consumer side, their "lifespan" there will not be long.

The ultimate destination for AI Agents may instead be the enterprise side (B-end).

Bill Gates believes that agents, as the next platform, will change both how people use software and how software is written. They are better at finding information and summarizing it for users, and can find users the best prices; they will replace search engines and e-commerce websites, as well as word processors, spreadsheets, and other productivity applications. Businesses that are separate today, such as search advertising, social network advertising, shopping, and productivity software, will all become part of the agent business. Agents will fundamentally change the way applications are accessed.

Until these changes arrive, how to build an Agent is a more important question than the impact Agents themselves will have.

On an Agent construction platform, enterprises may be able to build their own RPA, CRM, OA, and other management software, and software vendors can also serve enterprises on the basis of such a platform.

For players in or preparing to enter the field of AI Agents, finding a point of entry and a good business model is crucial.

The future development of AI Agents will not be limited to individual intelligence; it will expand to intelligent connected things and to coordination among robots.

From the perspective of collective intelligence, ToC may give rise to large, community-based virtual organizations in which everyone's Agent is connected through virtual data, while ToB may give rise to virtual organizations and enterprises in which different companies and employees are linked into the network through agents.

Ultimately, society as a whole will become a vast network that blends the virtual and the real, an "intelligent network." In this network, different agents will deliver greater productivity, reshaping the relations of production and raising overall social productivity.

Therefore, the future prospects for the development of AI Agents are very broad. They will continue to expand their scope of application and influence, bringing about tremendous changes and opportunities for the future development of society.

To this day, although AI Agents have inspired many imaginative ideas, many doubts remain. The path of technological development is always full of questioning and criticism, yet technological change is an opportunity for every enterprise and individual. The key is how to seize it.
