AI+Web3 may become a major breakthrough in the integration and innovation of future industries.
How do you view the combination of AI and Web3 data? Which directions are worth paying attention to?
HashKey Capital - Harper: I see several ways to combine AI and Web3 data. First, using LLMs to convert natural language into SQL, as Dune does. Some projects here focus specifically on search optimization so that the correct data is retrieved from the database; others generate SQL from natural language that developers can copy and use directly. Second, building chat agents on top of ChatGPT-style dialogue. These mainly provide a chat window, place less emphasis on SQL and search optimization, and handle more casual questions, such as which KOL has made a recommendation or how a security incident affected a token. Third, using AI to build suitable models that organize and extract insights from off-chain and on-chain data.
In comparison, the first requires stronger database construction capabilities from the project, because processing Web3 data is very difficult and achieving both accuracy and speed remains challenging. The second is a relatively simple form of combination, with a fairly low barrier to entry.
SevenX Ventures - Yuxing: In fact, data is the nourishment of AI. Web3's data is open and verifiable, while AI's problem lies in its black box nature, making it difficult to verify. The combination of these two can produce some interesting chemical reactions. Currently, I am more inclined to divide the combination of AI and Web3 into two categories, not simply AI plus Web3 data, but considering how AI can make Web3 better, and how Web3 can make AI better.
First, AI can effectively utilize the open and verifiable nature of Web3 data. Any AI can use Web3 data to extract and generate value, whether it's investment advice or warning analysis. AI can help increase the efficiency of Web3 data processing and analysis. On the other hand, Web3 can increase the credibility of AI because Web3 itself is a new trust mechanism. Through the open and verifiable nature of Web3 data, transparency of AI can be increased, and in important areas such as news reporting or documentation, critical information can be stored in a Web3 manner to avoid some of the problems with AI.
Two of the more common problems are AI's fabrication of content and its black box nature. Some AI algorithms may be relatively easy to understand, but others are difficult to explain, such as complex algorithms like neural networks and GPT. If the data used by AI models is verifiable, we can more easily identify sample biases. By using Web3 data, the training sources and results of the entire AI model become clearer. This way, we can assess AI more fairly, understand where its decisions come from, and reduce biases and errors.
The black box problem can be roughly divided into two parts. One part is the black box of the model algorithm itself, including how the model is trained and how content is generated, which is opaque or unexplainable at the training process or algorithm mechanism level. The other part is the black box of the data, where non-public data and problems with the training set can lead to result biases.
If the bias concerns content accuracy, we can keep improving it. But if it involves ideological issues, especially political or racial discrimination, it may not be easy to correct. At that point, we can only control the output. For many national or state-owned AI models, for example, controlling the output is the most important task and also the most difficult one, and to some extent this control itself resembles an ideological bias.
Qiming Venture Partners - Tang Yi: Regarding the combination of AI and Web3 data, I personally think that AI may be somewhat hyped in this field, with more emphasis on publicity than actual utility. From my perspective, data products in the crypto space are still at a relatively early stage, and the foundational data work is not yet solid enough. In this situation, introducing AI or doing excessive data analysis may be premature.
Furthermore, from the user's perspective, in most scenarios where crypto projects are combined with AI, the AI is not very applicable or not very useful. The current wave of popular AI models, especially generative models, draws its capabilities, such as language processing and image generation, from large amounts of internet data. Although some people use generative AI to improve user experience and provide better interaction and conversation, this may have limited value in most scenarios. I believe that for AI in a broader sense (data analysis capabilities or simpler AI models), there may still be some scenarios, such as estimating NFT prices based on data, or professional trading teams using data to execute trading operations. Overall, for this current wave of AI, I have not yet seen opportunities that can bring the cryptocurrency industry significant benefits in the short term.
Of course, I have also seen some early projects that are trying to improve data processing or analysis capabilities through AI. For example, some use AI to explain the logic of smart contracts or to perform classification and recognition work. These tasks require high accuracy in the field of smart contracts and cryptocurrencies, as they involve critical actions such as transactions. Therefore, I can imagine that using some AI capabilities for data preprocessing may be meaningful, but ultimately human intervention may still be needed to ensure accuracy. As for triggering trades directly through AI, apart from professional traders, I believe significant progress is still needed on the product side.
Matrix Partners - Zixi: We have observed many Web3 data projects, such as our investment in Footprint, which I was initially a loyal user of, as well as Dune. I think Footprint and Dune mainly serve VCs, developers, and some small businesses, but their connection with ordinary users is not significant.
We have also looked at some data analysis companies directly related to cryptocurrency trading or profit, such as Nansen, DefiLlama, Token Terminal, DappRadar, and of course, Dune and Footprint. These companies are very useful for VCs and developers, but their profitability seems limited. The overall demand for this data from VCs and developers is not yet large enough, and their willingness to pay is not strong: even if some services are not free, there are always other companies providing similar services for free.
We have also looked at some companies similar to data cloud warehouses. We also co-led an investment with Tencent in Chainbase. They are actually a data platform, providing security, transaction, NFT, DeFi, gaming, social, and some comprehensive data. Developers can combine these data on these platforms to generate their own APIs.
In the bear market, we noticed that many customers of companies like Chainbase, BlockSec, and Footprint are actually small and medium-sized startups. For example, revenue from some of Chainbase's large clients did not decrease, but revenue from small and medium-sized clients dropped to zero after two or three months. This indicates that these projects could not continue due to lack of funds.
Therefore, for data providers, it is difficult to make money in the bear market without new developers joining. This also reflects that currently in the Web3 field, data providers mainly rely on developers and small businesses that believe the data is useful, then internally integrate this data, and then monetize it, balancing income and output.
In essence, we still feel that the profit model of both ToC and ToB Web3 data companies is not very clear at the moment, which is the biggest drawback we see in the Web3 data industry, especially for small and medium-sized entrepreneurs.
Now back to the topic of combining AI and Web3 data. We have recently looked at and invested in some AI-related data companies. I think AI data companies actually face the same problem, which is selling data. You need to consider the balance between the customer's cost and their output. At the moment, I am relatively optimistic about the profit prospects of AI data companies, but this is mainly limited to overseas markets.
If only targeting the domestic market, I am concerned that the eventual result may be similar to investing in Web2 SaaS companies, where there may be some revenue, but the business scale will not be too large, and the willingness of customers to pay is not very strong. You may also need to provide customized services, so your gross profit margin will not be very high. So I am more pessimistic about doing this in China and more optimistic about doing this overseas.
What value do you think AI can bring to Web3 data infrastructure and Web3 data companies? How effective is the current use of AI to help Web3 data projects? Can there be some innovation in the business model?
SevenX Ventures - Yuxing: I believe the biggest help AI can provide to Web3 data is efficiency. For example, Dune has released a tool that uses AI models for code anomaly detection and information indexing: users can query data in natural language, the corresponding code is generated, and the code can then be optimized, which is an improvement in efficiency.
There are also projects using AI for security alerts: after appropriate training, an AI bot can quickly identify security issues. For example, there is a class of AI algorithms called anomaly detection, which is better at spotting an outlier than examining the data distribution with purely statistical methods, so this kind of AI can monitor security more effectively.
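To make the contrast with a purely per-feature statistical check concrete, here is a minimal sketch of one simple distance-based detector (mean distance to the k nearest neighbours). The speaker does not name a specific algorithm, so this choice is an assumption, as are the feature names and the sample values. The last transaction has an ordinary amount and an ordinary gas value when each is viewed alone, but the combination is unusual, which the detector picks up.

```python
import math

def knn_anomaly_scores(points, k=3):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical, already normalised features per transaction: (amount, gas_used).
transactions = [
    (0.10, 0.12), (0.20, 0.19), (0.30, 0.31), (0.40, 0.42),
    (0.50, 0.48), (0.60, 0.61), (0.70, 0.72), (0.80, 0.79),
    (0.90, 0.88),
    (0.20, 0.85),  # each value is in the normal range, but the pair is unusual
]

scores = knn_anomaly_scores(transactions, k=3)
# Index of the transaction with the highest anomaly score.
most_anomalous = max(range(len(transactions)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 3))
```

A production system would likely use a trained model and far richer features; the sketch only shows why looking at points jointly can reveal outliers that per-column thresholds miss.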
I have also seen some projects using AI algorithms, such as large language models, to retrieve news data from across Web3 (not just on-chain data), perform information aggregation and sentiment analysis, and form an AI agent. For example, users can use the dialogue box to check network sentiment for a certain token over the last 30 or 90 days, see whether users lean bullish or bearish, and get corresponding scores that reflect the sentiment. There is also a curve from which users can judge whether discussion of a token is at its peak, declining from its peak, or on the rise. These can assist users in investment, which I think is a very interesting application.
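The aggregation step behind such a sentiment view can be sketched roughly as follows. This is only an illustration under stated assumptions: each post is assumed to have already been scored by some sentiment model in the range -1 (bearish) to 1 (bullish), and the function names, the sample data, and the 0.1 labeling threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def window_sentiment(posts, days):
    """Average score of posts younger than `days` days; None if there are none."""
    recent = [score for age, score in posts if age < days]
    return mean(recent) if recent else None

def sentiment_label(score, threshold=0.1):
    """Map an average score to a coarse label (threshold is an assumption)."""
    if score is None:
        return "no data"
    if score > threshold:
        return "bullish"
    if score < -threshold:
        return "bearish"
    return "neutral"

def daily_mentions(posts, days):
    """Mention counts per day, oldest day first (a crude 'discussion curve')."""
    counts = defaultdict(int)
    for age, _ in posts:
        if age < days:
            counts[age] += 1
    return [counts[age] for age in range(days - 1, -1, -1)]

# Hypothetical posts about one token: (age_in_days, sentiment_score).
posts = [(1, 0.6), (2, 0.4), (3, 0.8), (10, 0.2),
         (45, -0.7), (60, -0.9), (80, -0.5), (85, -0.6)]

score_30 = window_sentiment(posts, 30)
score_90 = window_sentiment(posts, 90)
print(sentiment_label(score_30), sentiment_label(score_90))
print(daily_mentions(posts, 5))
```

In this made-up data the last 30 days lean bullish while the 90-day view is close to neutral, which is the kind of difference a user would read off the two windows.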
However, some other projects claim that their data is an AI data source, which I think is a bit far-fetched, because any on-chain data can be an AI data source, as it is public, so there is some suspicion of riding the hot trend.
Matrix Partners - Zixi: The business model is a big problem in the data field right now, and it's difficult to find a solution. Perhaps on the ToC side, using some Web3 concepts, such as tokens or distributed concepts, can allow AI data to adopt different business models. But as far as AI technology empowering data is concerned, there are not many bright spots at the moment.
AI may have an auxiliary role in data processing and cleaning, but this is more of an internal help, such as improving functionality or user experience in the product development process. But from a business perspective, there is not much change.
AI bots can indeed add some competitiveness and assist users, but at the moment, this is not a major advantage, and the core competitiveness still depends on the quality of the data source. If the data source is sufficient, I can get the information I need. The problem is, if this data is to be commercialized, then what I combine must be monetizable for me to be willing to pay for the data. The current problem is that the market is not good, startups do not know how to monetize data, and there are not enough new startups entering the market.
I think what is interesting now is actually some Web2 companies that are using Web3 technology. For example, a synthetic data company that uses large models to generate synthetic data for use in software testing, data analysis, and AI model training. They encounter many privacy deployment issues when handling data, and they use the Oasis blockchain to effectively avoid data privacy issues. They also want to create a data exchange, packaging synthetic data in NFTs for trading, solving ownership and privacy issues. I think this is a good idea, using Web3 technology to assist Web2 in solving problems, not necessarily limited to Web3 companies. However, the market for synthetic data is not yet large enough, and there are risks in investing in such companies in the early stages. If the downstream market does not develop, or if there are too many competitors, the situation will be awkward.
In the field of AI+Web3 data, have you invested in any good projects? What are the key factors that determined your investment in them? What do you think is the core competitiveness of such projects? Will AI enhance this competitiveness?
HashKey Capital - Harper: We invested in data projects quite early, basically before they emphasized AI, such as Space and Time, 0xScope, Mind Network, and ZettaBlock. The key to our investment is their positioning and data quality. These projects now have AI plans, and they basically start with chat agents. Space and Time and ChainML have collaborated to launch infrastructure for creating AI agents, and the DeFi agent created with it is used for Space and Time, which is also a way of combining AI.
SevenX Ventures - Yuxing: If a project integrates well with AI, then I might be more interested in it. One of the key factors that determines whether I will invest is whether the project has market barriers. I have observed many projects claiming that their integration with AI can improve efficiency, such as fast data query functions. Some projects can quickly retrieve on-chain NFT data through natural language queries, such as querying the top ten most active NFTs in recent transactions. Such projects may have a first-mover advantage, but the market barriers may not be very strong.
The real barrier is the application of AI itself and how engineers apply AI to specific scenarios. If engineers can skillfully fine-tune models, they can usually achieve good results. For projects that improve efficiency, the market barrier mainly lies in the data source. It includes not only on-chain data but also how the project handles and parses this data. For example, the aforementioned projects can quickly retrieve important data through AI algorithms. However, the effect of engineers fine-tuning models is limited, and the real sustained advantage lies in the quality of the data source and its ability for continuous optimization. This is also why some data analysis companies can stand out in the market. They not only provide data sources but also data processing and analysis capabilities, and the difference often lies in the team's technical capabilities and talent. These factors directly relate to the final effect of AI integration applications.
In addition, I also pay attention to Web3 technology projects that can make AI better, because the AI market is very large. If Web3 technology can enhance the capabilities of AI, then the application scenarios will be very extensive. This is why the ZKML project is so popular. However, I have noticed that Web3 projects are often prone to exaggerating or undervaluing their value. Projects like ZKML, despite being highly anticipated, do not bring returns as quickly as expected, and the exit mechanism is not clear because issuing tokens is difficult. Therefore, although these projects are creative and have potential value, whether it is worth investing in them now and how much return they will ultimately bring are things that investors need to carefully consider.
Matrix Partners - Zixi: We have invested in a company that combines AI and Web3, which is a data labeling company called Questlab. They use blockchain technology to provide crowdsourced data labeling services. Data labeling was originally a direct-operated or subcontracted industry, and it was difficult to achieve full coverage in the knowledge domain.
In traditional data labeling, there are generally three types: direct operation, subcontracting, and crowdsourcing. However, there are relatively few people doing crowdsourcing. Companies using these three modes need to consider factors such as whether the price is cheap, the quality of labeling, and efficiency. Another factor is whether they can cover their industry. If you are only doing some general model language or image labeling, it's quite simple, just recognizing English words or images. A bit more difficult, such as distinguishing between cats, dogs, moons, strollers, etc., is not very difficult. But if you need to do more professional labeling, such as the labeling needed by the speech robot community, it becomes much more complex. They may need to label various dialects and multiple languages, including Chinese dialects, English dialects, and languages from various niche regions, and few traditional studios are willing to do this kind of work.
A more complex example is a legal AI company, which needs to label a large amount of legal knowledge to train various models. It is very difficult to find people who are knowledgeable in law and can do professional labeling, as they need to understand the laws of various countries and various specialized legal fields such as contract law, tenancy law, civil law, and criminal law. There is almost no data labeling company in the market that can provide such professional services. Law is specialized, and so are finance, biology, medicine, education, and so on. As a result, labeling work in these fields is generally done only by internal teams, and crowdsourcing is a way to solve the problem of professional knowledge coverage.
We believe that using blockchain for crowdsourcing is a promising direction, just like what YGG is doing in the GameFi field.
In addition, we believe that there will be some good opportunities in the open-source model community. For example, a project invested in by Polychain is similar to a web3 version of Hugging Face, aimed at solving the problem of model content creator economics.
As for other combinations of AI and Web3, I think it is feasible to combine some token gameplay in the ToC direction to increase the stickiness, daily activity, and emotions of the entire community. This also facilitates investors to monetize, but the market size is also uncertain. These are some of my views on AI and Web3. I think if it's a pure ToB business, there is no need to use Web3, and it's fine to do it in a Web2 way.
Qiming Venture Partners - Tang Yi: Currently, some of the data projects we have invested in are working with on-chain data in security scenarios. I think some basic pattern recognition or feature discovery work involving AI is effective, but more advanced work, such as inputting a large amount of activity data into models and identifying multiple pieces of information, is still in the experimental stage and the effectiveness needs to be verified. Apart from the security field, many other fields also have similar situations.
A recent example is NFTGo, a project we invested in, which uses big data analysis to price NFTs, with a certain level of accuracy, and plans to use it for price Oracles and other purposes. Although this system sounds interesting, it still needs to be verified in terms of product and user acceptance. Because even if it may currently achieve 90 or 85 percent accuracy, users may require a higher level, such as 98 or 95 percent, so further verification is needed. Therefore, while some projects are applying simple AI capabilities such as data analysis and pattern recognition to products, whether they become key factors has yet to be verified.
As for investment willingness, personally, I would not be more inclined to invest in a project just because it has some AI gimmicks, because I think the actual effect and whether the project can achieve its goals and bring benefits are more important. If a project is just using AI as a marketing tool to attract more attention or exposure, I can understand. But in investment decisions, I think the actual effect is more important.
Projects like ZKML seem to be highly anticipated, but at the same time, there are also major issues, such as what scenarios it is actually used for. I think there is currently a lot of uncertainty, and it is more of a grand narrative.
In terms of the overall industry development, what are the potential opportunities or directions for the future of the AI + Web3 data track? In the future, is it possible for AI to completely upgrade data products and introduce new concepts? Will it enhance users' willingness to pay?
HashKey Capital - Harper: There are definitely potential opportunities. That said, development here actually lags behind AI in Web2, where creativity is clearly stronger, and AI in Web3 is probably just a mapping of Web2 AI.
Matrix Partners - Zixi: I think the recent popularity of Miao Ya Camera has made people realize that there is still a willingness to pay for AI products, unlike traditional SaaS products or games, where people expect to use them for free. Users actually have a strong willingness to pay for AI.
In the future, I can offer some ideas. In our data labeling process, there is a key step called pre-labeling, where we train a model to perform preliminary labeling. This step is very valuable and can save a lot of labor costs. We input the original data into the pre-trained model for pre-labeling, and then carry out semi-automated data processing, finally manually performing precise labeling. Pre-labeling can significantly improve efficiency, possibly reducing the need from 100 people's work to only 50 to 70 people.
In addition, pre-labeling involves collaboration between AI and humans: through human feedback, the model's pre-labeling ability can be continuously improved, reducing the size of the data labeling team required. As AI and human collaboration improves, a team that originally required 100 people may only need 30. However, there is a limit to this process, and even if the collaboration works very well, a certain amount of manual labeling and review is still required.
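One common way to organize this kind of human-AI collaboration is to route pre-labels by model confidence, so that people spend their time on the items the model is unsure about. The sketch below assumes that approach; the speaker does not describe the actual mechanism, and the item IDs, labels, confidence values, and 0.9 threshold are hypothetical.

```python
def route_prelabels(predictions, threshold=0.9):
    """Split model pre-labels into auto-accepted items and items for human review."""
    accepted, needs_review = [], []
    for item_id, label, confidence in predictions:
        target = accepted if confidence >= threshold else needs_review
        target.append((item_id, label))
    return accepted, needs_review

# Hypothetical pre-labeling output: (item_id, predicted_label, model_confidence).
predictions = [
    ("img-1", "cat", 0.98),
    ("img-2", "dog", 0.95),
    ("img-3", "stroller", 0.62),
    ("img-4", "moon", 0.91),
    ("img-5", "cat", 0.55),
]

accepted, needs_review = route_prelabels(predictions)
# Share of items that still need full manual labeling in this made-up batch.
manual_share = len(needs_review) / len(predictions)
print(accepted, needs_review, manual_share)
```

Corrections made during review can be fed back as training data, which is how the share of items needing manual work would shrink over time. Even the auto-accepted items typically still get sampled for spot checks, consistent with the point that some manual review always remains.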
In other fields, as I am not a data scientist and have not personally cleaned data or used data for SQL queries, I am not sure how much help AI can provide in these areas.
Qiming Venture Partners - Tang Yi: I think in the long term, there should be some intersection between Web3 and AI. For example, from an ideological perspective, the value system of Web3 can be combined with AI, and is very suitable as a bot account system or value conversion system. Imagine a robot having its own account, being able to earn money through its intelligent part, and paying for the maintenance of its underlying computing power, and so on. These concepts are a bit science fiction, and actual application may still have a long way to go.
The second possible direction is to verify whether the output of AI models is based on specific categories or specific models, or specific data, and whether it is trustworthy. These areas may have some use in trustworthy AI models. From a technical perspective, these are very interesting, but whether there is enough market demand is still uncertain.
Another aspect is that the emergence of AI has made data content generation ubiquitous and cheap. For digital works and other content, it is difficult to determine their quality and creators. In this regard, the rightful ownership of data content may require a completely new system, including the roles of creators and intelligent entities. But overall, these issues may still need to be resolved, and narrative content may need more time to develop. In the short term, we should continue to focus on the quality of the underlying data and hope that the models will become more powerful.
In addition, in terms of commercialization, it is indeed very difficult to commercialize data products. But I think from a business perspective, AI may not be the solution to the commercialization of data products in the short term. Commercialization requires more productization efforts, not just data capabilities. Therefore, these projects may need to develop other products to achieve commercialization.