AI x Crypto: From Zero to the Peak

PANews

Introduction

The recent development of the AI industry is regarded by some as the fourth industrial revolution, with the emergence of large models significantly improving efficiency across industries. According to Boston Consulting Group, GPT has raised work efficiency in the United States by approximately 20%. The generalization ability brought by large models is hailed as a new paradigm for software design. In the past, software design meant precise coding; now it means embedding more generalized large-model frameworks into software, enabling the software to perform better and support a wider range of input and output modalities. Deep learning technology has indeed brought a fourth wave of prosperity to the AI industry, and this trend has also permeated the crypto industry.

Adoption rates of GPT in various industries, Source: Bain AI Survey

In this report, we will delve into the development history of the AI industry, its technological classifications, and the impact of deep learning technology invention on the industry. We will then analyze in depth the upstream and downstream of the industrial chain of deep learning, including GPU, cloud computing, data sources, edge devices, and their current status and trends. Subsequently, we will fundamentally discuss the relationship between the crypto and AI industries, and sort out the pattern of the AI industrial chain related to crypto.

Development History of the AI Industry

The AI industry started in the 1950s. To realize the vision of artificial intelligence, academia and industry, drawing on different disciplinary backgrounds in different eras, developed many schools of thought for achieving artificial intelligence.

Comparison of AI schools of thought, Image Source: Gate Ventures

AI/ML/DL relationship, Image Source: Microsoft

The main technology used in modern artificial intelligence is "machine learning," which aims to improve the performance of systems by allowing machines to iterate repeatedly on tasks based on data. The main steps are to feed data into the algorithm, train the model using this data, test and deploy the model, and use the model to complete automated prediction tasks.

Currently, machine learning has three main schools of thought: connectionism, symbolism, and behaviorism, which respectively mimic the human nervous system, human reasoning, and human behavior.

Neural network architecture diagram, Image Source: Cloudflare

Currently, connectionism represented by neural networks (also known as deep learning) is dominant. The main reason is that this architecture has an input layer, an output layer, and multiple hidden layers. Once the number of layers and neurons (parameters) becomes large enough, there is enough opportunity to fit complex general tasks. By inputting data, the parameters of neurons can be continuously adjusted, and after undergoing multiple iterations of data, the neurons will reach an optimal state (parameters). This is also the origin of the word "deep" – a sufficient number of layers and neurons.

For example, it can be understood as constructing a function. When we input X=2, Y=3; X=3, Y=5, if we want this function to handle all X values, we need to keep increasing its degree and its parameters. For instance, we can construct a function satisfying these conditions as Y = 2X - 1. However, if a new data point X=4, Y=11 arrives, we need to reconstruct a function suited to all three data points. Using a GPU for brute-force search, we find that Y = X^2 - 3X + 5 is more suitable; it does not need to pass through the data exactly, it only needs to maintain balance and give roughly similar outputs. Here, X^2, X, and X^0 represent different neurons, and 1, -3, 5 are their parameters.

When we input a large amount of data into the neural network, we can increase neurons and iterate parameters to fit new data. This allows us to fit all the data.
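
To make the idea of "iterating parameters to fit data" concrete, below is a minimal sketch (not from the original report) that fits a one-hidden-layer network to a handful of (X, Y) points with plain gradient descent; all names, sizes, and values are illustrative.

```python
import numpy as np

# Toy dataset: a few (X, Y) pairs like the example above.
X = np.array([[2.0], [3.0], [4.0]])
Y = np.array([[3.0], [5.0], [11.0]])

rng = np.random.default_rng(0)
hidden = 8                                     # number of hidden "neurons"
W1 = rng.normal(scale=0.5, size=(1, hidden))   # parameters (weights) of the first layer
b1 = np.zeros((1, hidden))
W2 = rng.normal(scale=0.5, size=(hidden, 1))   # parameters of the output layer
b2 = np.zeros((1, 1))

lr = 0.01
for step in range(5000):
    # Forward pass: input layer -> hidden layer (tanh) -> output layer.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - Y                             # gap between prediction and data
    loss = float(np.mean(err ** 2))

    # Backward pass: gradients of the mean-squared error w.r.t. each parameter.
    n = len(X)
    d_pred = 2 * err / n
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0, keepdims=True)
    d_h = d_pred @ W2.T * (1 - h ** 2)         # tanh derivative
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0, keepdims=True)

    # "Iterating the parameters": move each weight slightly against its gradient.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final loss: {loss:.4f}")
print("predictions:", (np.tanh(X @ W1 + b1) @ W2 + b2).ravel())
```

With more data, one simply adds more hidden neurons (columns of W1) and more iterations, which is exactly the "scale up parameters and data" recipe described above.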

Evolution of deep learning technology, Image Source: Gate Ventures

Building on neural networks, deep learning technology has undergone multiple iterations and evolutions, from the earliest neural networks to feedforward neural networks, RNNs, CNNs, and GANs, finally evolving into modern large models such as GPT that use the Transformer architecture. The Transformer approach adds an encoding step that converts data of any modality (such as audio, video, and images) into corresponding numerical representations, which are then fed into the neural network. This allows the network to fit any type of data and achieve multimodal capabilities.
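
The core computation inside a Transformer layer is scaled dot-product attention, introduced in "Attention Is All You Need." The minimal numpy sketch below (illustrative only: single head, no batching, toy shapes) shows how each token's representation is re-weighted by its similarity to every other token.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention over a sequence of token embeddings.

    Q, K, V: (seq_len, d) arrays of queries, keys, and values.
    Returns: (seq_len, d) array of attention outputs.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # pairwise token-to-token similarity
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mix of value vectors

# Toy example: 4 tokens, embedding dimension 8.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
# In a real Transformer, Q, K, V come from learned linear projections of the tokens.
Wq, Wk, Wv = (rng.normal(scale=0.3, size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(tokens @ Wq, tokens @ Wk, tokens @ Wv)
print(out.shape)  # (4, 8)
```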

The development of AI has gone through three technological waves. The first wave came in the 1960s, a decade after AI was first proposed. It was driven by the development of symbolic techniques, which addressed problems of general natural language processing and human-computer dialogue. In the same period, expert systems were born: under NASA's sponsorship, Stanford University completed the DENDRAL expert system, which possessed very strong chemistry knowledge and could infer answers to questions like a chemistry expert. This expert system can be seen as a combination of a chemistry knowledge base and an inference engine.

After expert systems, in the 1980s the Israeli-American scientist and philosopher Judea Pearl proposed Bayesian networks, also known as belief networks. Around the same period, Brooks proposed behavior-based robotics, marking the birth of behaviorism.

In 1997, IBM's Deep Blue defeated chess champion Kasparov with a score of 3.5:2.5, which was seen as a milestone in artificial intelligence, marking the second peak of AI technology development.

The third wave of AI technology began in 2006. The three giants of deep learning, Yann LeCun, Geoffrey Hinton, and Yoshua Bengio, put forward the concept of deep learning, an algorithm that uses artificial neural networks to learn representations from data. Deep learning algorithms then evolved step by step, from RNNs and GANs to the Transformer and Stable Diffusion; the latter two algorithms jointly shaped this third technological wave, which was also the heyday of connectionism.

Many landmark events have emerged gradually with the exploration and evolution of deep learning technology, including:

  • In 2011, IBM's Watson won the "Jeopardy" quiz show, defeating humans and becoming the champion.

  • In 2014, Goodfellow proposed the GAN (Generative Adversarial Network), which can generate lifelike photos by having two neural networks compete with and learn from each other. Goodfellow also co-authored the book "Deep Learning," known as the "bible of deep learning" and one of the important introductory texts in the field.

  • In 2015, Hinton and others published a landmark deep learning paper in the journal "Nature," which immediately drew a huge response in academia and industry.

  • In 2015, OpenAI was founded, with Musk, YC President Altman, and angel investor Peter Thiel jointly investing $1 billion.

  • In 2016, AlphaGo, based on deep learning technology, competed against Go world champion and professional nine-dan Go player Lee Sedol, winning with a total score of 4-1.

  • In 2017, Sophia, a humanoid robot developed by Hanson Robotics in Hong Kong, was declared the first robot to be granted citizenship, with rich facial expressions and human language understanding capabilities.

  • In 2017, Google, with abundant talent and technical reserves in the field of artificial intelligence, published the paper "Attention is all you need," proposing the Transformer algorithm, and large-scale language models began to emerge.

  • In 2018, OpenAI released GPT (Generative Pre-trained Transformer) based on the Transformer algorithm, which was one of the largest language models at the time.

  • In 2018, Google's DeepMind team released AlphaFold, a deep-learning-based system capable of predicting protein structures, seen as a significant milestone in the field of artificial intelligence.

  • In 2019, OpenAI released GPT-2, which had 1.5 billion parameters.

  • In 2020, OpenAI developed GPT-3, with 175 billion parameters, 100 times larger than the previous version GPT-2. This model was trained using 570GB of text and achieved state-of-the-art performance in multiple NLP (natural language processing) tasks such as answering questions, translation, and writing articles.

  • In November 2022, OpenAI launched ChatGPT, initially based on GPT-3.5; within about two months it reached 100 million users, becoming the fastest application in history to do so.

  • In 2023, OpenAI released GPT-4, reported to have 1.76 trillion parameters, roughly ten times that of GPT-3.

  • In 2024, OpenAI launched GPT-4o ("omni").

Note: Due to the large number of AI papers, diverse schools of thought, and varied technological evolution, this report mainly follows the development history of deep learning or connectionism. Other schools of thought and technologies are still in a rapid development process.

Deep Learning Industry Chain

Currently, the large language models (LLMs) all use deep learning methods based on neural networks. Large models led by GPT have created a wave of artificial intelligence, attracting a large number of players to this track. We have also seen a surge in market demand for data and computing power. Therefore, in this part of the report, we mainly explore the industrial chain of deep learning algorithms. In the AI industry dominated by deep learning algorithms, we examine how the upstream and downstream components are composed, as well as the current status, supply-demand relationship, and future development.

GPT training Pipeline, Image Source: WaytoAI

First, it is important to clarify that the training of large models based on Transformer technology, led by GPT, is divided into three steps.

Before training, because the model is Transformer-based, the input text first needs to be converted into numerical values, a process called tokenization; the resulting units are called tokens. As a general rule of thumb, one English word or symbol can be roughly treated as one token, while each Chinese character can be roughly treated as two tokens. The token is also the basic unit used in GPT's pricing.
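
As a quick illustration of tokenization (assuming the open-source tiktoken package, which implements the BPE tokenizers used by GPT-family models, is installed):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # encoding used by GPT-3.5/GPT-4-era models

text = "Deep learning reshaped the AI industry."
tokens = enc.encode(text)

print(tokens)                                 # a list of integer token IDs
print(len(text.split()), "words ->", len(tokens), "tokens")
print(enc.decode(tokens))                     # round-trips back to the original text
```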

The first step is pre-training. By providing a sufficiently large number of input data pairs, similar to the (X, Y) example given in the first part of the report, the optimal parameters for the model's neurons are sought. This step requires the most data and the most computing power, since the neurons must be iterated repeatedly to try various parameters. After training on a batch of data pairs, the same batch is generally reused for further rounds of training (epochs) to keep iterating the parameters.

The second step is fine-tuning. Fine-tuning involves training with a smaller batch of data, but of very high quality. This change improves the quality of the model's output, as pre-training requires a large amount of data, but much of the data may contain errors or be of low quality. Fine-tuning with high-quality data can improve the model's quality.

The third step is reinforcement learning. First, a brand-new model, called the "reward model," is built. Its purpose is very simple: to rank the outputs of the main model. Building it is therefore relatively easy, because the task is quite narrow. The reward model is then used to judge whether the large model's output is of high quality, so the large model's parameters can be iterated automatically against the reward model's feedback. (Sometimes, however, human judgment is still needed to assess the quality of the model's output.)
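
A reward model of this kind is commonly trained on ranked pairs of outputs with a pairwise (Bradley–Terry style) objective: the score of the preferred answer should exceed the score of the rejected one. The sketch below is a generic illustration of that objective, not OpenAI's actual implementation; the scores would come from whatever network produces a scalar rating.

```python
import numpy as np

def pairwise_reward_loss(score_chosen, score_rejected):
    """Pairwise ranking loss: -log(sigmoid(r_chosen - r_rejected)).

    Minimizing it pushes the reward model to score the preferred output
    higher than the rejected one.
    """
    margin = np.asarray(score_chosen) - np.asarray(score_rejected)
    return float(np.mean(np.log1p(np.exp(-margin))))  # equals -log(sigmoid(margin))

# Toy scores for three ranked output pairs (chosen vs. rejected).
chosen = np.array([2.1, 0.4, 1.3])
rejected = np.array([1.0, 0.9, -0.2])
print(pairwise_reward_loss(chosen, rejected))  # smaller when chosen > rejected
```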

In summary, during the training of large models, pre-training has a very high requirement for the amount of data, requiring the most GPU computing power. Fine-tuning requires higher-quality data to improve the parameters, and reinforcement learning can use a reward model to iteratively adjust the parameters to produce higher-quality results.

During training, the more parameters there are, the higher the ceiling of the model's generalization ability. In the earlier function example Y = aX + b, there are two neurons, X and X^0 (the constant term), with a and b as their parameters; as the number of parameters grows, the model can fit more data. This is why large models achieve remarkable results, and why they are popularly called large models: they consist of a huge number of neurons and parameters and a massive amount of data, and they in turn require massive computing power.

Therefore, the performance of large models is mainly determined by three factors: the number of parameters, the amount and quality of data, and computing power. Assuming the number of parameters is p and the number of training tokens is n, a general rule of thumb lets us estimate the required computing power, and from that the amount of hardware and the training time.

Computing power is generally measured in FLOPs, where one FLOP is a single floating-point operation, the collective term for addition, subtraction, multiplication, and division of non-integer numbers such as 2.5 + 3.557. Floating point indicates the ability to carry a decimal point; FP16 denotes half (16-bit) precision, while FP32 is the more common single (32-bit) precision. Based on practical experience, pre-training a large model once (and it is usually trained multiple times) requires approximately 6np FLOPs, where 6 is an industry rule-of-thumb constant. Inference (feeding in data and waiting for the model's output) is divided into two parts, processing n input tokens and generating n output tokens, and requires approximately 2np FLOPs in total.

In the early days, CPUs were used to provide training compute, but GPUs gradually replaced them, such as Nvidia's A100 and H100 chips. CPUs are designed for general-purpose computing, whereas GPUs are built for massively parallel computation and far surpass CPUs in throughput and energy efficiency for such workloads. On modern GPUs, the floating-point operations for AI are mainly performed by a module called the Tensor Core. A chip's FLOPS figures at FP16/FP32 precision therefore represent its main computing capability and are one of the key metrics for comparing chips.

Nvidia A100 Chip Specifications, Source: Nvidia

Readers should now be able to read these vendors' chip specifications. In the comparison between Nvidia's A100 80GB PCIe and SXM versions shown above, the FP16 Tensor Core (a module designed specifically for AI computation) throughput is listed as 312 TFLOPS, rising to 624 TFLOPS with structured sparsity.

Taking GPT-3 as an example, with 175 billion parameters and a data volume of 180 billion tokens (approximately 570GB), one round of pre-training requires 6np FLOPs, approximately 1.89×10^23 FLOPs, or about 1.89×10^11 TFLOPs. At the SXM chip's peak FP16 figure of 624 TFLOPS, a single chip would need roughly 3×10^8 seconds, about 5 million minutes, 84,000 hours, or 3,500 days (nearly 10 years), to pre-train GPT-3 once.
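
The back-of-the-envelope arithmetic above can be reproduced in a few lines (a rough estimate only: it ignores memory, bandwidth, and utilization, and real training runs on many GPUs in parallel):

```python
# Rough training-compute estimate using the 6*n*p rule of thumb.
n_tokens = 180e9            # training tokens (GPT-3 figure used in the text)
n_params = 175e9            # model parameters
chip_tflops = 624           # A100 SXM peak FP16 Tensor Core throughput quoted above, in TFLOPS

train_flops = 6 * n_tokens * n_params          # ~1.89e23 FLOPs for one pass
infer_flops = 2 * n_tokens * n_params          # ~6.3e22 FLOPs to process the same token count at inference

seconds = train_flops / (chip_tflops * 1e12)   # assuming 100% utilization of a single chip
print(f"training FLOPs: {train_flops:.2e}")
print(f"single-chip time: {seconds:.2e} s ~= {seconds / 86400:.0f} days ~= {seconds / (86400 * 365):.1f} years")
```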

This shows the enormous amount of computing power required: many state-of-the-art chips must work together to complete even one round of pre-training. Furthermore, if GPT-4 has ten times the parameters of GPT-3 (1.76 trillion), then even with the same data volume the compute, and hence the number of chips, would grow tenfold. And GPT-4 is reported to have been trained on 13 trillion tokens, dozens of times more than GPT-3, so in the end GPT-4 may require well over 100 times the chip computing power.

In large model training, data storage is also a concern. For example, GPT-3's 180 billion tokens occupy approximately 570GB of storage, while the 175 billion parameters of the network itself occupy approximately 700GB (at FP32). GPU memory is limited (the A100 mentioned above has 80GB), so when the memory cannot hold this data, the chip's memory bandwidth, the speed at which data moves from storage into GPU memory, must be considered. And because multiple chips are used, the speed of data transfer between GPUs also becomes a factor. In many cases, therefore, the limiting factor or cost in real-world training is not the chip's raw computing power but its bandwidth: slow data transfer lengthens training time and increases electricity costs.
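
A similarly rough calculation shows why a single 80GB card cannot even hold the weights, let alone optimizer state and activations (assumption: 4 bytes per parameter at FP32, 2 bytes at FP16):

```python
# Memory needed just to store the model weights, at different precisions.
n_params = 175e9
gpu_memory_gb = 80          # e.g. A100 80GB

for name, bytes_per_param in [("FP32", 4), ("FP16", 2)]:
    weights_gb = n_params * bytes_per_param / 1e9
    print(f"{name}: {weights_gb:,.0f} GB of weights "
          f"-> at least {weights_gb / gpu_memory_gb:.1f}x an 80GB GPU just for the weights")
```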

H100 SXM Chip Specifications, Source: Nvidia

At this point, readers should be able to fully understand a chip's specifications. FP16 denotes the precision; since the Tensor Core is the component mainly used for training LLMs, it is usually enough to look at the computing power of that component. The FP64 Tensor Core row indicates that the H100 SXM delivers 67 TFLOPS at 64-bit precision. The GPU memory row shows that the chip's on-board memory is 80GB, far from enough to hold a large model's data on its own, which is why GPU memory bandwidth, the speed of data transfer, matters: for the H100 SXM it is 3.35TB/s.

AI Value Chain, Image Source: Nasdaq

We can see that the expansion of data and the number of neural network parameters have led to a significant demand for computing power and storage. These three main factors have incubated an entire industry chain. We will introduce the role and function of each part in the industry chain based on the above image.

Hardware GPU Providers

AI GPU Chip Rankings, Source: Lambda

GPUs are the main chips used for training and inference, and Nvidia currently holds an absolute lead among GPU chip designers. In academia (mainly universities and research institutions), consumer-grade GPUs (such as the RTX series, primarily aimed at gaming) are widely used, while in industry, commercial deployment of large models mainly relies on the H100, A100, and similar data-center models.

Nvidia's chips dominate the rankings; every chip on the list comes from Nvidia. Google also has its own AI chip, the TPU, but TPUs are mainly used within Google Cloud to provide computing power to enterprise customers, and companies that buy their own chips generally prefer Nvidia GPUs.

H100 GPU Purchase Statistics by Company, Source: Omdia

Many enterprises have begun developing LLMs: there are over 100 large models in China alone, and over 200 large language models have been released globally. Many internet giants are joining this AI boom, either building large models themselves or renting computing power through cloud companies. In 2023, Nvidia's most advanced chip, the H100, was snapped up by multiple companies as soon as it became available. Global demand for H100 chips far exceeds supply, as Nvidia is currently the only supplier of the highest-end chips, and the delivery cycle has reached an astonishing 52 weeks.

Given Nvidia's near-monopoly, Google, one of the absolute leaders in artificial intelligence, has led Intel, Qualcomm, Microsoft, and Amazon in forming an alliance to challenge CUDA, hoping to jointly develop GPUs and software and break free from Nvidia's grip on the deep learning industry chain.

Super-large tech companies, cloud service providers, and national-level laboratories often purchase thousands or even tens of thousands of H100 chips to build HPC (high-performance computing) centers. For example, Tesla built a cluster of 10,000 H100 80GB GPUs; at an average purchase price of about $44,000 per chip (Nvidia's cost is estimated at roughly one tenth of that), the total comes to about $440 million. Tencent purchased 50,000 chips, and Meta bought a whopping 150,000. By the end of 2023, Nvidia, as the sole seller of such high-performance GPUs, had received orders for more than 500,000 H100 chips.

Nvidia GPU Product Roadmap, Source: Techwire

In terms of Nvidia's chip supply, the above is its product roadmap. As of this writing, the H200 has already been announced, with performance expected to be roughly double that of the H100, and the B100 is expected to be released at the end of 2024 or the beginning of 2025. GPU development still broadly follows Moore's Law, with performance doubling and prices halving roughly every two years.

Cloud Service Providers

Types of GPU Cloud, Source: Salesforce Ventures

After purchasing enough GPUs to build HPC clusters, cloud service providers can offer flexible computing power and hosted training solutions to AI enterprises with limited funds. As shown in the image above, the market is currently divided into three types of cloud computing providers. The first is the hyperscale cloud platform represented by the traditional cloud vendors (AWS, Google, Azure). The second is the vertical GPU cloud platform, deployed mainly for AI or high-performance computing and offering more specialized services, which leaves it some room to compete with the giants; these emerging vertical cloud companies include CoreWeave (raised $1.1 billion in Series C funding at a $19 billion valuation), Crusoe, Lambda (raised $260 million in Series C funding at a valuation of over $1.5 billion), and others. The third type is the newer inference-as-a-service provider, which rents GPUs from cloud providers and deploys pre-trained models for customers to fine-tune or run inference on; representative companies include Together.ai (latest valuation $1.25 billion) and Fireworks.ai (raised $25 million in Series A funding led by Benchmark).

Training Data Source Providers

As mentioned in the previous section, large model training mainly goes through three steps: pre-training, fine-tuning, and reinforcement learning. Pre-training requires a large amount of data, and fine-tuning requires high-quality data. Therefore, data companies such as Google (with a large amount of data) and Reddit (with high-quality Q&A data) are widely recognized in the market.

Some developers choose to build in niche areas to avoid competing head-on with general-purpose large models. Their data therefore needs to be specific to a particular industry, such as finance, healthcare, chemistry, physics, biology, or image recognition. These are models tailored to specific fields, and they require field-specific data, so companies have emerged to supply it. They can be called data labeling companies: they collect data and label it, providing higher-quality, domain-specific data.

For model development companies, a large amount of data, high-quality data, and specific data are the three main data demands.

Major Data Labeling Companies, Source: Venture Radar

A Microsoft study suggests that for SLMs (small language models), if the data quality is significantly better than that of LLMs, their performance is not necessarily worse. In fact, GPT has no clear advantage in originality or data; its success came mainly from OpenAI's bold bet on this direction. Sequoia Capital has likewise acknowledged that GPT may not maintain a competitive advantage in the future, because there is no strong moat in this area and the main constraint is access to computing power.

In terms of data volume, according to EpochAI's prediction, given the current model size growth, all low-quality and high-quality data will be exhausted by 2030. Therefore, the industry is currently exploring synthetic data for artificial intelligence, which can generate unlimited data. The bottleneck then becomes computing power, and this area is still in the exploratory stage, worthy of developers' attention.

Database Providers

Once we have data, it also needs to be stored, usually in a database, so that it can be manipulated efficiently. In traditional internet businesses we may have heard of MySQL, and in the Ethereum client Reth, of MDBX: local databases for storing business data or blockchain data. Different data types and businesses require different databases.

For AI data and deep learning training and inference tasks, the database currently used in the industry is called a "vector database." Vector databases are designed to efficiently store, manage, and index massive high-dimensional vector data. Because our data is not just numerical or textual, but also includes a large amount of unstructured data such as images and sounds, vector databases can store and process this unstructured data in the form of "vectors."
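
At their core, vector databases index embeddings and answer nearest-neighbor queries. The sketch below shows the underlying idea with brute-force cosine similarity in numpy; production systems (Chroma, Milvus/Zilliz, Pinecone, Weaviate, and the like) add approximate indexes, persistence, and filtering on top of this, so treat it only as an illustration of the concept.

```python
import numpy as np

class TinyVectorStore:
    """Toy in-memory vector store: brute-force cosine-similarity search."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = np.empty((0, dim))
        self.payloads = []                      # e.g. the original text, an image id, ...

    def add(self, vector, payload):
        v = np.asarray(vector, dtype=float).reshape(1, self.dim)
        self.vectors = np.vstack([self.vectors, v])
        self.payloads.append(payload)

    def query(self, vector, k=3):
        q = np.asarray(vector, dtype=float)
        norms = np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(q) + 1e-12
        sims = self.vectors @ q / norms         # cosine similarity to every stored vector
        top = np.argsort(-sims)[:k]
        return [(self.payloads[i], float(sims[i])) for i in top]

# Usage: embeddings would normally come from a model; random vectors stand in here.
rng = np.random.default_rng(0)
store = TinyVectorStore(dim=16)
for i in range(100):
    store.add(rng.normal(size=16), payload=f"doc-{i}")
print(store.query(rng.normal(size=16), k=3))
```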

Vector Database Classification, Source: Yingjun Wu

The main players in this field currently include Chroma (raised $18 million), Zilliz (latest round of $60 million), Pinecone, Weaviate, and others. We expect that, as demand for data grows and large models and applications spread into more niche fields, demand for vector databases will rise significantly. Because this field has strong technical barriers, investment tends to favor mature companies that already have customers.

Edge Devices

GPU HPC clusters usually consume a great deal of energy and generate a great deal of heat. In high-temperature environments chips limit their operating speed to bring the temperature down, commonly known as thermal throttling, so supporting equipment is needed to cool the chips and keep the HPC cluster running continuously.

So here we are dealing with two aspects of the industrial chain, namely energy supply (usually using electricity) and cooling systems.

On the energy supply side, electricity is the main source, and data centers and their supporting networks currently account for 2%-3% of global electricity consumption. BCG predicts that, as the parameters of deep learning large models grow and chips iterate, the power consumed by training large models will triple by 2030. Technology companies in China and abroad are actively investing in the energy sector, with the main directions including geothermal energy, hydrogen, battery storage, and nuclear power.

In terms of HPC cluster cooling, air cooling is currently predominant, but many venture capitalists are heavily investing in liquid cooling systems to maintain the stable operation of HPC. For example, Jetcool claims that its liquid cooling system can reduce the total power consumption of an H100 cluster by 15%. Currently, liquid cooling is mainly divided into three exploration directions: direct-to-chip liquid cooling, immersion liquid cooling, and spray liquid cooling. Companies in this field include Huawei, Green Revolution Cooling, SGI, and others.

Applications

The development of AI applications resembles that of the blockchain industry: an innovative sector whose infrastructure arrived well before its applications. The Transformer was proposed in 2017, but it was not until 2022-2023 that OpenAI demonstrated the effectiveness of large models at scale. Now many FOMO-driven enterprises are crowded into the large-model R&D race, which means the infrastructure layer is very crowded while application development has not kept pace.

Currently, most of the top ten AI applications by monthly active users are search-type products. Genuinely new AI applications remain limited and the application types are narrow; no breakout social application, for example, has yet emerged.

We also find that AI applications built on large models have much lower retention than established internet applications. In terms of active users, the median DAU/MAU for traditional internet software is 51%, with WhatsApp the highest, showing strong user stickiness; on the AI side, the highest DAU/MAU is character.ai's 41%, and the median is only 14%. In terms of user retention, the best traditional internet products, such as YouTube, Instagram, and TikTok, have a median retention rate of 63%, while ChatGPT's is only 56%.

AI Application Landscape, Source: Sequoia

According to a report from Sequoia Capital, AI applications can be divided into three categories by the audience they target: professional consumers (prosumers), enterprises, and ordinary consumers.

  1. Prosumer-oriented: generally used to improve productivity, such as knowledge workers using GPT for Q&A, automated 3D rendering and modeling, software editing, and automated agents, as well as voice applications for conversation, companionship, and language practice.
  2. Enterprise-oriented: typically serving marketing, legal, medical, design, and other industries.

Although many criticize the fact that infrastructure far outstrips applications, we believe the modern world has already been widely reshaped by artificial intelligence. The recommendation systems behind ByteDance's TikTok, Toutiao (Today's Headlines), and Quanmin Music, as well as Xiaohongshu, WeChat Video Accounts, and advertising recommendation engines, are all personalized recommendations driven by machine learning algorithms. So the flourishing of deep learning does not represent the whole AI industry: many other candidate technologies for achieving general artificial intelligence are developing in parallel, and some of them are already widely applied across industries.

The Relationship between Crypto and AI

Thanks in part to the development of ZK technology, blockchain has evolved around the ideas of decentralization and trustlessness. Going back to its creation, it started with the Bitcoin chain: in Satoshi Nakamoto's paper "Bitcoin: A Peer-to-Peer Electronic Cash System," it was first described as a trustless system for transferring value. Later, Vitalik and others published "A Next-Generation Smart Contract and Decentralized Application Platform," introducing a decentralized, trustless smart contract platform for exchanging value.

In essence, we believe the entire blockchain network is a value network, and every transaction is a conversion of value denominated in the underlying token. Value here is expressed in the form of tokens, and tokenomics is the set of rules governing the relative value of a network's native token. In the traditional internet, value is ultimately settled through metrics such as P/E, with a single final embodiment: the stock price. All traffic, value, and influence converge into the enterprise's cash flow, which is the ultimate manifestation of value, eventually reflected in the P/E ratio and market capitalization.

Network value is jointly determined by the price of native tokens and multi-dimensional perspectives, Source: Gate Ventures

For the Ethereum network, however, ETH embodies value along multiple dimensions: it generates steady cash flow through staking, serves as a medium of exchange and a store of value, acts as the consumable for network activity (gas), and additionally functions as a security layer for restaking and as the gas asset of the Layer 2 ecosystem.

Tokenomics matters because it dictates the relative value of the network's settlement asset (its native token). Although we cannot price each dimension separately, the token price is the combined embodiment of multi-dimensional value, and this goes far beyond what a corporate security can express. Once a token is issued to a network and begins to circulate, much as if all of Tencent's Q coins were of limited supply with deflation and inflation mechanisms, it comes to represent the vast ecosystem behind it, serving as a settlement asset, a store of value, and an interest-bearing asset. That value is far richer than a stock price, and the token is its ultimate, multi-dimensional embodiment.

Tokens are attractive because they can assign value to a function or even an idea. We all use browsers, yet no one prices the underlying open-source HTTP protocol; if a token were issued for it, its value would be revealed through trading. Even a meme coin, and the humorous idea behind it, carries value. Tokenomics can empower any innovation or any form of existence, whether an idea or a physical creation, because token economics puts a price on everything in the world.

Tokens and blockchain technology, as a means of redefining and discovering value, matter for any industry, including AI. In the AI industry, issuing tokens can reshape the value of each link in the industry chain and incentivize more people to commit deeply to its various segments, because the returns become more meaningful; current value is no longer determined by cash flow alone. The synergistic effect of tokens will also enhance the value of infrastructure, naturally giving rise to a fat-protocol, thin-application paradigm.

Furthermore, all projects in the AI industry chain will benefit from capital appreciation through these tokens, and these tokens can also contribute back to the ecosystem and promote the birth of certain philosophical ideas.

Clearly, tokenomics benefits the industry, and the immutable and trustless nature of blockchain also has practical significance for AI. It enables applications that require trust, for example allowing user data to be used by a model while guaranteeing that the model never sees the raw data, that the data is not leaked, and that the genuine inference results are returned. When GPUs are scarce, workloads can be distributed across a blockchain network; when GPUs are superseded, the idle cards can contribute computing power to the network and recover residual value, something only a global value network can achieve.

In summary, tokenomics can promote the reshaping and discovery of value, and decentralized ledgers can solve trust issues, allowing value to flow globally in a new way.

Overview of Value Chain Projects in the Crypto Industry

GPU Supply Side

Some participants in the GPU cloud computing power market, Source: Gate Ventures

The above are the main participants in the GPU cloud computing power market. Render, launched in 2020, has a solid market capitalization and fundamentals. However, because its data is not transparent, we cannot track its real-time business development; at present, the vast majority of workloads on Render are non-large-model video rendering tasks.

Render, an established DePIN project with real business volume, has indeed ridden the AI/DePIN wave successfully. But the scenarios Render targets differ from AI, so strictly speaking it is not part of the AI sector. Still, its video rendering business meets a genuine demand, which shows that GPU cloud computing markets can serve not only AI model training and inference but also traditional rendering tasks, reducing their dependence on a single market and the associated risk.

Global GPU computing power demand trend, Source: PRECEDENCE RESEARCH

In the Crypto industry's AI value chain, computing power supply is undoubtedly the most important point. According to industry forecasts, the market demand for GPU computing power is expected to be approximately $75 billion in 2024 and around $773 billion in 2032, with a compound annual growth rate (CAGR) of approximately 33.86%.

GPUs iterate roughly in line with Moore's Law (performance doubling every 18-24 months, prices halving), so demand for shared GPU computing power will be significant. As the GPU market booms, Moore's Law will leave behind a large stock of previous-generation GPUs, and these idle cards can keep delivering value as long-tail computing power in a shared network. We therefore see genuine long-term potential and practical utility in this track, serving not only small and medium-sized model workloads but also traditional rendering.

It is worth noting that many reports use the low prices of these products as the main selling point to argue that the on-chain GPU sharing market has vast room to grow. It must be emphasized, however, that cloud computing pricing depends not only on the GPUs themselves but also on data transmission bandwidth, supporting equipment, and AI-hosting developer tools. With bandwidth and supporting equipment held equal, token subsidies mean that part of the value is carried by tokens and network effects, so prices can indeed be lower. That is an advantage on price, but it comes with the drawback of slow data transfer across the network, which slows down model development and rendering tasks.

Hardware Bandwidth

Some participants in the shared bandwidth track, Source: Gate Ventures

As mentioned on the GPU supply side, cloud computing pricing is tied mainly to the GPU chips but also to bandwidth, cooling systems, and supporting AI development tools. In the section on the AI industry chain we also noted that, because of the size of large models' parameters and data, data transfer significantly affects training time. Bandwidth is therefore often the main bottleneck for large models, especially for on-chain cloud computing, where data exchange between users in different locations is slower and the impact is greater. Centralized HPC operated by providers such as Azure is far better placed to coordinate and improve bandwidth.

Meson Network architecture diagram, Source: Meson

Taking Meson Network as an example, its vision for the future is that users can easily convert surplus bandwidth into tokens, and those in need can access global bandwidth in the Meson market. Users can store data in their databases, and other users can access data stored by the nearest user to accelerate the exchange of network data and speed up model training.

However, we believe shared bandwidth is a pseudo-concept: for HPC, data is mainly stored on local nodes, whereas under a shared-bandwidth model the data sits some distance away (1km, 10km, 100km). The latency introduced by that geographic distance is far higher than reading data locally and leads to constant scheduling and reallocation. This pseudo-demand is one reason the market has not bought in: Meson Network's latest financing round valued it at $1 billion, yet after listing on exchanges its fully diluted valuation is only $9.3 million, less than one tenth of that figure.

Data

As mentioned in our discussion of the deep learning industry chain, the quality of large models is influenced by the number of model parameters, computing power, and data. This presents market opportunities for many data source companies and vector database providers, as they will provide various specific types of data services to enterprises.

Some projects in the AI data provider industry, Source: Gate Ventures

Projects already launched include EpiK Protocol, Synesis One, Masa, and others. The difference is that EpiK Protocol and Synesis One collect data from public sources, whereas Masa uses ZK technology to enable the collection of private data, which is friendlier to users.

Compared with traditional Web2 data companies, Web3 data providers have an advantage on the collection side, because individuals can contribute their own non-private data (and ZK technology can let users contribute private data without revealing it). This widens a project's reach beyond B2B: it can price any user's data, giving value even to historical data. Moreover, because of tokenomics, the network's value and its token price reinforce each other; tokens issued at near-zero cost appreciate as the network grows, and they lower developers' costs while rewarding users and strengthening their motivation to contribute data.

We believe that this mechanism, open to both Web2 and Web3 and allowing almost anyone to contribute data, makes "mass adoption" within a certain scope very achievable. On the consumption side, the various models represent real supply and demand, and users can contribute with a few clicks, so the barrier to participation is low. The remaining question is privacy-preserving computation, so data providers working in the ZK direction may have good prospects, with Masa a typical example.

ZKML

ZK Training / Inference Projects, Source: Gate Ventures

If data needs privacy-preserving computation and training, the industry currently relies mainly on ZK solutions, combined with techniques such as homomorphic encryption: inference is performed off-chain and the results are uploaded together with ZK proofs, which preserves data privacy while keeping costs low, since running inference directly on-chain is impractical. This is also why the investors behind the ZKML track are generally of higher quality; it is the approach that actually fits the business logic.

Beyond projects focused on off-chain AI training and inference, there are also general-purpose ZK projects that offer Turing-complete ZK co-processing, that is, ZK proofs for arbitrary off-chain computation and data, including Axiom, Risc Zero, and Ritual, which are also worth watching. Such projects have a wider application boundary and give VCs more margin for error.

AI Applications

AI x Crypto application landscape, Source: Foresight News

AI applications in blockchain look much like those in the traditional AI industry: most of the activity is concentrated on infrastructure. The upstream of the industry chain is flourishing, while the downstream, the application end, remains relatively weak.

These AI + blockchain applications mostly add automation and generalization to traditional blockchain applications. For example, DeFi can execute optimal trading and lending paths based on a user's stated intent; applications of this kind are called AI Agents. The fundamental breakthrough of neural networks and deep learning is their generalization ability, adapting to the diverse needs of different users and to different modalities of data.

We believe this generalization ability will first benefit AI Agents, which act as a bridge between users and all kinds of applications, helping users make complex on-chain decisions and choose optimal paths. Fetch.AI is a representative project here (currently with a market capitalization of roughly $2.1 billion), and we will briefly describe how AI Agents work using Fetch.AI as an example.

Fetch.AI architecture diagram, Source: Fetch.AI

The diagram above shows the architecture of Fetch.AI. Fetch.AI defines an AI Agent as "a self-running program on the blockchain network that can connect, search, and trade, and can be programmed to interact with other agents in the network." DeltaV is the platform for creating agents, and registered agents form an agent library called Agentverse. The AI Engine parses the user's text and intent, converts it into precise instructions an agent can accept, and then finds the most suitable agent in Agentverse to execute them. Any service can register as an agent, creating an intent-driven embedded network that is well suited to being embedded in applications such as Telegram: the entry point is Agentverse, and any request typed into the chat box is picked up by a corresponding agent and executed on-chain. Through a wide range of dApps, Agentverse can complete on-chain interaction tasks. We believe AI Agents have practical significance and native demand in the blockchain industry: large models give applications a brain, while AI Agents give them hands.
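
To make the "parse the user's intent and route it to the best agent" idea concrete, here is a deliberately simplified, hypothetical sketch. It is not Fetch.AI's SDK or the DeltaV/Agentverse API; the registry, the keyword-based matcher, and the handler names are invented for illustration, and a production system would use an LLM and embeddings rather than keyword overlap to parse intent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Agent:
    name: str
    keywords: set                      # crude stand-in for a capability description
    handler: Callable[[str], str]      # what the agent does when selected

@dataclass
class AgentRegistry:                   # conceptual analogue of an "agent library"
    agents: Dict[str, Agent] = field(default_factory=dict)

    def register(self, agent: Agent):
        self.agents[agent.name] = agent

    def route(self, user_text: str) -> str:
        """Pick the agent whose keywords overlap most with the request."""
        words = set(user_text.lower().split())
        best = max(self.agents.values(), key=lambda a: len(a.keywords & words))
        return best.handler(user_text)

registry = AgentRegistry()
registry.register(Agent("swap-agent", {"swap", "trade", "exchange"},
                        lambda t: f"[swap-agent] building optimal swap route for: {t}"))
registry.register(Agent("lend-agent", {"lend", "borrow", "loan"},
                        lambda t: f"[lend-agent] finding best lending rate for: {t}"))

print(registry.route("please swap 1 ETH for USDC at the best rate"))
```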

According to current market data, Fetch.AI has approximately 6,103 AI Agents online. Relative to that number of agents the project may look overvalued, which suggests the market is willing to pay a premium for its vision.

AI Public Chains

Public chains such as Tensor, Allora, Hypertensor, AgentLayer, etc., are adaptive networks built specifically for AI models or agents, representing a native link in the AI industry chain.

Allora architecture, Source: Allora Network

We will briefly describe the operation principle of this type of AI chain using Allora:

  1. Consumers seek inference from the Allora Chain.
  2. Miners run inference and prediction models off-chain.
  3. Evaluators are responsible for evaluating the quality of the inference provided by miners. Evaluators are usually experts in authoritative fields and are responsible for accurately assessing the quality of the inference.

Similar to RLHF (reinforcement learning from human feedback), inferences are uploaded to the chain, and on-chain evaluators can improve the model's parameters by ranking the results, which also benefits the models themselves. Likewise, by distributing tokens, projects built on tokenomics can significantly reduce the cost of inference, which plays a crucial role in such a project's development.
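
A hedged sketch of the general idea follows (not Allora's actual protocol logic; the weighting scheme and all names are invented for illustration): evaluator scores can be used both to weight the miners' inferences into a single network answer and to update each miner's reputation.

```python
import numpy as np

def aggregate_inferences(inferences, evaluator_scores, reputations, lr=0.1):
    """Combine miner inferences using evaluator scores, and update reputations.

    inferences:        numeric predictions, one per miner
    evaluator_scores:  quality scores assigned by evaluators (higher is better)
    reputations:       each miner's current stake/reputation weight
    """
    inferences = np.asarray(inferences, dtype=float)
    scores = np.asarray(evaluator_scores, dtype=float)
    reputations = np.asarray(reputations, dtype=float)

    # Weight = evaluator score x reputation, normalized to sum to 1.
    weights = scores * reputations
    weights = weights / weights.sum()
    network_answer = float(weights @ inferences)

    # Nudge reputations toward the evaluators' relative scores.
    new_reputations = reputations + lr * (scores / scores.sum() - reputations / reputations.sum())
    return network_answer, new_reputations

answer, reps = aggregate_inferences(
    inferences=[101.0, 98.5, 120.0],       # three miners' predictions
    evaluator_scores=[0.9, 0.8, 0.2],      # evaluators rank the third as low quality
    reputations=[1.0, 1.0, 1.0],
)
print(answer, reps)
```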

In traditional RLHF pipelines, a scoring (reward) model is usually trained, but it still requires human involvement, its cost is hard to reduce, and the pool of participants is limited. Crypto, by contrast, can bring in far more participants and stimulate a broader network effect.

Conclusion

First and foremost, it is important to emphasize that the current discussions on AI development and industry chains are actually based on deep learning technology, which does not represent all directions of AI development. There are still many non-deep learning and promising technologies incubating, but due to the excellent performance of GPT, most of the market's attention has been drawn to this effective technological path.

Some industry heavyweights believe that current deep learning technology cannot achieve general artificial intelligence, so this technology stack may ultimately be a dead end. We believe, however, that the technology already has real significance and real demand scenarios, GPT being one. Like TikTok's recommendation algorithm, machine learning that does not amount to artificial intelligence yet genuinely optimizes recommendations across information feeds, it delivers value today. We therefore still regard this field as worth sustained, rational, and serious investment.

Tokens and blockchain technology, as a means of redefining and discovering value (with global liquidity), are also favorable for the AI industry. Issuing tokens can reshape the value of every link in the AI industry chain and incentivize more people to commit deeply to its segments, because returns will no longer be determined by cash flow alone. Furthermore, every project in the chain can gain capital appreciation through these tokens, which can in turn feed back into the ecosystem and even foster new philosophical ideas.

The immutable and trustless nature of blockchain also has practical significance for AI, enabling applications that require trust: user data can be used by a model while guaranteeing that the model never sees the raw data, that the data is not leaked, and that the genuine inference results are returned. When GPUs are scarce, workloads can be distributed across the blockchain network; when GPUs are superseded, idle cards can contribute computing power to the network, putting obsolete resources back to work, something only a global value network can accomplish.

The drawback of GPU computing networks is bandwidth. For HPC clusters this is addressed by concentrating bandwidth locally to accelerate training, but on shared GPU platforms, even though idle computing power can be mobilized and costs lowered through token subsidies, geographic dispersion makes training very slow, so such idle capacity is suited only to non-urgent, smaller models. These platforms also lack supporting developer tools, so for now medium and large enterprises still prefer traditional cloud platforms.

In conclusion, we still recognize the practical utility of AI x Crypto combination. Tokenomics can reshape value and discover a broader perspective of value, while decentralized ledgers can solve trust issues, facilitate value flow, and discover surplus value.

References

Galaxy: A Comprehensive Interpretation of the Crypto+AI Track

List of Full Names of AI Data Center Industry Chains in US Stocks

You’re Out of Time to Wait and See on AI

GPT Status

AI INFRASTRUCTURE EXPLAINED
