IOSG Ventures: Where is the way out for homogenized AI infrastructure?


Original Author: IOSG Ventures

Thanks to Zhenyang@Upshot, Fran@Giza, Ashely@Neuronets, Matt@Valence, and Dylan@Pond for their feedback.

This study explores which AI fields matter most to developers, and which may become the next breakout opportunities at the intersection of Web3 and AI.

Before sharing our new research perspectives, we are delighted to announce that we participated in RedPill's $5 million first financing round, and we look forward to growing together with RedPill!


TL;DR

As the combination of Web3 and AI becomes a hot topic in the cryptocurrency world, AI infrastructure in the crypto space is thriving. Yet there are not many actual applications that use AI or are built for AI, and the homogenization of AI infrastructure is gradually becoming apparent. Our recent participation in RedPill's first financing round prompted some deeper reflections.

  • The main tools for building AI Dapps include decentralized OpenAI access, GPU networks, inference networks, and agent networks.

  • GPU networks are more popular now than during the "Bitcoin mining era" because: the AI market is larger and growing rapidly and steadily; AI supports millions of applications every day; AI requires diverse GPU models and server locations; the technology is more mature than before; and the customer base is broader.

  • Inference networks and agent networks share similar infrastructure but differ in focus. Inference networks mainly serve experienced developers deploying their own models, and running non-LLM models does not necessarily require a GPU. Agent networks focus on LLMs: developers do not bring their own models, but instead concentrate on prompt engineering and on connecting different agents. Agent networks always require high-performance GPUs.

  • AI infrastructure projects make big promises and keep shipping new features.

  • Most native crypto projects are still in the testnet stage, with poor stability, complex configurations, limited functionality, and need time to prove their security and privacy.

  • Assuming AI Dapps become a major trend, there are still many undeveloped areas, such as monitoring, RAG-related infrastructure, Web3-native models, decentralized agents with built-in crypto-native APIs, and evaluation networks.

  • Vertical integration is a significant trend. Infrastructure projects attempt to provide one-stop services to simplify the work of AI Dapp developers.

  • The future will be hybrid: some inference will run on the front end and some on-chain, balancing cost against verifiability.

Source: IOSG

Introduction

  • The combination of Web3 and AI is one of the most prominent topics in the current crypto space. Talented developers are building AI infrastructure for the crypto world, aiming to bring intelligence to smart contracts. Building AI Dapps is an extremely complex task: developers must handle data, models, computing power, operations, deployment, and blockchain integration. In response to these needs, Web3 founders have developed many preliminary solutions, such as GPU networks, community data labeling, community-trained models, verifiable AI inference and training, and agent stores.

  • Despite this thriving infrastructure, few applications actually use AI or are built for AI. When developers search for AI Dapp development tutorials, they find that few cover crypto-native AI infrastructure; most only show how to call the OpenAI API on the front end.

Source: IOSG Ventures

  • Current applications have not fully utilized the decentralization and verifiability features of blockchain, but this situation will soon change. Now, most AI infrastructure focused on the crypto space has launched test networks and plans to go live within the next 6 months.

  • This study will provide a detailed introduction to the main tools available in the AI infrastructure of the crypto space. Let's get ready for the GPT-3.5 moment in the crypto world!

1. RedPill: Decentralized OpenAI Access

As mentioned above, RedPill, which we invested in, is a good entry point.

OpenAI has several world-class powerful models, such as GPT-4-vision, GPT-4-turbo, and GPT-4o, which are the preferred choices for building advanced AI Dapps.

Developers can integrate these models into their dApps through oracles or front-end interfaces by calling the OpenAI API.

RedPill aggregates OpenAI API access from many contributors under a single interface, providing fast, economical, and verifiable AI services to users worldwide and democratizing access to top AI models. Its routing algorithm directs a developer's request to an individual contributor. API requests are executed through this distribution network, bypassing potential restrictions from OpenAI and solving common problems crypto developers face, such as:

  • TPM (tokens per minute) limits: New accounts have limited token usage, which cannot meet the needs of popular and AI-dependent dApps.

  • Access restrictions: Some models have access restrictions for new accounts or certain countries.

By using the same request code but changing the hostname, developers can access OpenAI models at a low cost, with high scalability, and without restrictions.
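To make the "same code, different hostname" idea concrete, the sketch below builds an identical OpenAI-style chat request against two base URLs. The RedPill endpoint shown is a placeholder for illustration, not the real one.

```python
# Sketch: the same OpenAI-style request can target either api.openai.com or
# an aggregator such as RedPill by swapping only the base URL.
# The RedPill URL below is a hypothetical placeholder.
import json
from urllib.request import Request

OPENAI_BASE = "https://api.openai.com/v1"
REDPILL_BASE = "https://api.redpill.example/v1"  # hypothetical endpoint

def build_chat_request(base_url: str, api_key: str, prompt: str) -> Request:
    """Build a chat-completions request; only `base_url` differs per provider."""
    payload = json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# The request body and headers are identical; only the host differs.
openai_req = build_chat_request(OPENAI_BASE, "sk-...", "gm")
redpill_req = build_chat_request(REDPILL_BASE, "rp-...", "gm")
```

In practice a developer would keep their existing client code and point the SDK's base URL at the aggregator.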


2. GPU Network

In addition to using OpenAI's API, many developers choose to host models themselves. They can rely on decentralized GPU networks, such as io.net, Aethir, Akash, and other popular networks, to build GPU clusters and deploy and run various powerful internal or open-source models.

Such decentralized GPU networks can leverage the computing power of individuals or small data centers, providing flexible configurations, more server location choices, and lower costs, allowing developers to easily conduct AI-related experiments within a limited budget. However, due to the decentralized nature, these GPU networks still have certain limitations in terms of functionality, availability, and data privacy.

In the past few months, there has been a surge in demand for GPUs, surpassing the previous Bitcoin mining craze. The reasons for this phenomenon include:

  • The target customer base is growing: GPU networks now serve AI developers, a group that is not only large but also stickier, being unaffected by cryptocurrency price fluctuations.

  • Compared with mining-specific hardware, decentralized GPUs offer more models and specifications, better matching workload requirements: large models need high-VRAM cards, while smaller tasks have cheaper suitable options. In addition, decentralized GPUs can serve end users at close range, reducing latency.

  • The technology is maturing: GPU networks rely on high-speed blockchains such as Solana for settlement, Docker for virtualization, and Ray for compute clustering.

  • In terms of investment returns, the AI market is expanding, with many opportunities to develop new applications and models. The expected annual return on an H100 is 60-70%, whereas Bitcoin mining is more contested, with capped output and a winner-takes-all dynamic.

  • Bitcoin mining companies such as Iris Energy, Core Scientific, and Bitdeer are also beginning to support GPU networks, providing AI services, and actively purchasing GPUs designed for AI, such as the H100.
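The return figures above can be made concrete with some back-of-the-envelope math. The hardware price, hourly rate, utilization, and network fee below are illustrative assumptions, not quotes.

```python
# Back-of-the-envelope payback for renting out an H100 on a GPU network.
# All numbers are illustrative assumptions, not market quotes.
def payback_months(gpu_cost_usd: float, hourly_rate_usd: float,
                   utilization: float, network_fee: float = 0.2) -> float:
    """Months to recoup the card at a given utilization rate and network fee."""
    monthly_revenue = hourly_rate_usd * 24 * 30 * utilization * (1 - network_fee)
    return gpu_cost_usd / monthly_revenue

# E.g. a $30k H100 rented at $2.5/h, 70% utilized, 20% network fee:
months = payback_months(30_000, 2.5, 0.70)
annual_return = 12 / months  # implied simple annual return multiple
```

Higher utilization or hourly rates push the implied return toward the 60-70% range cited above; the point of the sketch is only that the payback math is straightforward to model.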

Recommendation: For Web2 developers who do not prioritize SLA, io.net provides a simple and user-friendly experience, making it a cost-effective choice.

3. Inference Networks

This is the core of native crypto AI infrastructure. It will support billions of AI inference operations in the future. Many AI Layer 1s and Layer 2s give developers the ability to call AI inference natively on-chain. Market leaders include Ritual, Valence, and Fetch.ai.

These networks differ in the following aspects:

  • Performance (latency, computation time)

  • Supported models

  • Verifiability

  • Price (on-chain consumption cost, inference cost)

  • Development experience

3.1 Objectives

Ideally, developers should be able to access custom AI inference services anywhere, with any form of proof, with almost no obstacles during integration.

Inference networks provide all the foundational support developers need, including on-demand generation and verification of proofs, inference computation, relay and verification of inference data, providing interfaces for Web2 and Web3, one-click model deployment, system monitoring, cross-chain operations, synchronous integration, and scheduled execution.

Source: IOSG Ventures

With these features, developers can seamlessly integrate inference services into their existing smart contracts. For example, when building DeFi trading bots, these bots will use machine learning models to find buying and selling opportunities for specific trading pairs and execute corresponding trading strategies on the underlying trading platform.

In the fully ideal state, all infrastructure is cloud-hosted. Developers simply upload their trading-strategy model in a common format such as torch, and the inference network stores the model and serves queries from both Web2 and Web3.

Once all model deployment steps are completed, developers can directly call model inference through Web3 API or smart contracts. The inference network will continue to execute these trading strategies and provide feedback to the underlying smart contracts. If the developer manages a large amount of community funds, verification of the inference results is also required. Once the inference results are received, the smart contract will execute trades based on these results.
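The trading-bot flow above can be sketched with mocks. Every class and method name here is hypothetical, standing in for a real inference-network SDK, not depicting any particular project's API.

```python
# Minimal sketch of the flow described above: a strategy model is queried
# through an inference network, the result is verified, and the contract
# acts on it. All components are mocks.
from dataclasses import dataclass

@dataclass
class InferenceResult:
    signal: str          # "buy", "sell" or "hold"
    proof: bytes         # attached verification proof

class MockInferenceNetwork:
    def infer(self, model_id: str, features: dict) -> InferenceResult:
        # A real network would run the uploaded torch model remotely.
        signal = "buy" if features["momentum"] > 0 else "sell"
        return InferenceResult(signal, proof=b"zk-proof-placeholder")

    def verify(self, result: InferenceResult) -> bool:
        return result.proof.startswith(b"zk-")

class TradingContract:
    def __init__(self, network: MockInferenceNetwork):
        self.network, self.trades = network, []

    def on_new_block(self, model_id: str, features: dict):
        result = self.network.infer(model_id, features)
        if self.network.verify(result):   # required when managing pooled funds
            self.trades.append(result.signal)

bot = TradingContract(MockInferenceNetwork())
bot.on_new_block("eth-usdc-strategy", {"momentum": 1.2})
```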

Source: IOSG Ventures

3.1.1 Asynchronous and Synchronous

In theory, asynchronous inference operations can provide better performance; however, this approach may be inconvenient in terms of development experience.

When using the asynchronous approach, developers need to first submit tasks to the smart contract of the inference network. When the inference task is completed, the smart contract of the inference network will return the results. In this programming mode, the logic is divided into two parts: inference calls and inference result processing.

Source: IOSG Ventures

If developers have nested inference calls and a large amount of control logic, the situation will become even worse.

Source: IOSG Ventures

The asynchronous programming model makes integration with existing smart contracts difficult: developers must write a lot of extra code, handle errors, and manage dependencies.
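A minimal sketch of the asynchronous pattern shows why the logic splits into two halves. The class below is a mock, not a real inference-network contract.

```python
# Sketch of the asynchronous pattern: the caller submits a task, and a
# separate callback later receives the result. The two halves of the logic
# live in different places, which is what complicates integration.
class AsyncInferenceNetwork:
    def __init__(self):
        self.pending = {}
        self.next_id = 0

    def submit(self, prompt: str, callback) -> int:
        """Part 1: the contract submits an inference task and gets a task id."""
        self.next_id += 1
        self.pending[self.next_id] = (prompt, callback)
        return self.next_id

    def fulfill(self, task_id: int, result: str):
        """Part 2: when inference completes, the caller's callback runs."""
        _, callback = self.pending.pop(task_id)
        callback(result)

results = []
net = AsyncInferenceNetwork()
task = net.submit("classify tx", results.append)   # inference call
net.fulfill(task, "benign")                        # result processing
```

With nested inference calls, each level adds another callback, which is the situation the figure above illustrates.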

In comparison, synchronous programming is more intuitive for developers, but it introduces problems with response time and blockchain design. For example, if the input is fast-changing data such as block time or price, then by the time inference completes the data is no longer fresh, which in specific cases may force the smart contract execution to roll back. Imagine trading on an outdated price.

Source: IOSG Ventures
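The staleness problem can be sketched as a freshness check that reverts the transaction when inference took too long. The threshold and names are illustrative.

```python
# Sketch of the staleness problem with synchronous inference: if the price
# input aged while inference ran, the contract must revert rather than
# trade on stale data. The threshold is an illustrative assumption.
MAX_AGE_BLOCKS = 2

def execute_with_inference(price_block: int, current_block: int, signal: str) -> str:
    """Revert if the input price is older than MAX_AGE_BLOCKS blocks."""
    if current_block - price_block > MAX_AGE_BLOCKS:
        raise RuntimeError("stale input: reverting transaction")
    return f"executed {signal}"
```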

Most AI infrastructure adopts asynchronous processing, but Valence is trying to address these issues.

3.2 Reality

In reality, many new inference networks are still in testing, such as the Ritual network. According to their public documentation, the functionality of these networks is still limited (features such as verification and proofs are not yet live). They do not yet offer cloud infrastructure that supports on-chain AI computation; instead, they provide a framework for self-hosting AI computation and relaying the results on-chain.

This is an architecture for running AIGC NFTs: a diffusion model generates the artwork and uploads it to Arweave, and the inference network then uses the Arweave address to mint the NFT on-chain.

Source: IOSG Ventures

This process is very complex, and developers need to deploy and maintain most of the infrastructure themselves, such as Ritual nodes with custom service logic, Stable Diffusion nodes, and NFT smart contracts.
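The pipeline just described can be mocked in a few lines; the diffusion model, Arweave upload, and NFT contract below are all stand-ins, illustrating only the shape of the flow.

```python
# Mock of the AIGC NFT pipeline: generate an image with a diffusion model,
# upload it to Arweave, then mint an NFT that points at the Arweave address.
# All three components are stand-ins for the real infrastructure.
import hashlib

def diffusion_generate(prompt: str) -> bytes:
    return f"image-for:{prompt}".encode()      # stand-in for Stable Diffusion

def arweave_upload(data: bytes) -> str:
    tx_id = hashlib.sha256(data).hexdigest()[:16]
    return f"ar://{tx_id}"                     # stand-in permanent address

class NFTContract:
    def __init__(self):
        self.token_uri = {}
    def mint(self, token_id: int, uri: str):
        self.token_uri[token_id] = uri

art = diffusion_generate("a cat on the moon")
uri = arweave_upload(art)
contract = NFTContract()
contract.mint(1, uri)                          # inference node mints on-chain
```

In the real architecture, each of these three steps is a separate service the developer must deploy and maintain, which is exactly the complexity described above.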

Recommendation: Currently, most inference networks make integrating and deploying custom models quite complex, and at this stage most do not yet support verification. Applying AI on the front end remains the relatively simple option for developers. If verification is a hard requirement, the ZKML provider Giza is a good choice.

4. Agent Networks

Agent networks let users easily build customized agents. Such networks are composed of entities or smart contracts that can autonomously execute tasks, interact with each other, and interact with blockchain networks, all without direct human intervention. They mainly target LLM technology: for example, a GPT chatbot with deep knowledge of Ethereum. The tooling for such chatbots is still limited, and developers cannot yet build complex applications on top of them.

Source: IOSG Ventures

However, in the future agent networks will provide agents with more tools to use: not only knowledge, but also the ability to call external APIs and perform specific tasks. Developers will be able to connect multiple agents into workflows. For example, writing a Solidity smart contract might involve several specialized agents, including a protocol design agent, a Solidity development agent, a code security review agent, and a Solidity deployment agent.

Source: IOSG Ventures
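The Solidity workflow above can be sketched as a simple pipeline of stub agents; in a real agent network each stub would be an LLM-backed agent with its own tools.

```python
# Sketch of chaining specialized agents into one workflow. Each "agent" is
# a stub function; real agents would be LLMs with prompts and tools.
def design_agent(spec: str) -> str:
    return f"design({spec})"          # protocol design

def dev_agent(design: str) -> str:
    return f"contract({design})"      # Solidity development

def audit_agent(code: str) -> str:
    return f"audited({code})"         # code security review

def deploy_agent(code: str) -> str:
    return f"deployed({code})"        # deployment

def workflow(spec: str) -> str:
    """Pipe each agent's output into the next agent in the chain."""
    result = spec
    for agent in (design_agent, dev_agent, audit_agent, deploy_agent):
        result = agent(result)
    return result
```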

These agents cooperate through prompts and scenario orchestration.

Some examples of agent networks include Flock.ai, Myshell, and Theoriq.

Recommendation: Most agents today have relatively limited functionality. For specific use cases, Web2 agents still serve better and come with mature orchestration tools such as Langchain and LlamaIndex.

5. Differences Between Agent Networks and Inference Networks

Agent networks focus more on LLMs and provide tools such as Langchain to integrate multiple agents. In most cases, developers do not need to develop machine learning models themselves; the agent network has simplified model development and deployment, so they simply link together the necessary agents and tools. In most cases, end users interact with these agents directly.

Inference networks, by contrast, are the infrastructure that supports agent networks. They give developers lower-level access. Typically, end users do not use inference networks directly; developers deploy their own models, which are not limited to LLMs, and access them through on-chain or off-chain entry points.
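The division of labor can be sketched as two layers, with the agent network built on top of the inference network. All names here are illustrative, not a real SDK.

```python
# Sketch of the two access levels: an inference network exposes low-level
# model endpoints for developers, while an agent network wraps models and
# tools behind agents that end users call directly. Names are illustrative.
class InferenceNetwork:
    def __init__(self):
        self.models = {}
    def deploy(self, model_id: str, fn):
        self.models[model_id] = fn        # developer brings their own model
    def infer(self, model_id: str, x):
        return self.models[model_id](x)

class AgentNetwork:
    def __init__(self, infra: InferenceNetwork):
        self.infra = infra                # agents run on inference infra
    def ask(self, question: str) -> str:  # end-user-facing entry point
        sentiment = self.infra.infer("sentiment", question)
        return f"answer({question}, tone={sentiment})"

infra = InferenceNetwork()
infra.deploy("sentiment", lambda s: "positive" if "good" in s else "neutral")
agents = AgentNetwork(infra)
reply = agents.ask("is this a good protocol?")
```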

Agent networks and inference networks are not entirely independent products. We are already seeing vertically integrated offerings that provide both agent and inference capabilities, because the two functions rely on similar infrastructure.


6. New Opportunities

In addition to model inference, training, and agent networks, there are many new areas worth exploring in the Web3 domain:

  • Datasets: How to transform blockchain data into machine-learning-ready datasets? Machine learning developers need more specific and specialized data. For example, Giza provides high-quality datasets specifically for DeFi machine learning training. Ideal data should not only be simple tabular data but also include graph data that describes interactions in the blockchain world. Currently, we are lacking in this area. Some projects are addressing this issue by rewarding individuals to create new datasets while promising to protect the privacy of personal data, such as Bagel and Sahara.

  • Model Storage: Some models are large, so storing, distributing, and version-controlling them is crucial, as this affects the performance and cost of on-chain machine learning. Pioneering projects such as Filecoin, AR, and 0g have made progress in this area.

  • Model Training: Distributed and verifiable model training is a challenge. Projects like Gensyn, Bittensor, Flock, and Allora have made significant progress.

  • Monitoring: As model inference occurs both on-chain and off-chain, we need new infrastructure to help web3 developers track the usage of models, identify potential issues and biases in a timely manner. With suitable monitoring tools, web3 machine learning developers can make timely adjustments and continuously optimize model accuracy.

  • RAG Infrastructure: Distributed RAG requires a completely new infrastructure environment with high demands for storage, embedding computation, and vector databases, while ensuring the privacy and security of data. This is different from the current Web3 AI infrastructure, which mostly relies on third parties to complete RAG, such as Firstbatch and Bagel.

  • Customized Models for Web3: Not all models are suitable for the Web3 scenario. In most cases, models need to be retrained to adapt to specific applications such as price prediction and recommendations. With the flourishing development of AI infrastructure, we expect to see more web3-native models to serve AI applications in the future. For example, Pond is developing a blockchain GNN for various scenarios such as price prediction, recommendations, fraud detection, and anti-money laundering.

  • Evaluation Networks: Evaluating agents without human feedback is not easy. With the proliferation of agent-creation tools, the market will see countless agents. A system is needed to showcase their capabilities and help users determine which agent performs best in a given situation. Neuronets, for example, is a participant in this field.

  • Consensus Mechanisms: For AI tasks, PoS may not be the best choice. The main challenges facing PoS are computational complexity, verification difficulties, and lack of determinism. Bittensor has created a new intelligent consensus mechanism that rewards nodes in the network for contributing to machine learning models and outputs.
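As an illustration of the Datasets point above, the sketch below aggregates raw transfer events into per-address features, including a simple graph feature of the kind a GNN would consume. The event schema is made up for illustration.

```python
# Sketch of turning raw on-chain transfer events into a machine-learning-ready
# per-address feature table. The event schema is an illustrative assumption.
from collections import defaultdict

def build_features(transfers: list[dict]) -> dict[str, dict]:
    """Aggregate transaction count, volume, and peer count per sender."""
    feats = defaultdict(lambda: {"tx_count": 0, "volume": 0.0, "peers": set()})
    for t in transfers:
        f = feats[t["from"]]
        f["tx_count"] += 1
        f["volume"] += t["value"]
        f["peers"].add(t["to"])   # graph edge: the kind of signal a GNN uses
    return {a: {**f, "peers": len(f["peers"])} for a, f in feats.items()}

events = [
    {"from": "0xA", "to": "0xB", "value": 5.0},
    {"from": "0xA", "to": "0xC", "value": 2.0},
    {"from": "0xB", "to": "0xA", "value": 1.0},
]
table = build_features(events)
```

Real datasets for DeFi models would add labels (e.g. known fraud addresses) and richer graph structure; the point is that raw chain data needs this kind of transformation before it is usable for training.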

7. Future Outlook

We are currently observing a trend of vertical integration. By building a foundational computing layer, networks can provide support for a variety of machine learning tasks, including training, inference, and proxy network services. This model aims to provide web3 machine learning developers with a comprehensive one-stop solution.

Currently, on-chain inference, despite being costly and slow, offers excellent verifiability and seamless integration with backend systems such as smart contracts. I believe the future will move toward a hybrid approach: some inference will run on the front end or off-chain, while critical, decision-bearing inference will be completed on-chain. This pattern is already used on mobile devices, which exploit local hardware to run small models quickly and offload more complex tasks to the cloud, where larger LLMs handle them.
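The hybrid routing idea can be sketched as a simple dispatcher; the thresholds and labels are illustrative assumptions.

```python
# Sketch of hybrid inference routing: cheap, latency-sensitive inference runs
# locally or on the front end, while critical decisions go on-chain where
# they can be verified. Thresholds and labels are illustrative.
def route(task: dict) -> str:
    if task["needs_verification"]:
        return "on-chain"      # slow and costly, but verifiable
    if task["model_size_b"] <= 3:
        return "local"         # small model fits on the device / front end
    return "cloud"             # large LLM, run off-chain
```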
