a16z: How will blockchain and AI coexist in the future world?

Original video: Web3 with a16z, AI & Crypto

Compiled & Translated by: Qianwen, ChainCatcher

Neal Stephenson once wrote a science fiction novel called "The Diamond Age," which features an artificial intelligence device that serves as a mentor throughout a person's life. When you are born, you are paired with an artificial intelligence that knows you intimately, your preferences and more, and it follows you throughout your life, helping you make decisions and guiding you in the right direction. This sounds great, but you definitely don't want this kind of technology to fall into the hands of intermediary giants, because that would give one company enormous control and raise a series of privacy and sovereignty issues.

We hope that this technology can truly belong to everyone, and so a vision emerges: you can use blockchain to achieve this. You can embed artificial intelligence in smart contracts, and with the power of zero-knowledge proofs you can keep the data private. Over the next few decades this technology will become increasingly intelligent, and you will be able to use it however you want and change it in any way you wish.

So, what is the relationship between blockchain and artificial intelligence? Where will artificial intelligence lead us? What is the current state of artificial intelligence, and what challenges does it face? What role will blockchain play in this process?

AI and Blockchain: Balancing Each Other

Visions of artificial intelligence, including the scenario described in "The Diamond Age," have been around for a long time, but the technology has recently made a leap forward.

First, artificial intelligence is largely a top-down, centrally controlled technology. Crypto, on the other hand, is a bottom-up, decentralized collaboration technology. In many ways, crypto is the study of how to build decentralized systems that enable large-scale human cooperation without true central control. In this sense, the two technologies are natural complements.

Artificial intelligence is a sustaining innovation: it strengthens the business models of existing technology companies and helps them make top-down decisions. The best example is Google, which decides what content to present across billions of users and page views. Cryptocurrency, by contrast, is fundamentally a disruptive innovation whose business model is fundamentally at odds with that of large tech companies. It is therefore a movement led by insurgents at the edges, not by incumbents.

Artificial intelligence also interacts closely with privacy. As a technology, AI has created incentive structures that leave users with less and less privacy, because companies want all of our data, and models trained on more and more data tend to become more effective. At the same time, AI is not perfect: biased models can lead to unfair outcomes, which is why there is so much research on algorithmic fairness today.

I believe AI will move down a path where everyone's data is aggregated into these huge model-training pipelines to optimize the models. Cryptocurrency is moving in the opposite direction, increasing personal privacy and giving users sovereignty over their data. In that sense, cryptography is a counterweight to artificial intelligence: in a world flooded with AI-created content, it can help us distinguish human-created content from AI-created content, and it will become an important tool for maintaining and preserving human content.

Cryptocurrency is like the Wild West: there is no central authority, and anyone can participate, so you have to assume that some participants are malicious. This creates a greater need for tools that help you distinguish honest participants from dishonest ones, and machine learning and artificial intelligence are actually very useful here.

For example, some projects use machine learning to identify suspicious transactions submitted to wallets; such transactions are flagged before being signed and submitted to the blockchain. This can effectively prevent users from accidentally sending all their funds to an attacker, or doing something they will later regret. Machine learning can also help predict in advance which transactions are likely to be subject to MEV.
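
As a rough illustration of what such wallet-side screening might look like, here is a minimal sketch. The features, weights, threshold, and denylist are all invented for illustration; real products use trained models and far richer signals.

```python
# A minimal sketch of wallet-side transaction screening. The features,
# weights, and denylist below are invented for illustration; real products
# use trained models and much richer on-chain signals.
KNOWN_DRAINERS = {"0xAttackerAddress"}  # hypothetical denylist

def risk_score(tx: dict) -> float:
    """Score a transaction before the user signs it (higher = riskier)."""
    score = 0.0
    if tx["to"] in KNOWN_DRAINERS:
        score += 1.0
    if tx.get("is_token_approval") and tx.get("approval_amount") == 2**256 - 1:
        score += 0.6  # unlimited token approval is a common drainer pattern
    if tx.get("contract_age_days", 0) < 7:
        score += 0.3  # freshly deployed contracts are higher risk
    return score

tx = {"to": "0xAttackerAddress", "is_token_approval": True,
      "approval_amount": 2**256 - 1, "contract_age_days": 1}
if risk_score(tx) > 0.5:
    print("Warning: this transaction looks suspicious; review before signing.")
```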

Just as LLMs can be used to detect fake data or malicious activity, they can also be used to generate fake data. The most typical example is deepfakes: you can create a video of someone saying something they never said. But blockchain can actually help mitigate this problem.

For example, a blockchain timestamp can show that on a certain date you said certain things. If someone fakes a video, you can use the timestamp to refute it: the truly authentic data is recorded on the blockchain and can be used to prove that the deepfake video is fake. So I believe blockchain may help combat forgery.
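
A toy sketch of the idea, with a plain dictionary standing in for the blockchain: publish the hash of the content when it is created, and later check a disputed clip against the attested hashes.

```python
# A toy sketch of using an on-chain timestamp to rebut a deepfake: publish
# the hash of a statement (or video) when it is made, and later show that a
# forged clip does not match any attested hash. The "chain" here is a dict
# standing in for a real blockchain.
import hashlib
import time

chain: dict[str, float] = {}  # content hash -> timestamp (simulated)

def attest(content: bytes) -> str:
    digest = hashlib.sha256(content).hexdigest()
    chain[digest] = time.time()  # a real system records this in a transaction
    return digest

def verify(content: bytes) -> bool:
    return hashlib.sha256(content).hexdigest() in chain

attest(b"My actual statement, recorded when it was made.")
print(verify(b"My actual statement, recorded when it was made."))  # True
print(verify(b"Fabricated statement from a deepfake video."))      # False
```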

We can also rely on trusted hardware to achieve this. Cameras and devices such as phones can sign the images and videos they capture. There is a standard for this called C2PA, which specifies how cameras sign data. In fact, some Sony cameras can already capture photos and videos and generate C2PA signatures on them. This is a very complex topic, and we won't go into it here.

In general, newspapers do not publish photos exactly as the camera captured them; they crop and process them in some authorized way. But once the picture has been edited, the end reader viewing it in a browser is no longer seeing the original, and the C2PA signature can no longer be verified.

The question is: how can we confirm that the image a user sees really derives from an image correctly signed by a C2PA camera? This is where ZK technology comes in. You can prove that the edited image is the result of downscaling and grayscale conversion applied to the correctly signed original, so readers can still confirm that they are seeing a real image. ZK technology can therefore be used to counter this kind of misinformation.
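
The proof machinery itself is far beyond a snippet, but the statement such a proof attests to can be sketched. Here the allowed editorial transformation is assumed to be grayscale conversion plus 2x downsampling; the ZK circuit would check the two conditions shown in the comments without revealing the original image.

```python
# A sketch of the *statement* such a proof attests to; the zero-knowledge
# machinery itself is elided. The prover knows the camera-signed original and
# shows, without revealing it, that the published image equals a declared
# transformation (grayscale + 2x downsample) of that original.
import hashlib

def grayscale_downsample(img):
    """Grayscale, then average 2x2 blocks (the allowed editorial transform)."""
    gray = [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in img]
    return [[(gray[y][x] + gray[y][x+1] + gray[y+1][x] + gray[y+1][x+1]) // 4
             for x in range(0, len(gray[0]), 2)]
            for y in range(0, len(gray), 2)]

def digest(img) -> str:
    return hashlib.sha256(repr(img).encode()).hexdigest()

# Public inputs: the camera-signed hash of the original, and the edited image.
# Private witness: the original image itself. The ZK circuit would check:
#   sha256(original) == signed_hash  AND  edited == grayscale_downsample(original)
original = [[(10, 20, 30), (40, 50, 60)], [(70, 80, 90), (20, 40, 60)]]
signed_hash = digest(original)           # produced and signed by the camera
edited = grayscale_downsample(original)  # what the newspaper publishes
assert digest(original) == signed_hash and edited == grayscale_downsample(original)
```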

How Does Blockchain Break the Status Quo?

Artificial intelligence is essentially a centralized technology. It benefits greatly from economies of scale, since running in a single data center makes everything more efficient. In addition, data, machine learning models, and machine learning talent are usually concentrated in a few tech companies.

So how can we break this pattern? Cryptocurrency can help decentralize artificial intelligence through technologies like ZKML, applied to compute, data, and the machine learning models themselves. For example, on the computation side, zero-knowledge proofs let users prove that an inference or model-training run was performed correctly.

In this way, you can outsource the process to a large community: anyone with a GPU can contribute computing power to the network and help train models, without relying on a single large data center that concentrates all the GPUs.
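
A sketch of the resulting workflow follows, with zk_prove and zk_verify as hypothetical stand-ins for a ZKML proving system; note the toy verifier below re-runs the model, which a real ZK verifier would not need to do.

```python
# A sketch of outsourced, verifiable inference. zk_prove and zk_verify are
# hypothetical stand-ins for a ZKML system (real proving systems work over
# circuit representations of the model, not Python functions).

def model(x: float) -> float:
    return 2.0 * x + 1.0  # toy public model

def zk_prove(f, x):
    """Hypothetical: an untrusted worker runs f(x) and returns (output, proof)."""
    y = f(x)
    return y, ("proof-that", f.__name__, x, y)  # placeholder proof object

def zk_verify(f, x, y, proof) -> bool:
    # Toy check only: a real ZK verifier validates the proof in time far
    # smaller than re-running the model, which is the point of outsourcing.
    return proof == ("proof-that", f.__name__, x, y) and f(x) == y

y, proof = zk_prove(model, 3.0)         # done by an untrusted GPU owner
print(zk_verify(model, 3.0, y, proof))  # checked by the client
```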

From an economic perspective, it is not yet certain whether this makes sense. But at least, with the right incentives, you can capture the long tail: you can tap all the GPU capacity out there and let all these people contribute computing power to model training or inference, instead of leaving everything to the large tech companies that control it today. Various hard technical problems must be solved to get there. In fact, a company called Gensyn is building a decentralized GPU computing market aimed primarily at training machine learning models. In this market, anyone can contribute their GPU computing power, and anyone can use the network's compute to train their own large machine learning models. This would be an alternative to centralized big tech companies like OpenAI, Google, and Meta.

Imagine this scenario: Alice has a model she wants to protect. She sends the model to Bob in encrypted form. Bob now has the encrypted model and needs to run his own data through it. How can this be done? This is where so-called fully homomorphic encryption comes in, which allows computation on encrypted data. Given an encrypted model and plaintext data, Bob can run the encrypted model on the plaintext data and obtain an encrypted result. He sends the encrypted result back to Alice, who can decrypt it and see the plaintext result.
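
True FHE is beyond a short example, but the flavor of computing on encrypted values can be shown with the Paillier cryptosystem, which is additively homomorphic rather than fully homomorphic: an encrypted linear model can be scored against plaintext features without ever decrypting the weights. The tiny key size below is for demonstration only.

```python
# A minimal sketch of the idea above using Paillier (additively homomorphic,
# not fully homomorphic). Alice encrypts her model weights; Bob scores his
# plaintext data against the encrypted weights; Alice decrypts the result.
import math
import random

def keygen(p=2_147_483_647, q=2_147_483_629):  # demo primes; real keys ~2048-bit
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu, n)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(1, n)
    return pow(g, m, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(sk, c):
    lam, mu, n = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

pk, sk = keygen()

# Alice: encrypt the weights of a toy linear model.
weights = [3, 1, 4]
enc_weights = [encrypt(pk, w) for w in weights]

# Bob: score his plaintext features against the encrypted weights.
# Enc(w)^x = Enc(w * x), and multiplying ciphertexts adds plaintexts, so the
# running product below is Enc(sum(w_i * x_i)). Bob never sees the weights.
features = [2, 5, 7]
n2 = pk[0] ** 2
enc_score = 1
for c, x in zip(enc_weights, features):
    enc_score = enc_score * pow(c, x, n2) % n2

# Alice: decrypt the score that Bob sends back.
print(decrypt(sk, enc_score))  # 3*2 + 1*5 + 4*7 = 39
```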

This technology already exists. The question is whether the current technology is efficient enough for medium-sized models, and whether it can scale to larger ones. This is a fairly large challenge that still requires a lot of engineering effort.

Current Status, Challenges, and Incentive Mechanisms

I believe that to decentralize computation, the first issue is verification. You can use ZK to solve it, but current techniques can only handle relatively small models. The challenge is that the performance of these cryptographic primitives is still far from sufficient for the training or inference of very large models, so a lot of work is going into improving proving performance so that ever-larger workloads can be proven efficiently.

At the same time, some companies are using other techniques, not just cryptography: they rely on game-theoretic properties that let independent parties check each other's work. This is an "optimistic" approach that does not rely on cryptography, but it still serves the larger goal of decentralizing artificial intelligence and building an AI ecosystem that is an alternative to companies like OpenAI.

The second major issue is distributed systems. For example, how do you coordinate a large community contributing GPUs to a network so that it feels like one integrated, unified computing substrate? There are many challenges here, such as how to sensibly decompose a machine learning workload, how to allocate the pieces to different nodes in the network, and how to do all of this efficiently.
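
As a minimal sketch of one such decomposition, data parallelism, here each simulated node computes a gradient on its own shard of data and a coordinator averages them; real systems must also cope with stragglers, failures, and untrusted nodes.

```python
# A toy sketch of data-parallel training decomposition. Each node computes
# the gradient of a squared-error loss for a linear model y = w * x on its
# own shard; a coordinator averages the gradients and updates w.

def shard_gradient(w: float, shard: list) -> float:
    """Mean gradient of (w*x - y)^2 with respect to w over one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

shards = [  # each list simulates the data held by one GPU node
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
]
w = 0.0
for step in range(200):
    grads = [shard_gradient(w, s) for s in shards]  # computed in parallel
    w -= 0.01 * sum(grads) / len(grads)             # coordinator averages
print(round(w, 3))  # approaches 2.0, the true slope of the data
```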

Current techniques basically apply to medium-sized models, not to large models like GPT-3 or GPT-4. Of course, there are other methods. For example, we can have multiple people run the same training task and compare results, creating a game-theoretic incentive not to cheat: if someone cheats, others can challenge the result as incorrect, and the cheater receives no reward.
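
A toy sketch of that incentive scheme, assuming the computation is deterministic so honest results match exactly; the stakes and rewards are invented numbers.

```python
# A toy sketch of redundant compute with slashing: several workers run the
# same task, the majority answer is accepted, and dissenters forfeit stake.
# Assumes deterministic computation so honest results hash identically.
from collections import Counter

def settle(results: dict, stake: float, reward: float):
    """results maps worker -> hash of its claimed training result."""
    majority, _ = Counter(results.values()).most_common(1)[0]
    payouts = {}
    for worker, claimed in results.items():
        if claimed == majority:
            payouts[worker] = reward   # honest work is rewarded
        else:
            payouts[worker] = -stake   # mismatch: stake is slashed
    return majority, payouts

results = {"alice": "0xabc", "bob": "0xabc", "carol": "0xdef"}  # carol cheated
print(settle(results, stake=10.0, reward=1.0))
# ('0xabc', {'alice': 1.0, 'bob': 1.0, 'carol': -10.0})
```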

We can also decentralize the sources of data used to train large machine learning models. Instead of relying on a centralized institution, the community can collect the data and train the models itself. This can be achieved by creating a market, similar to the computing market described earlier.

We can also take an incentive-driven approach, encouraging people to contribute new data to a large training dataset. The difficulty here is similar to the verification challenge: you must somehow verify that the contributed data is actually good data, meaning it is not duplicated, not randomly generated junk, and not synthetic data passed off as real.
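
As a sketch, two of the cheapest checks such a market could run are exact-duplicate rejection and a crude entropy filter for junk; real systems need far stronger checks (near-duplicate detection, provenance, model-based quality scoring).

```python
# A toy sketch of cheap data-market admission checks: reject exact duplicates
# via hashing, and reject low-entropy junk. Thresholds are illustrative.
import hashlib
import math
from collections import Counter

seen_hashes: set[str] = set()

def entropy(text: str) -> float:
    """Shannon entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def accept(sample: str) -> bool:
    h = hashlib.sha256(sample.encode()).hexdigest()
    if h in seen_hashes:
        return False            # exact duplicate of a prior contribution
    if entropy(sample) < 2.0:   # e.g. "aaaa..." is likely junk
        return False
    seen_hashes.add(h)
    return True

print(accept("The quick brown fox jumps over the lazy dog."))  # True
print(accept("The quick brown fox jumps over the lazy dog."))  # False (dup)
print(accept("aaaaaaaaaaaaaaaa"))                               # False (junk)
```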

Furthermore, it must be ensured that the data does not subvert the model in some way, or the model's performance will actually deteriorate. We will probably have to rely on a combination of technical and social solutions. For example, community members could build up a reputation score as they contribute data, making their future contributions more credible.

One of the major challenges of machine learning is that a model can only cover the distribution of its training dataset. If an input falls far outside that distribution, the model may behave completely unpredictably. For a model to perform well on edge cases, black-swan data points, and the inputs it will actually encounter in the real world, we need as comprehensive a dataset as possible; otherwise it will take a very long time to truly cover the data distribution.

Therefore, an open, decentralized market for datasets, where anyone in the world with unique data can supply it to the network, is a better way. A centralized company has no way of knowing who holds this data; if you can create incentives for these people to come forward voluntarily, I believe you can achieve significantly better coverage of long-tail data.

So we must have mechanisms to ensure that contributed data is authentic. One way is to rely on trusted hardware: the sensors themselves embed trusted hardware, and we only trust data correctly signed by that hardware. Otherwise, we need other mechanisms to distinguish authentic data from fake.
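
A minimal sketch of "only trust data the sensor's hardware key signed," using Ed25519 from the pyca/cryptography library; in a real deployment the private key would live in tamper-resistant hardware inside the camera and the public key would be registered with the network.

```python
# A minimal sketch of accepting only hardware-signed sensor data, using
# Ed25519 signatures from the pyca/cryptography library.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

camera_key = Ed25519PrivateKey.generate()  # would live inside the sensor
camera_pub = camera_key.public_key()       # known to the data market

photo = b"raw image bytes captured by the sensor"
signature = camera_key.sign(photo)         # produced at capture time

def accept_contribution(data: bytes, sig: bytes) -> bool:
    try:
        camera_pub.verify(sig, data)  # raises if data or sig was tampered with
        return True
    except InvalidSignature:
        return False

print(accept_contribution(photo, signature))              # True
print(accept_contribution(b"tampered bytes", signature))  # False
```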

There are currently two important trends in machine learning. First, methods for measuring the performance of machine learning models keep improving, though they are still early, and it remains genuinely difficult to judge how well a given model performs. Second, we are getting better and better at explaining how models work.

Based on these two trends, at some point we may be able to understand the impact of a dataset on a model's performance. If we can tell whether a third party's contributed dataset actually improves the model, then we can reward that contribution, creating the incentives for such a market to exist.
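
A toy sketch of impact-based rewards: score a contribution by the change in validation accuracy when it is added to the training set. The 1-nearest-neighbor "model" and the data are invented for illustration; fairer attribution (for example, Shapley-style methods) is an active research area.

```python
# A toy sketch of rewarding data by measured impact on validation accuracy.

def knn_predict(train: list, x: float) -> int:
    """1-nearest-neighbor classifier: label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, val) -> float:
    return sum(knn_predict(train, x) == y for x, y in val) / len(val)

base_train = [(0.0, 0), (1.0, 1)]
val = [(0.1, 0), (0.9, 1), (5.0, 2), (6.0, 2)]  # includes long-tail points
contribution = [(5.5, 2)]                        # covers the long tail

reward = accuracy(base_train + contribution, val) - accuracy(base_train, val)
print(reward)  # 0.5: the contribution fixed the two long-tail mistakes
```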

Imagine an open market where people contribute trained models to solve specific types of problems. A smart contract could be created with embedded tests, and if someone can supply a model and prove with ZKML that it passes the tests, they get paid. You would then have the tools needed to create a market in which people are rewarded for contributing machine learning models that solve particular problems.

How Do AI and Cryptocurrency Form a Business Model?

I believe the vision behind the intersection of cryptocurrency and artificial intelligence is that you can create a protocol that distributes the value created by AI to many more people, where everyone can contribute to and share in the benefits of this new technology.

Those who profit are the ones who contribute computing power, data, or new machine learning models to the network, so that better models can be trained to solve more important problems.

The demand side of the network can also profit. They use the network as infrastructure to train their own machine learning models, which might power something interesting, such as the next generation of chat tools. Because these companies have their own business models, they can capture value themselves.

The people who build the network also profit, for example by creating a token for the network and distributing it to the community. All these people collectively own this decentralized network for compute, data, and models, and they capture some of the value from all the economic activity that flows through it.

You can imagine that every transaction conducted through the network, every payment for compute, data, or models, carries a small fee that goes into a treasury controlled by the network. Token holders collectively own the network, and this is essentially the network's own business model.
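
A toy sketch of that fee flow; the 2.5% rate and the ledger below are invented for illustration.

```python
# A toy sketch of protocol fees accruing to a community-owned treasury.
PROTOCOL_FEE = 0.025  # 2.5%; in a real network this would be set by governance

treasury = 0.0
balances = {"gpu_provider": 0.0, "data_provider": 0.0}

def pay(provider: str, amount: float) -> None:
    """Route a purchase: a cut to the treasury, the rest to the supplier."""
    global treasury
    fee = amount * PROTOCOL_FEE
    treasury += fee                      # accrues to token holders
    balances[provider] += amount - fee   # the rest goes to the supplier

pay("gpu_provider", 100.0)   # a customer buys training compute
pay("data_provider", 40.0)   # a customer buys a dataset
print(treasury, balances)    # 3.5 {'gpu_provider': 97.5, 'data_provider': 39.0}
```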

How AI Promotes Code Security

Many listeners may have heard of Copilot, a tool for generating code. You can try using these code-generation tools to write Solidity contracts or cryptographic code, but I want to emphasize that doing so is actually very dangerous: much of the time, these systems generate code that runs but is not secure.

In fact, we recently wrote a paper on this issue. It points out that if you ask a copilot to write a simple encryption function, the cipher it chooses is correct, but it uses the wrong mode of operation, so you end up with an insecure encryption scheme.
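
The paper's exact example is not reproduced here, but the general failure mode can be sketched: a cipher used in ECB mode "works" yet leaks plaintext patterns, while an authenticated mode like AES-GCM does not. This uses the pyca/cryptography library.

```python
# A sketch of the "works but insecure" mistake described above. ECB mode
# leaks patterns: identical plaintext blocks encrypt to identical ciphertext
# blocks. AES-GCM is an authenticated mode that does not.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = os.urandom(32)
plaintext = b"ATTACK AT DAWN!!" * 2  # two identical 16-byte blocks

# Insecure: AES-ECB. The code "runs", but the two ciphertext blocks are equal.
encryptor = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ecb_ct = encryptor.update(plaintext) + encryptor.finalize()
print(ecb_ct[:16] == ecb_ct[16:32])  # True: the pattern leaked

# Better: AES-GCM with a fresh random nonce gives confidentiality and
# integrity; identical blocks no longer produce identical ciphertext.
nonce = os.urandom(12)
gcm_ct = AESGCM(key).encrypt(nonce, plaintext, None)
print(gcm_ct[:16] == gcm_ct[16:32])  # False
```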

Why does this happen? One reason is that these models are trained on existing code, largely from GitHub repositories, and many GitHub repositories are vulnerable to various attacks. So the code these models learn works, but is not secure: garbage in, garbage out. I hope people are very careful when using these generative models to write code, and check carefully that the code really does what it should, and does it securely.

You can combine AI models with other tools to generate code and keep the whole process honest. For example, one idea is to have an LLM generate a specification for a formal verification tool, then ask the same LLM to generate a program that complies with that specification, and then run the formal verification tool to check whether the program really complies. If there are bugs, the tool will catch them; the errors can be fed back to the LLM, and ideally the LLM revises its work and produces a corrected version of the code.
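
A sketch of that feedback loop follows; llm_generate and formally_verify are hypothetical stand-ins for an LLM API and a formal verification tool, and only the control flow is the point.

```python
# A sketch of the LLM + formal-verification loop described above. Both
# helpers are hypothetical stubs; plug in a real LLM client and verifier.

def llm_generate(prompt: str) -> str:
    raise NotImplementedError("stand-in for a real LLM call")

def formally_verify(spec: str, program: str) -> list:
    raise NotImplementedError("stand-in for a real verifier; returns errors")

def generate_verified_code(task: str, max_rounds: int = 5) -> str:
    spec = llm_generate(f"Write a formal specification for: {task}")
    program = llm_generate(f"Write a program satisfying this spec:\n{spec}")
    for _ in range(max_rounds):
        errors = formally_verify(spec, program)
        if not errors:
            return program  # verified against the spec
        # Feed the verifier's counterexamples back to the model and retry.
        program = llm_generate(
            f"Fix this program so it satisfies the spec.\n"
            f"Spec:\n{spec}\nProgram:\n{program}\nErrors:\n{errors}")
    raise RuntimeError("could not produce a verified program")
```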

If you repeat this process, you will eventually get a piece of code that, ideally, fully satisfies the specification and is formally verified to do so. And because humans can read the trace of this process, you can see that this is the program you wanted to write. In fact, many people are trying to evaluate the ability of LLMs to find software vulnerabilities, for example in smart contracts and in C and C++ code.

So, will we reach a point where LLM-generated code is less likely to contain bugs than human-written code, just as, with autonomous driving, we ask whether it is less likely to crash than a human driver? I believe this trend will only get stronger, and AI will be integrated ever more deeply into existing toolchains.

You can integrate it into formal verification toolchains, into tools like those mentioned earlier for checking memory-management issues, and into unit-testing and integration-testing toolchains, so that the LLM is not operating in a vacuum: it receives real-time feedback from other tools, grounding it in reality.

I believe that very large machine learning models trained on all the data in the world, combined with these other tools, may come to write computer programs better than human programmers. Even though they will still make mistakes, they may be superhuman. That will be an important moment in software engineering.

AI and Social Graphs

Another possibility is that we may be able to build a decentralized social network that behaves much like Twitter, but whose social graph lives entirely on-chain. The graph becomes almost a public good that anyone can build on. As a user, you control your identity on the social graph: your data, who you follow, and who can follow you. On top of the graph, many companies can build portals offering experiences like Twitter, Instagram, TikTok, or anything else they want to build.

But all of this is built on the same social graph, which no one owns and no billion-dollar tech company fully controls.

This is an exciting world, because it can be more dynamic, with an ecosystem people build together, and every user has more control over what they see and do on the platform.

But users also need to separate signal from noise. For example, good recommendation algorithms are needed to filter all this content and show you the feed you actually want to see. This opens the door to a competitive market of participants providing such services: you can use AI-based algorithms to curate content for you, and as a user you decide which algorithm to use, perhaps one built by Twitter, perhaps another. At the same time, in a world where generative models can produce all the junk information in the world, you also need machine learning tools to help you filter out the noise and parse all that junk.
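
A toy sketch of what user-selectable ranking over a shared graph might look like; the graph, posts, and scoring rules below are invented for illustration.

```python
# A toy sketch of swappable feed algorithms over a shared social graph. The
# point is that ranking is a component the user picks, not the platform.
follows = {"you": {"alice", "bob"}}   # the shared, on-chain social graph
posts = [
    {"author": "alice", "age_hours": 1.0, "likes": 3},
    {"author": "bob", "age_hours": 8.0, "likes": 50},
    {"author": "spammer", "age_hours": 0.1, "likes": 0},  # filtered out below
]

def chronological(user, posts):
    feed = [p for p in posts if p["author"] in follows[user]]
    return sorted(feed, key=lambda p: p["age_hours"])

def engagement(user, posts):
    feed = [p for p in posts if p["author"] in follows[user]]
    return sorted(feed, key=lambda p: -p["likes"] / (1 + p["age_hours"]))

ranker = engagement  # the user chooses the algorithm
print([p["author"] for p in ranker("you", posts)])  # ['bob', 'alice']
```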

Why Is Proof of Personhood Important?

A closely related question is: in a world where AI-generated content is rampant, how do you prove that you are actually human?

Biometrics is one possible direction. A project called Worldcoin uses iris scans as the biometric to verify that you are a real person, ensuring that you are a living human and not just a photo of an eye. The system uses secure hardware that is difficult to tamper with, so the proof on the other end, a zero-knowledge proof that conceals your actual biometric data, is hard to forge.

On the internet, no one knows whether you are a bot. This is where proof-of-personhood projects become very important, because knowing whether you are interacting with a bot or a human matters a great deal. Without proof of personhood, you cannot tell whether an address belongs to one person or a group, or whether ten thousand addresses belong to ten thousand different people or to one person pretending to be ten thousand.

This is crucial in governance. If every participant in a governance system can prove that they are human, and uniquely so, because they only have one pair of eyes, then the governance system can be fairer and less plutocratic (less driven by whoever locks the largest amount in a smart contract).
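
A toy sketch of the one-person-one-vote accounting, with a plain hash simulating the nullifier that a real system would derive inside a zero-knowledge proof (so that the vote reveals "a verified human" without revealing which one).

```python
# A toy sketch of nullifier-style one-person-one-vote. The biometric check
# and the ZK proof are simulated with a plain hash; only the accounting
# logic is shown.
import hashlib

def nullifier(person_secret: str, election_id: str) -> str:
    # Deterministic per (person, election): the same human cannot vote twice,
    # but votes in different elections cannot be linked to each other.
    return hashlib.sha256(f"{person_secret}|{election_id}".encode()).hexdigest()

used: set[str] = set()
tally = {"yes": 0, "no": 0}

def vote(person_secret: str, election_id: str, choice: str) -> bool:
    n = nullifier(person_secret, election_id)
    if n in used:
        return False  # a second vote from the same human is rejected
    used.add(n)
    tally[choice] += 1
    return True

print(vote("alice-iris-secret", "prop-1", "yes"))  # True
print(vote("alice-iris-secret", "prop-1", "yes"))  # False: duplicate human
print(tally)  # {'yes': 1, 'no': 0}
```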

AI and Art

AI models mean we will live in a world of infinitely abundant media, where the communities and narratives that form around any particular piece of media will become increasingly important.

For example, Sound.xyz is building a decentralized music streaming platform that lets artists and musicians upload music and connect directly with their communities by selling them NFTs. You can comment on tracks on the Sound.xyz site so that others playing the song can see the comments, similar to the old SoundCloud feature. Buying NFTs also supports the artists, helping them sustain themselves and create more music. The wonderful thing is that it gives artists a platform to truly interact with their community, so the artists, in a sense, belong to everyone.

Because of the role cryptocurrency plays here, you can create a community around a piece of music, a community that would not exist if the music were created purely by a machine learning model, with no human element.

Much of the music we encounter will be entirely generated by artificial intelligence, so tools for building communities and telling stories around art, music, and other media will be very important in distinguishing the media we truly care about, want to invest in, and spend time with from everything else.

There may be synergies between the two. Much music will be enhanced or generated by AI, but with human elements still involved: a creator uses AI tools to make a new piece of music, while keeping their own voice, their own artist page, their own community, and their own followers.

There is a synergy between these two worlds: you get the best music, because artificial intelligence gives everyone superpowers, but you also keep the human element and the story, coordinated and realized through crypto, which gathers all these people on one platform.

In terms of content generation, this is definitely a brand new world. So how do we distinguish human-generated art that needs support from machine-generated art?

It also opens the door to collective art: art generated by the creative process of an entire community rather than an individual artist. Some projects already do this: the community votes on-chain to shape the prompts given to a machine learning model, which then generates the art. Perhaps you generate not one piece but ten thousand, and then use another machine learning model, itself trained on community feedback, to pick the best of those 10,000 works.
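
A sketch of that generate-then-curate pipeline; both model calls are hypothetical stand-ins, generate_image for a community-prompted generative model and score for a curation model trained on community feedback.

```python
# A sketch of community-prompted generation followed by model-based curation.
# Both model calls are stubs standing in for real generative and scoring models.
import random

def generate_image(prompt: str, seed: int) -> str:
    return f"<image:{prompt}:{seed}>"  # stand-in for a generative model call

def score(image: str) -> float:
    return random.random()  # stand-in for a feedback-trained curation model

prompt = "a city the community voted for"         # chosen by on-chain vote
candidates = [generate_image(prompt, s) for s in range(10_000)]
best = max(candidates, key=score)                 # the second model curates
print(best)
```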
