Deciphering Cysic: The Eve of the Rise of Hardware Acceleration and ZK Mining

10 months ago

Can Cysic help make the vision of large-scale ZK adoption a reality?

Authors: Nickqiao & 雾月, Geek web3

In April of this year, Vitalik attended the Hong Kong Blockchain Summit and delivered a speech titled "Reaching the Limits of Protocol Design," in which he once again highlighted the potential of ZK-SNARKs in the Ethereum Danksharding roadmap and looked forward to ASIC chips greatly accelerating ZK.

Previously, Zhang Ye, co-founder of Scroll, also pointed out that the application space of ZK in traditional fields may be even larger than in Web3, with huge demand for ZK in areas such as trusted computing, databases, verifiable hardware, content anti-counterfeiting, and zkML. If ZK proofs can be generated in real time, both Web3 and traditional industries are expected to undergo a paradigm-level change. However, in terms of efficiency and economic cost, ZK is still far from large-scale adoption.

In fact, as early as 2022, top venture capital firms a16z and Paradigm published reports stressing the importance of ZK hardware acceleration. Paradigm even asserted that the future income of ZK miners may rival that of Bitcoin or Ethereum miners, and that hardware acceleration solutions based on GPU, FPGA, and ASIC will have huge market space. Since then, as mainstream ZK Rollups such as Scroll and Starknet gained popularity, hardware acceleration has become a hot concept in the market, and the enthusiasm has grown further with the upcoming launch of projects like Cysic.

We have reason to believe that, given the huge demand for ZK, ZK mining pools and real-time ZKP generation offered as a SaaS service can open up a new industrial chain. In this promising new territory, hardware manufacturers with strong backing and a first-mover advantage in ZK hardware acceleration are fully capable of becoming the next "Bitmain" and dominating the hardware acceleration market.

In the field of hardware acceleration, Cysic may be one of the most anticipated contenders. The team has won important awards at ZPrize, the well-known ZKP technology competition, and has served as a ZPrize mentor since 2023. Its roadmap includes B2B ZK mining pools and consumer-facing ZK DePIN hardware, and it has attracted top VCs such as Polychain, ABCDE, OKX Ventures, and Hashkey, raising nearly $20 million in total.

With the Cysic testnet set to launch at the end of July and the opening of its ZK mining pool imminent, discussions about Cysic in various communities are becoming increasingly heated. This article explains Cysic's product principles and business model, gives a simple introduction to how ZK hardware acceleration works, and briefly summarizes the relevant background to lower the barrier to understanding Cysic.

Understanding the ZK Proof System from the Workflow

The ZK proof system is actually very complex, but if you want to have a simple understanding of its general structure, you can break it down from the perspective of functions and workflow. For a system that ZK-izes ordinary computations, the core process can be summarized as follows:

First, we need to interact with the ZK system through the front end to submit the content to be proved. The front end will format this content to make it easier for the ZK proof system to process. Then, the system will generate a ZK Proof through specific proof systems or frameworks (such as Halo2, Plonk, etc.). This process can be further divided into the following steps:

1. Problem setting: First, we need to determine what content needs to be proved. For example, the prover declares that they know/have a certain piece of data, "I know a solution N to the equation F(x)=w," but they do not want others to see the value of N.

2. Arithmetization and CSP: After the prover submits the content to be proved, the system builds a mathematical model that expresses the content equivalently as a constraint satisfaction problem (CSP), and formats it for the proof system to process. Specifically, the aforementioned declaration "I know a solution N to the equation F(x)=w" is transformed from the original equation into logic gate circuits and polynomials.

3. Proof generation and verification: The system then selects a suitable proof system, such as Halo2 or Plonk, and compiles the content generated in the previous steps into a usable ZKP program. The prover uses this ZKP program to generate a proof, which is then verified by the verifier.
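To make the arithmetization step more concrete, the following is a minimal sketch rather than the procedure of any particular proof system. It assumes, purely for illustration, that F(x) = x^3 + x + 5 and w = 35 (so N = 3), and flattens the statement into multiplication gates in the rank-1 constraint style. The modulus, the witness layout, and the function names are all hypothetical choices. The check shown here is only the constraint system that a backend would later prove; on its own it is neither succinct nor zero-knowledge.

```python
# Toy arithmetization of "I know N such that F(N) = w".
# Assumed example (not from the article): F(x) = x**3 + x + 5 and w = 35, so N = 3.
P = 2**31 - 1  # small prime modulus, chosen only for illustration

# The statement is flattened into one multiplication per gate:
#   sym1 = x * x ; y = sym1 * x ; sym2 = (y + x) * 1 ; out = (sym2 + 5) * 1
# Witness layout: [1, x, out, sym1, y, sym2]
A = [[0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0], [5, 0, 0, 0, 0, 1]]
B = [[0, 1, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0]]
C = [[0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 0]]


def dot(row, witness):
    """Inner product of one constraint row with the witness, reduced mod P."""
    return sum(r * w for r, w in zip(row, witness)) % P


def build_witness(x):
    """Prover side: evaluate every gate to fill in the intermediate values."""
    sym1 = x * x % P
    y = sym1 * x % P
    sym2 = (y + x) % P
    out = (sym2 + 5) % P
    return [1, x, out, sym1, y, sym2]


def satisfies(witness, public_out):
    """Check (A.w) * (B.w) == (C.w) for every gate, and that out matches w."""
    if witness[0] != 1 or witness[2] != public_out % P:
        return False
    return all(
        dot(a, witness) * dot(b, witness) % P == dot(c, witness)
        for a, b, c in zip(A, B, C)
    )


witness = build_witness(3)
print(satisfies(witness, 35))
```

A real proof system would go on to encode these gate constraints as polynomials and commit to them, which is where the heavy computation discussed later comes from.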

Systems like zkEVM, which are frequently adopted in Ethereum Layer 2, essentially compile smart contracts into EVM opcodes, express each opcode as logic gate circuits/polynomial constraints, and then hand them to the backend ZK proof system for processing.

It is worth mentioning that the widely used ZKP technology solutions in blockchain currently mainly involve zk-SNARK (Zero-Knowledge Succinct Non-interactive Argument of Knowledge), and most ZK Rollups utilize the succinctness of SNARK rather than its zero-knowledge property. Succinctness means that ZKP occupies a very small space and can compress a large amount of content into a few hundred bytes, resulting in very low verification costs.

As a result, the workload between the prover and verifier is asymmetric. The cost of generating ZKP is very high for the prover, while the verification cost is very low for the verifier. By leveraging this asymmetry, using ZK in the "single prover, multiple verifiers" scenario can concentrate the overall cost on the prover side, greatly reducing the cost for the verifier. This model is extremely advantageous for decentralized verification, which is the approach taken by Ethereum Layer 2.

However, this model of shifting verification costs to the ZK generation side is not a silver bullet. For ZK Rollup projects, the high cost of generating ZKP will ultimately be transferred back to user experience and transaction fees, which is not conducive to the long-term development of ZK Rollup.

Despite the significant potential of ZK in scenarios of trustless and decentralized verification, due to the bottleneck in generation time, neither zkEVM, zkVM, ZK Rollup, nor ZK bridges currently have the economic foundation for large-scale adoption.

In response, ZK acceleration projects represented by Cysic, Ingonyama, and Irreducible have emerged, each attempting to reduce the generation cost of ZKP from different directions. In the following text, we will briefly introduce the main expenses and acceleration methods of ZKP generation from a technical perspective, and explain why Cysic has huge potential in the ZK acceleration race.

Computational expenses: MSM and NTT

Many people are aware that the time and computational cost for the Prover to generate a ZK proof is extremely high. In ZK-SNARK protocols, it is common for the Verifier to verify a proof in just one second, while generating the proof may take the Prover half a day or even a full day. To prove a computation efficiently, it must first be converted from a classical program into a ZK-friendly format.

Currently, there are two methods to achieve this: one is to use some proof system frameworks to write circuits, such as Halo2; the other is to use domain-specific languages (DSL), such as Cairo or Circom, to convert computations into intermediate representations for submission to the proof system. The proof system will then generate ZK proofs based on the circuits or DSL-compiled intermediate representations.

The more complex the program operations, the longer it takes to generate the proof. Additionally, certain operations are inherently unfriendly to ZK, requiring additional work to implement them. For example, the SHA or Keccak hash functions are unfriendly to ZKP, and using these functions will result in longer proof generation times. Even operations with low execution costs on classical computers may be unfriendly to ZKP.

Beyond ZK-unfriendly operations, the bottleneck of proof generation is essentially the same across proof systems, even though the generation process itself varies. Two computational tasks consume the most computing resources: Multi-Scalar Multiplication (MSM) and Number Theoretic Transform (NTT). Together they can account for 80-95% of proof generation time, depending on the commitment scheme of the ZKP and the specific implementation.

MSM computes the sum of many scalar multiplications of elliptic curve points, while NTT is a Fast Fourier Transform (FFT) over finite fields, used to accelerate polynomial multiplication. Different combinations of schemes result in different FFT/MSM load ratios.
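As an illustration of how an NTT speeds up polynomial multiplication, here is a small, unoptimized sketch. The prime 998244353 and its primitive root 3 are a common textbook choice and an assumption of this example; they are not the field of any particular proof system. The function names are likewise illustrative.

```python
MOD = 998244353  # NTT-friendly prime: MOD - 1 is divisible by 2**23 (assumed field)
G = 3            # a primitive root modulo MOD


def ntt(a, invert=False):
    """Recursive radix-2 NTT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    root = pow(G, (MOD - 1) // n, MOD)      # primitive n-th root of unity
    if invert:
        root = pow(root, MOD - 2, MOD)      # use the inverse root for the inverse transform
    even = ntt(a[0::2], invert)
    odd = ntt(a[1::2], invert)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % MOD
        out[k] = (even[k] + t) % MOD            # butterfly: combine the two halves
        out[k + n // 2] = (even[k] - t) % MOD
        w = w * root % MOD
    return out


def poly_mul(f, g):
    """Multiply coefficient lists f and g modulo MOD using the NTT."""
    result_len = len(f) + len(g) - 1
    size = 1
    while size < result_len:
        size *= 2
    fa = ntt(f + [0] * (size - len(f)))
    ga = ntt(g + [0] * (size - len(g)))
    pointwise = [x * y % MOD for x, y in zip(fa, ga)]
    res = ntt(pointwise, invert=True)
    inv_size = pow(size, MOD - 2, MOD)      # the inverse transform needs a 1/size scaling
    return [c * inv_size % MOD for c in res[:result_len]]


# (1 + 2x + 3x^2) * (4 + 5x) = 4 + 13x + 22x^2 + 15x^3
print(poly_mul([1, 2, 3], [4, 5]))
```

Schoolbook multiplication needs a number of coefficient products that grows quadratically with the degree, whereas the transform-based route needs on the order of n log n operations, which is why provers rely on it for large polynomials.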

Taking STARK as an example, its Polynomial Commitment Scheme (PCS) uses FRI, a hash-based commitment, instead of the elliptic-curve-based KZG or IPA, which completely eliminates MSM calculations. In the comparison table of proof systems, entries higher up require more FFT calculations, while entries lower down require more MSM calculations.
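To give a feel for what "hash-based commitment" means, the sketch below commits to a list of polynomial evaluations with a Merkle tree, which is the building block that FRI-style schemes rely on. This is not FRI itself, since the folding and low-degree testing are omitted. The choice of SHA-256, the 8-byte encoding, and all function names are assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def commit(evaluations):
    """Build a Merkle tree over the evaluations (length must be a power of two).

    Returns all layers; layers[-1][0] is the root, i.e. the commitment.
    """
    layer = [h(v.to_bytes(8, "big")) for v in evaluations]
    layers = [layer]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        layers.append(layer)
    return layers


def open_at(layers, index):
    """Collect the sibling hashes needed to prove the value at `index`."""
    path = []
    for layer in layers[:-1]:
        path.append(layer[index ^ 1])   # sibling of the current node
        index //= 2
    return path


def verify(root, value, index, path):
    """Recompute the root from one claimed value and its sibling path."""
    node = h(value.to_bytes(8, "big"))
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Hypothetical polynomial f(x) = x^2 + 1 over F_97, evaluated on x = 0..7.
evaluations = [(x * x + 1) % 97 for x in range(8)]
layers = commit(evaluations)
root = layers[-1][0]
print(verify(root, evaluations[5], 5, open_at(layers, 5)))
```

Only hashing is involved, with no elliptic curve arithmetic, which is why such schemes avoid MSM but spend their time on hash calculations instead.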

Optimization Solutions

MSM operations involve predictable memory access and can be highly parallelized, but they still require a significant amount of memory. MSM also faces scalability challenges: even with parallelization, it can remain slow. So while MSM can be accelerated in hardware, doing so requires substantial memory and parallel computing resources.
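The memory and parallelism trade-off can be seen in the bucket method (often called Pippenger's algorithm) commonly used for MSM. The sketch below is illustrative only. It runs on a tiny toy curve, y^2 = x^3 + 7 over F_17, picked so that it stays short, whereas real provers work over curves with roughly 256-bit or larger fields. The window size, bit length, and function names are assumptions of this example.

```python
P = 17       # toy field; curve y^2 = x^3 + 7 over F_17 (illustration only)
INF = None   # point at infinity


def add(p1, p2):
    """Affine point addition on the toy curve."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF                                   # P + (-P)
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = INF
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def msm_naive(scalars, points):
    """Reference MSM: one full scalar multiplication per point."""
    acc = INF
    for k, pt in zip(scalars, points):
        acc = add(acc, scalar_mul(k, pt))
    return acc


def msm_buckets(scalars, points, window=4, bits=16):
    """Bucket-method MSM for scalars below 2**bits."""
    total = INF
    for shift in reversed(range(0, bits, window)):
        for _ in range(window):
            total = add(total, total)                # shift the accumulated result
        buckets = [INF] * (1 << window)              # memory cost: 2**window buckets
        for k, pt in zip(scalars, points):           # a sequential, predictable pass
            digit = (k >> shift) & ((1 << window) - 1)
            if digit:
                buckets[digit] = add(buckets[digit], pt)
        running, window_sum = INF, INF
        for b in reversed(buckets[1:]):              # computes sum(digit * bucket[digit])
            running = add(running, b)
            window_sum = add(window_sum, running)
        total = add(total, window_sum)
    return total


G = (1, 5)   # on the curve: 5*5 = 25 = 8 = 1 + 7 (mod 17)
points = [scalar_mul(i, G) for i in range(1, 9)]
scalars = [3, 500, 65535, 12, 7, 0, 1024, 999]
print(msm_buckets(scalars, points) == msm_naive(scalars, points))
```

In this sketch each window's bucket pass is independent of the others, so the windows could be processed in parallel, but every parallel unit would then need its own array of buckets. That is one way to see why hardware MSM designs are hungry for both memory and compute.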

NTT often involves random memory access, which makes it unfriendly to hardware and difficult to handle in distributed infrastructure: when run in a distributed environment, it inevitably needs data held by other nodes, and once network interaction is involved, performance drops significantly.

Therefore, accessing stored data and data movement become a major bottleneck, limiting the parallelization capabilities of NTT calculations. Most of the work to accelerate NTT is focused on managing how the computation interacts with memory.
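The data-movement problem can be illustrated by listing which array positions each butterfly stage of an iterative radix-2 NTT touches. The sketch below assumes a hypothetical setup in which the array is split into equal contiguous chunks across nodes, and counts how many butterflies in each stage need an element held by another node. The function names and the partitioning scheme are assumptions for illustration.

```python
def butterfly_pairs(n):
    """Yield (stage, i, j): the index pairs combined at each stage of a size-n NTT."""
    stage, half = 0, 1
    while half < n:
        for start in range(0, n, 2 * half):
            for k in range(half):
                yield stage, start + k, start + k + half
        stage += 1
        half *= 2        # the distance between paired elements doubles every stage


def cross_node_pairs(n, nodes):
    """Count, per stage, the butterflies whose two inputs live on different nodes."""
    chunk = n // nodes   # each node holds one contiguous chunk of the array
    counts = {}
    for stage, i, j in butterfly_pairs(n):
        counts.setdefault(stage, 0)
        if i // chunk != j // chunk:
            counts[stage] += 1
    return counts


# 16 elements spread over 4 nodes: early stages stay local, later ones do not.
print(cross_node_pairs(16, 4))
```

In this toy layout the early stages are purely local, but once the stride reaches the chunk size every butterfly needs remote data, which matches the observation that managing memory interaction is where most NTT acceleration work goes.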

In fact, the simplest way to solve the efficiency bottleneck of MSM and NTT is to eliminate these operations entirely. Some newly proposed algorithms, such as Hyperplonk, modify Plonk to eliminate NTT operations. This makes Hyperplonk easier to accelerate but introduces new bottlenecks, such as the higher computational cost of the sumcheck protocol. Similarly, the STARK algorithm does not require MSM, but its FRI protocol introduces a large number of hash calculations.

ZK Hardware Acceleration and Cysic's Ultimate Goal

While optimization at the software and algorithm level is very important and valuable, it has clear limitations. In order to fully optimize the efficiency of ZKP generation, hardware acceleration must be used, just like how ASICs and GPUs eventually dominated the BTC and ETH mining markets.

So the question is: what is the best hardware for accelerating ZKP generation? There are currently several types of hardware that can achieve ZK acceleration, such as GPU, FPGA, or ASIC, each with its own advantages and disadvantages.

We can compare these hardware options:

First, let's illustrate their differences in development with a simple example. For instance, if we want to implement a simple parallel multiplication:

  • On a GPU, using the APIs provided by the CUDA SDK, we can develop as if writing native code, thus gaining parallel computing capabilities;

  • On an FPGA, we need to learn hardware description languages to control hardware-level connections in order to implement parallel algorithms;

  • On an ASIC, the hardware-level transistor connections are fixed during the chip design phase and cannot be modified afterwards.
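For readers who have not seen data-parallel code, the sketch below mimics the structure of the GPU case in plain Python: the input is cut into chunks and each worker multiplies its own chunk independently. It is only an analogy. The modulus and function names are hypothetical, and Python threads will not deliver the speed-up of a real CUDA kernel, where each GPU thread would typically handle one element.

```python
from concurrent.futures import ThreadPoolExecutor

MOD = 2**61 - 1  # illustrative modulus; real ZK fields are far larger


def mul_chunk(args):
    """Work done by one worker: element-wise modular multiplication of its chunk."""
    xs, ys = args
    return [x * y % MOD for x, y in zip(xs, ys)]


def parallel_mul(a, b, workers=4):
    """Split the element-wise product a[i] * b[i] across `workers` workers."""
    chunk = max(1, (len(a) + workers - 1) // workers)
    tasks = [(a[i:i + chunk], b[i:i + chunk]) for i in range(0, len(a), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(mul_chunk, tasks))   # map preserves chunk order
    return [value for part in parts for value in part]


print(parallel_mul([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], workers=2))
```

On an FPGA or ASIC the same independence between elements would be exploited by instantiating many multiplier circuits side by side rather than by scheduling software threads.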

Each of these solutions has its own advantages and disadvantages, suitable for different stages of development in the ZK race. Cysic is committed to becoming the ultimate solution for ZK hardware acceleration, with a step-by-step strategy:

  1. Develop a GPU-based SDK to provide solutions for ZK applications and integrate GPU resources across the network;

  2. Use the flexibility and balanced characteristics of FPGA to quickly implement customized ZK hardware acceleration;

  3. Independently develop ASIC-based ZK DePIN hardware;

  4. Have Cysic Network integrate all the computing power of ZK DePIN devices and GPUs as a SaaS platform/mining pool, providing computing power and verification solutions for the entire ZK industry.

Next, let's look at the different segments of ZK acceleration one by one to better understand how the solutions differ and what Cysic's development strategy is.

ZK Mining Pool and SaaS Platform: Cysic Network

In fact, well-known ZK Rollups such as Scroll and Polygon zkEVM have explicitly proposed the concept of a "decentralized Prover" in their roadmaps, which essentially means building ZK mining pools. This market-oriented approach reduces the burden on ZK Rollup projects and incentivizes miners and mining pool operators to keep optimizing ZK acceleration solutions.

In Cysic's roadmap, there is a clear plan for a ZK mining pool and SaaS platform called Cysic Network. It will not only integrate Cysic's own computing power but also absorb third-party computing resources, including idle GPUs and zk DePIN devices, through mining incentives.

The entire verification workflow is illustrated as follows:

  1. The ZK project team submits the proof generation task to an agent, whose job is to forward the task to the verification network. These agents will initially be operated by Cysic itself; later, staking will be introduced so that anyone can become an agent;

  2. The Prover accepts the proof task and uses hardware to generate ZK proofs. The Prover must stake tokens to take on proof tasks and receives rewards upon completing them;

  3. The Verification Committee is responsible for checking the validity of the proofs generated by the Prover and voting on them. Once a certain number of votes is reached, the proof is considered valid. Validators join the committee by staking tokens, participate in voting, and receive rewards. This process can be combined with EigenLayer's AVS concept and reuse existing restaking facilities.
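The three steps above can be sketched as a small simulation. Everything in it is hypothetical: the `Participant` class, the stake threshold, the quorum, the reward amounts, and the placeholder `prove`/`check` functions are assumptions made to illustrate the flow, not Cysic's actual protocol or parameters.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """A prover or validator with staked tokens (hypothetical model)."""
    name: str
    stake: int
    rewards: int = 0


def dispatch(provers, min_stake):
    """Agent step: forward the task to a prover that has staked enough."""
    eligible = [p for p in provers if p.stake >= min_stake]
    if not eligible:
        raise ValueError("no prover has staked enough to take the task")
    return max(eligible, key=lambda p: p.stake)


def committee_accepts(statement, proof, validators, check, quorum):
    """Committee step: each validator checks the proof and casts a vote."""
    votes = sum(1 for _ in validators if check(statement, proof))
    return votes >= quorum


def run_task(statement, provers, validators, prove, check,
             min_stake=100, quorum=2, prover_reward=10, validator_reward=1):
    """Run one proof task end to end and pay rewards if the proof is accepted."""
    prover = dispatch(provers, min_stake)
    proof = prove(statement)                         # stands in for hardware proving
    accepted = committee_accepts(statement, proof, validators, check, quorum)
    if accepted:
        prover.rewards += prover_reward
        for v in validators:
            v.rewards += validator_reward
    return accepted


provers = [Participant("prover-a", 50), Participant("prover-b", 200)]
validators = [Participant(f"validator-{i}", 10) for i in range(3)]
ok = run_task(
    "batch-1", provers, validators,
    prove=lambda s: "proof:" + s,                    # placeholder, not a real ZK proof
    check=lambda s, pr: pr == "proof:" + s,          # placeholder verification
)
print(ok, provers[1].rewards)
```

A fuller model would also cover what happens to the stake when a proof is rejected, which the description above does not specify.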

The detailed interaction process is as follows:

One point in the above process deserves attention: staking, incentive distribution, and the submission of computing tasks all depend on a dedicated platform, which requires a blockchain as its underlying infrastructure.

For this reason, Cysic Network has built a dedicated public chain that uses a unique consensus algorithm called Proof of Compute (PoC). It combines a VRF (verifiable random function) with each Prover's historical performance, such as device availability, number of proofs submitted, and proof accuracy, to select the block producer, who is responsible for recording information about the devices and distributing token incentives.
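The article does not give the details of PoC, so the following is only a guess at the general shape of such a selection: a performance-weighted lottery driven by a random seed. A SHA-256 hash stands in for the VRF output (a real VRF also produces a proof that the output was derived correctly), and the scoring formula and all names are invented for illustration.

```python
import hashlib


def performance_score(availability, submissions, accuracy):
    """Hypothetical score built from the three factors named in the text."""
    return availability * accuracy * (1 + submissions)


def select_producer(seed: bytes, provers: dict) -> str:
    """Pick a block producer with probability proportional to its score.

    `provers` maps a name to its availability, submissions and accuracy.
    The hash of `seed` is a stand-in for a VRF output, not a real VRF.
    """
    names = sorted(provers)                          # fixed order keeps it deterministic
    weights = [max(1, int(1000 * performance_score(**provers[n]))) for n in names]
    r = int.from_bytes(hashlib.sha256(seed).digest(), "big") % sum(weights)
    for name, weight in zip(names, weights):
        if r < weight:
            return name
        r -= weight
    raise AssertionError("unreachable: r is always below the total weight")


provers = {
    "reliable": {"availability": 0.99, "submissions": 100, "accuracy": 1.0},
    "flaky": {"availability": 0.50, "submissions": 1, "accuracy": 0.5},
}
print(select_producer(b"block-42", provers))
```

Because the draw is derived from the seed, anyone holding the same seed and performance records can recompute who should have been selected, which is the property a VRF-based scheme is meant to provide.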

Of course, in addition to its ZK mining pool and SaaS platform, Cysic has invested heavily in ZK acceleration solutions based on different hardware. Next, let's look at its achievements along the three paths of GPU, FPGA, and ASIC.

GPU, FPGA, and ASIC

The core of ZK hardware acceleration is to parallelize key operations as much as possible. From the perspective of hardware functionality, in order to achieve maximum flexibility and generality, a large part of the area in the CPU is used to provide control functions and various levels of cache, which results in weak parallel computing capabilities.

In the GPU, the proportion of chip area used for computation is significantly increased, allowing it to support large-scale parallel processing. GPUs are now very popular, and libraries such as Nvidia CUDA can help developers utilize the parallelism of GPUs without needing to understand the underlying hardware. Through the CUDA SDK, the CUDA ZK library can encapsulate and accelerate MSM and NTT operations.

FPGAs consist of arrays of a large number of small processing units. Programming an FPGA requires a specialized hardware description language, which is then compiled into a circuit configuration. FPGAs therefore implement specific algorithms directly in circuits, without going through an instruction set. This degree of customization and flexibility far surpasses that of GPUs.

Currently, FPGAs cost only about one-third of GPUs and can be more than ten times more energy-efficient than GPUs. This significant energy efficiency advantage is partly due to the fact that GPUs need to be connected to host devices, which typically consume a lot of power. It can be said that FPGAs can add more computing modules to meet the demands of MSM and NTT without increasing energy consumption. This makes FPGAs particularly suitable for ZK proof scenarios that are computationally intensive, require high data throughput, and low response times.

However, the biggest problem with FPGAs is that very few developers have experience programming them. For a ZK project team, assembling people with expertise in both cryptography and FPGA engineering is extremely difficult.

ASIC is equivalent to implementing a program entirely in hardware. Once designed, the hardware cannot be changed, and correspondingly, the programs that ASIC can execute cannot be changed and can only be used for specific tasks. The hardware acceleration advantages of FPGA in MSM and NTT also apply to ASIC. Due to its dedicated circuit design, ASIC is the most efficient and least energy-consuming among all solutions.

For the current mainstream ZK circuits, Cysic hopes to achieve proof generation times of 1-5 seconds, and only ASIC can achieve this goal.

While these advantages sound very attractive, ZK technology is rapidly evolving, and the design and production cycle of ASICs typically takes 1-2 years, with costs reaching 10-20 million USD. Therefore, it is necessary to wait until ZK technology is stable enough before investing in large-scale production to avoid producing chips that quickly become obsolete.

Cysic is active in all three fields: GPU, FPGA, and ASIC.

For GPU acceleration, as new ZK proof systems have emerged, Cysic has adapted to them with its self-developed CUDA acceleration SDK, and has linked tens of thousands of top-tier GPUs into its GPU computing power network through community resource aggregation. In addition, the Cysic CUDA SDK is 50%-80% or more faster than the latest open-source frameworks.

In the FPGA domain, Cysic has completed the fastest implementation of modules such as MSM, NTT, and Poseidon Merkle tree, covering the most important parts of ZK computation. This solution has been validated through prototypes by several top ZK projects.

Cysic's self-developed SolarMSM can complete an MSM of size 2^30 in 0.195 seconds, while SolarNTT can complete an NTT of size 2^30 in 0.218 seconds, the best FPGA hardware acceleration results among all those publicly available.

In the ASIC domain, although large-scale application of ZK ASICs is still some way off, Cysic has already positioned itself early and has unveiled its self-developed ZK DePIN chip and devices.

To attract consumer users and meet the performance and cost requirements of different ZK project teams, Cysic will launch two ZK hardware products: ZK Air and ZK Pro.

ZK Air is similar in size to a power bank or a laptop power supply, and ordinary users can connect it directly to a laptop, iPad, or even a mobile phone via a Type-C interface to provide computing power for specific ZK projects and receive rewards. At present, ZK Air's computing power already surpasses that of consumer-grade graphics cards, and it can accelerate small-scale ZK proof generation tasks.

ZK Pro is similar to a traditional mining machine, achieving the effect of a GPU server interconnected with multiple top consumer-grade graphics cards, significantly accelerating ZK proof generation, suitable for large-scale ZK projects such as ZK-Rollup and ZKML (Zero knowledge machine learning).

Through these two devices, Cysic will ultimately build a stable and reliable ZK-DePIN network. These two devices are still under development and are expected to be launched in 2025.

In addition, through Cysic Network, consumer users can join the ZK hardware acceleration market with very low barriers to entry. Coupled with the high demand for computing power from ZK project teams, this may lead to a new wave of enthusiasm in the market, similar to the Bitcoin mining frenzy, and the market size of the ZK computing field may experience explosive growth once again.

