From "Address Clustering" to "Standard of Evidence": Why Is Chainalysis Redefining Blockchain Tracing?

Author: 137Labs

At the end of June 2026, Chainalysis publicly announced a data framework called the "Blockchain Tracing Ontology," intended to establish a more unified data description system for blockchain analysis. Unlike the company's earlier product and feature releases, this document reads more like an industry standards initiative: it attempts to redefine the fundamental concepts of on-chain data analysis and to establish an interpretable, verifiable, and replicable data model for blockchain tracing.

Since its release, the proposal has quickly become a topic of interest in blockchain analysis and digital asset compliance. Although it is still at the stage of public discussion and industry initiative, it has prompted a reconsideration of whether on-chain analysis needs a more unified and transparent data standard.

A Long-Standing Question: Why Do Different Companies Arrive at Different Analysis Results?

Blockchain data is inherently public and transparent, but there has never been a unified standard for interpreting it.

Currently, most on-chain analysis platforms use "Address Clustering" technology to infer which addresses may be controlled by the same entity through transaction behavior. However, the algorithms, rules, and sources of evidence used by different organizations are not consistent, meaning the same address may correspond to completely different ownership results on different platforms.

For example, one analysis firm may believe that a certain address belongs to a large exchange, while another firm marks it as an unknown wallet; the same group of addresses may also be classified into different Clusters on different platforms. This discrepancy has limited impact on market analysis, but can lead to significant controversy when it involves judicial investigations, asset freezes, anti-money laundering efforts, or law enforcement documentation.
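The divergence described above can be illustrated with a minimal sketch. It assumes two hypothetical firms that both apply the widely known common-input heuristic (addresses spent together in one transaction are treated as co-controlled), while only one of them also applies a change-address rule. The transaction format, the `change` field, and the function name are illustrative assumptions, not any vendor's actual method.

```python
from collections import defaultdict


def cluster_addresses(transactions, use_change_heuristic=False):
    """Group addresses into clusters with a union-find structure.

    transactions: list of dicts with "inputs" and "outputs" (lists of
    address strings) and an optional "change" key naming the output that
    is assumed to return funds to the sender (hypothetical field).
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for tx in transactions:
        inputs = tx["inputs"]
        # Register every address so that singletons also form a cluster.
        for addr in inputs + tx["outputs"]:
            find(addr)
        # Common-input heuristic: co-spent addresses share a controller.
        for other in inputs[1:]:
            union(inputs[0], other)
        # Optional rule: the change output belongs to the sender.
        if use_change_heuristic and inputs and tx.get("change"):
            union(inputs[0], tx["change"])

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    return sorted(sorted(members) for members in clusters.values())


txs = [
    {"inputs": ["A", "B"], "outputs": ["C", "D"], "change": "D"},
    {"inputs": ["D", "E"], "outputs": ["F"]},
]

firm_one = cluster_addresses(txs)
firm_two = cluster_addresses(txs, use_change_heuristic=True)
print(firm_one)  # [['A', 'B'], ['C'], ['D', 'E'], ['F']]
print(firm_two)  # [['A', 'B', 'D', 'E'], ['C'], ['F']]
```

The same two transactions produce four clusters under one rule set and three under the other, so address `D` ends up attributed differently depending on which firm's rules are applied.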

For the courts, merely stating "this is a wallet of an exchange" is far from sufficient; it is even more important to answer another question: Why can such a judgment be made?

Chainalysis Proposes Not a New Algorithm, but a "Language"

Many people see the term "Ontology" and assume Chainalysis has proposed a new clustering algorithm. It has not.

An ontology is a concept from knowledge engineering: a unified system of concepts and relational models used to standardize the definitions of objects and the associations between them. Ontologies are widely used in internet search, medical knowledge bases, and artificial intelligence knowledge graphs to ensure that data is understood uniformly.

Chainalysis aims to establish a similar "common language" for blockchain analysis.

In other words, the goal is not to require all companies to use the same clustering algorithm, but to have everyone express analysis results in a unified data structure, making the analysis process more transparent and easier for third parties to understand, verify, and replicate.

"Cluster" Is No Longer Enough

In the past, the industry generally used the "Cluster" as the basic unit of analysis, treating a group of addresses as collectively belonging to one wallet or entity.

While this approach is simple and intuitive, its limitations have become increasingly apparent with the development of blockchain infrastructure.

Nowadays, a large exchange's wallet system may contain millions of addresses, with different addresses serving entirely different functions such as deposits, withdrawals, hot and cold wallet management, aggregation, and change. If they are all simply classified as one Cluster, it becomes difficult to describe this complex wallet structure accurately.

Therefore, Chainalysis introduced the new concept of "Wallet Segment" in its proposal.

In the new model, one entity can have multiple wallets, each wallet can be divided into multiple Wallet Segments, and each Segment contains specific addresses. This hierarchical structure reflects the wallet management models of large institutions more accurately than traditional Clusters and allows for a more detailed description of the control relationships between different addresses.
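The hierarchy described above (entity, wallet, Wallet Segment, address) can be sketched as nested data classes. This is an illustration only: the class names, field names, and segment roles are assumptions made for the example and are not taken from the published Ontology.

```python
from dataclasses import dataclass, field


@dataclass
class WalletSegment:
    role: str  # illustrative roles: "deposit", "hot", "cold", ...
    addresses: set[str] = field(default_factory=set)


@dataclass
class Wallet:
    name: str
    segments: list[WalletSegment] = field(default_factory=list)


@dataclass
class Entity:
    name: str
    wallets: list[Wallet] = field(default_factory=list)

    def locate(self, address):
        """Return (wallet name, segment role) for an address, or None."""
        for wallet in self.wallets:
            for segment in wallet.segments:
                if address in segment.addresses:
                    return (wallet.name, segment.role)
        return None

    def all_addresses(self):
        """Flatten the hierarchy into what a single Cluster would hold."""
        return {
            addr
            for wallet in self.wallets
            for segment in wallet.segments
            for addr in segment.addresses
        }


exchange = Entity(
    "ExampleExchange",  # hypothetical entity
    [
        Wallet(
            "btc-main",
            [
                WalletSegment("deposit", {"addr1", "addr2"}),
                WalletSegment("cold", {"addr9"}),
            ],
        )
    ],
)

print(exchange.locate("addr9"))         # ('btc-main', 'cold')
print(sorted(exchange.all_addresses()))  # ['addr1', 'addr2', 'addr9']
```

A flat Cluster corresponds to `all_addresses()` alone; the nested form additionally records which wallet and which functional segment each address sits in.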

From "Result Credibility" to "Process Credibility"

Beyond the data model itself, the more significant change lies in the second layer of the design.

Traditional on-chain analysis focuses on the final results: who owns the address, where the funds flow, and whether illegal activity is involved.

The new Ontology emphasizes the inference process itself.

For each analysis result, several questions should be clearly answered:

What on-chain evidence supports this conclusion?
What analytical rules were used?
Was any off-chain information referenced?
What is the credibility of this inference?
Can third parties independently verify this process?

In other words, it is crucial not only to tell others "what it is," but also to explain "why it is so."

Chainalysis refers to this part as the Evidence and Confidence layer.

In the future, when an address is marked as an exchange wallet, it will no longer be just a simple label, but will come with a complete basis for inference, including transaction patterns, address relationships, public information, investigation records, etc., and will provide a corresponding credibility level. Such a design aligns better with the interpretability requirements of judicial evidence and facilitates cross-validation among different institutions.
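As a rough sketch of what a label "with a complete basis for inference" might look like as data, the record below carries the rule applied, the supporting evidence, and a confidence level alongside the label. All class names, field names, evidence kinds, and the three-level confidence scale are assumptions made for illustration; the article does not specify the actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Confidence(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Evidence:
    kind: str        # e.g. "transaction_pattern", "public_information"
    on_chain: bool   # False for KYC records, investigation files, etc.
    reference: str   # pointer to the underlying item (placeholder values below)


@dataclass
class Attribution:
    address: str
    label: str
    rule: str                # which analytical rule produced the label
    confidence: Confidence
    evidence: list[Evidence] = field(default_factory=list)

    def uses_off_chain(self):
        """Was any off-chain information referenced?"""
        return any(not item.on_chain for item in self.evidence)

    def is_explainable(self):
        """A bare label with no rule or no evidence is not explainable."""
        return bool(self.rule) and bool(self.evidence)


attribution = Attribution(
    address="addr1",
    label="exchange deposit address",
    rule="common-input heuristic",
    confidence=Confidence.MEDIUM,
    evidence=[
        Evidence("transaction_pattern", True, "tx:placeholder-1"),
        Evidence("public_information", False, "doc:placeholder-2"),
    ],
)
bare_label = Attribution("addr2", "exchange wallet", "", Confidence.LOW)

print(attribution.is_explainable(), attribution.uses_off_chain())  # True True
print(bare_label.is_explainable())                                  # False
```

Each field maps to one of the questions listed earlier: `evidence` to the supporting facts, `rule` to the analytical method, `uses_off_chain()` to off-chain references, and `confidence` to credibility. Independent verification would then amount to a third party re-deriving the label from the same `rule` and `evidence`.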

Insights from the Bitcoin Fog Case

This proposal did not come out of nowhere; it is closely related to the well-known Bitcoin Fog money laundering case in the United States.

Bitcoin Fog was one of the longest-running mixing services in Bitcoin's history. During the investigation, the U.S. Department of Justice relied heavily on analysis results from Chainalysis Reactor as key evidence.

During the proceedings, the court held a Daubert hearing that rigorously reviewed Chainalysis's analytical methods, including:

Whether address clustering has scientific validity;
Whether the analytical method can be independently verified;
Whether it is an unexplainable "black box algorithm";
Whether other experts can independently replicate the analytical process.

Ultimately, the court recognized that Chainalysis's analytical methods possess sufficient scientific reliability to be used as judicial evidence.

However, the case also exposed a problem for the entire industry: if different analysis institutions adopt different standards, similar cases in the future may face even more challenges. The need for a unified framework for expressing data and evidence is therefore an important part of the background to Chainalysis's promotion of the Ontology.

Blockchain Analysis Cannot Directly Identify Real Identities

It is important to note that Chainalysis particularly emphasized one point in this proposal: on-chain analysis itself cannot directly identify individuals' identities in the real world.

On-chain data can only reveal relationships between addresses and the paths along which funds flow. Identifying the true controllers behind the addresses typically still depends on off-chain evidence, such as exchange KYC information, records obtained by courts, and server logs obtained by law enforcement agencies.

This means that blockchain analysis provides high-quality data inferences rather than serving as the final proof of identity. A truly complete chain of judicial evidence requires the combination of on-chain data and off-chain investigations.

From Data Quality to Industry Standards

In addition to the Ontology itself, the overall framework also systematically addresses data quality, analysis transparency, and judicial admissibility. Chainalysis evidently hopes to push the industry to focus not only on analysis results but also on whether the analysis process can be explained, verified, and replicated.

This also suggests that future competition in the industry may no longer be just about "who covers more addresses" or "who identifies more labels," but rather "who has higher data quality," "whose analysis is more transparent," and "whose evidence is more readily accepted by courts."

For regulatory agencies, law enforcement departments, and large financial institutions, a system that can explain its analytical logic, support independent audits, and be repeatably verified is clearly more trustworthy than a "black box model" that can only output results.

What Does This Proposal Mean?

From a longer-term perspective, what Chainalysis has released is not an ordinary software upgrade, but a step toward moving the blockchain analysis industry from "experience-driven" to "standards-driven."

If this Ontology is ultimately widely accepted, analysis institutions, exchanges, regulators, and even judicial bodies may be able to share analysis results under a unified data model. That would reduce communication costs, improve the consistency of evidence, and provide a more reliable foundation for cross-border law enforcement, anti-money laundering investigations, and digital asset regulation.

Of course, standards are not established overnight. How to balance trade secrets with transparency, how to encourage different institutions to adopt unified norms, and how to keep refining the evidence model are questions the industry still has to work out collectively.

What is certain is that, as digital assets gradually integrate into the global financial system, the competitive focus of blockchain analysis is shifting: in the future, industry value will be determined not only by the accuracy of algorithms but also by the interpretability of the analysis process, data quality, and the credibility of evidence. This is the new direction Chainalysis hopes to open with the Blockchain Tracing Ontology.
