From blindly saying "Yes" to seeing clearly before signing: How does Sigil add a safety barrier for AI Agents?

Imagine that in the future you only need to tell the AI Agent: "Help me allocate half of the available funds in my wallet to ETH."

The Agent immediately starts reading the balance, searching for liquidity pools, comparing quotes, and constructing a trading path. Seconds later, it sends you a message: "Found a suitable buying plan, do you confirm?"

You reply with a "Yes."

But at this moment, what exactly did you approve? Which pool did it choose? What are the expected execution price and slippage? Which protocol is being called? Which wallet is used, and how much of which asset is involved? Does the operation include a token approval or other additional steps? You never actually saw this information; you simply chose to trust the Agent's summary of the operation.

This is precisely the new type of risk that emerges as AI Agents move from "answering questions" to "acting on behalf of people": the Agent can browse the web, log into accounts, and even execute payments and on-chain signatures, yet the authorization interface the user ultimately faces is often just a vague chat message and a confirmation option that carries almost no useful information.

A single "Yes" now decides what happens to your funds, data, and devices.

Thus, in imToken's latest brand upgrade, a fourth S has joined Store, Send, and Stake: Sign. If the first three S's correspond to asset custody, value flow, and network participation, then Sign addresses how users can keep the final right to be informed, to authorize, and to stay in control as more and more software begins to act on their behalf.

Sigil is the first early, exploratory proof-of-concept (POC) product under the Sign proposition, and its core principle is simple: what you see is what you sign, and you sign what you see.

I. When the Agent begins to act, why does the wallet need to reinterpret Sign?

In the past, most signing risk in cryptocurrency wallets came from users not understanding the content of a transaction.

At the underlying level, an on-chain transaction may appear only as contract addresses, function parameters, and hexadecimal data; ordinary users can hardly tell whether it is a transfer, a swap, or some more dangerous asset operation.

Therefore, wallets need to parse the raw data into information that people can understand, so that users see the details before signing (see extended reading "Ethereum strongly promotes 'What you see is what you sign': Why is Clear Signing a necessary capability patch in the AI era?"). Clear Signing, in essence "what you see is what you sign," aims precisely at bridging the gap between machine data and user understanding.

However, the issues brought by AI Agents are even more complex.

What users cannot see is no longer just a single on-chain transaction, but potentially an entire chain of operations automatically planned and executed by the Agent.

As in the opening example, to achieve a goal like "Help me allocate half of the available funds in my wallet to ETH," an Agent may need to read the wallet balance, search on-chain pools, call third-party tools, execute scripts, and complete transactions. Users cannot possibly inspect every underlying request one by one, yet they still need to make a final decision before any assets are actually exchanged.

Currently, many Agents handle authorization by sending a brief description in the chat window and then waiting for the user to reply "Yes" or click an ordinary button.

This appears to complete user authorization, but it has several obvious problems.

First, it is a black box. The user knows they approved something, but they may not know exactly how much value is involved, who the payee is, or what the Agent ultimately signed on their behalf. The real operational parameters are hidden behind a heavily condensed natural-language summary, so the user merely confirms a vague intention rather than the actual action about to take place.

Second, a chat reply is not a digital signature. Anyone who can access a logged-in device, whether by holding the phone, controlling the chat account, or operating right next to the user, could type "Yes." The system can at most confirm that the message came from a particular account; it cannot verify that the account owner personally authorized it.

More troubling is that the confirmation interface itself could also be forged. If the Agent can generate approval messages on its own, the party initiating the operation simultaneously controls the interface that displays the operation content to the user. It could completely omit critical parameters, use ambiguous language, or even display what appears to be a harmless operation while submitting another request in the background.

This creates a clear trust paradox: We want to restrict the Agent through the confirmation interface, yet we let the Agent decide what the user can see upon confirmation.

When the Agent is only responsible for summarizing articles or organizing information, this lack of transparency may only lead to incorrect answers. But when it begins to interact with accounts, funds, file systems, and terminal environments, the consequences of a vague approval could escalate from "inaccurate answers" to actual asset losses, data breaches, or device risks (see extended reading "Signing is not just signing: When AI Agent signs on your behalf, who retains the control?").

Therefore, the AI Agent era needs not more "Yes" buttons, but a signing mechanism that can demonstrate "what the user saw, what the user approved, and what the system ultimately executed."

II. Sigil: The signature shield situated between AI Agents and wallets

This is the problem that imToken's latest offering, Sigil, sets out to address: it positions itself as a safety barrier between the AI Agent and the wallet.

It does not attempt to stop the Agent from automating all tasks; on the contrary, users can clearly authorize the Agent during the initial setup, specifying which low-risk operations can be completed autonomously and which sensitive operations must pause, waiting for users to provide an independent, clear, and verifiable approval.

Within the established boundaries, the Agent can still act quickly.

But whenever the Agent reaches an operation that the user has marked as sensitive, especially one that spends funds or signs a transaction, Sigil pauses the process, parses the real request into a clear confirmation card, and sends it to the user's Telegram. The operation continues only after the user signs with a Passkey and biometric recognition.

The entire process can be summarized in four steps:

1. The Agent initiates the operation. It can keep browsing the web, booking services, sending requests, or preparing a transaction, exactly as an Agent normally works.
2. Sigil checks the request against the preset security policy. If it is a low-risk operation that the Agent is allowed to complete autonomously, the process continues; if it involves sending messages, deleting files, running code, spending funds, on-chain signing, and the like, Sigil pauses execution and parses the request.
3. The user explicitly approves with a Passkey. A clear confirmation card is sent to Telegram, directly displaying the merchant, amount, payee, and other key parameters. What the user sees is not a description written by the Agent but structured content parsed from the actual operation.
4. The Agent continues only after Sigil's gateway verifies the user's signature. Without user approval, no funds move and nothing is signed.
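The four steps above can be sketched as a small gate function. This is only an illustrative sketch under stated assumptions, not Sigil's actual implementation: the action labels, the `Request` type, and the `approve` callback (a stand-in for a Passkey approval that the gateway has verified) are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"                    # low-risk: the Agent continues on its own
    NEEDS_APPROVAL = "needs_approval"  # sensitive: pause and ask the user


# Hypothetical labels for the sensitive operations the article lists as examples.
SENSITIVE_ACTIONS = {
    "send_message", "delete_file", "run_code", "spend_funds", "sign_transaction",
}


@dataclass(frozen=True)
class Request:
    action: str
    params: dict = field(default_factory=dict)


def evaluate_policy(request: Request) -> Decision:
    """Step 2: decide whether the request triggers the preset policy."""
    if request.action in SENSITIVE_ACTIONS:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


def build_card(request: Request) -> dict:
    """Step 3: card fields come from the request itself, not from an Agent-written summary."""
    return {"action": request.action, **request.params}


def handle(request: Request, approve: Callable[[dict], bool]) -> str:
    """Steps 1-4: run the request through the gate and report the outcome."""
    if evaluate_policy(request) is Decision.ALLOW:
        return "executed"
    card = build_card(request)
    # `approve` stands in for a verified Passkey signature over the card (assumption).
    if approve(card):
        return "executed"
    return "blocked"
```

In this sketch a low-risk request such as `Request("browse_web")` passes straight through, while `Request("spend_funds", {...})` is blocked unless the approval callback returns `True`.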

The key to this mechanism is not just the addition of another biometric recognition step, but rather the re-establishment of the relationship between display, signing, and execution: What is displayed is the actual request, the user signs what is presented, and what the system ultimately executes must correspond to the signed request.

If any one of these three is inconsistent with the others, Sigil blocks the operation.

Sigil does not require users to approve every action of the Agent item by item. Instead, policy settings let users decide in advance which behaviors can be completed automatically and which must be approved by them personally. Users can pick a preset security level such as Relaxed, Balanced, or Strict, or enter Custom mode to set rules for each type of operation.

For example, in Balanced mode, certain low-risk behaviors can proceed without additional approval, while higher-risk operations such as code execution or terminal commands must be confirmed through Sigil.

As for spending funds and signing transactions, regardless of what security policy the user chooses, personal approval is always needed.

This is a boundary that Sigil will not compromise on.
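A minimal sketch of how such a policy table might look, assuming hypothetical action labels. Only two things below come from the text: Balanced mode requires confirmation for code execution and terminal commands, and spending funds or signing transactions always requires approval. The contents of the Relaxed and Strict presets are placeholders invented for illustration.

```python
from typing import Mapping, Optional

# Hard floor from the article: these always need personal approval.
ALWAYS_APPROVE = frozenset({"spend_funds", "sign_transaction"})

# Preset contents are illustrative assumptions, except the Balanced entries.
PRESETS = {
    "relaxed": frozenset(),
    "balanced": frozenset({"run_code", "terminal_command"}),
    "strict": frozenset({"run_code", "terminal_command", "send_message", "delete_file"}),
}


def requires_approval(
    action: str,
    level: str = "balanced",
    custom: Optional[Mapping[str, bool]] = None,
) -> bool:
    """Return True if the action must pause for user approval."""
    # The floor is checked first, so no preset or custom rule can lower it.
    if action in ALWAYS_APPROVE:
        return True
    # Custom mode: a per-action rule overrides the preset for that action.
    if custom is not None and action in custom:
        return custom[action]
    return action in PRESETS[level]
```

Checking the floor before any preset or custom rule is the design choice that matters here: a custom rule such as `{"sign_transaction": False}` is simply ignored.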

III. From Crypto to AI Agents, what does Sigil aim to protect?

Surrounding the principle of "What you see is what you sign," Sigil further provides three layers of protection.

First, users can accurately see what they are signing. For instance, in the confirmation card from Sigil, parameters such as protocol, amount, and payee are parsed into clear fields. Users do not need to trust the Agent's summary, nor do they need to face incomprehensible raw data.

The card itself is the content the user authorizes. In the ETH purchase from the opening example, what the user ultimately sees should not be just "Buy ETH," but the actual assets and amounts used, the transaction recipient, the key transaction parameters, and any other operational information the user needs to understand.

In real payment scenarios, it similarly should not display just "Confirm Payment," but should clearly list the merchant, amount, and payee. After all, the closer the displayed content is to the actual operation, the more meaningful the user's authorization becomes.
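As a rough illustration of "parsed into clear fields," the sketch below builds a card from a request dictionary and refuses to build one when a key field is missing. The field names and the `ExampleSwap` value in the usage note are assumptions made for this example, not Sigil's actual schema.

```python
from dataclasses import dataclass

# Illustrative field names (assumption), loosely following the article's examples.
REQUIRED_FIELDS = ("protocol", "asset", "amount", "payee")


@dataclass(frozen=True)
class ConfirmationCard:
    protocol: str
    asset: str
    amount: str
    payee: str


def card_from_request(request: dict) -> ConfirmationCard:
    """Build the card from the real request; fail loudly if a key field is absent."""
    missing = [name for name in REQUIRED_FIELDS if not request.get(name)]
    if missing:
        raise ValueError(f"cannot build confirmation card, missing: {', '.join(missing)}")
    return ConfirmationCard(**{name: str(request[name]) for name in REQUIRED_FIELDS})


def render(card: ConfirmationCard) -> str:
    """Plain-text rendering with one labeled line per field."""
    return "\n".join(
        f"{name.capitalize()}: {getattr(card, name)}" for name in REQUIRED_FIELDS
    )
```

For instance, `render(card_from_request({"protocol": "ExampleSwap", "asset": "USDC", "amount": "500", "payee": "0xabc"}))` yields four labeled lines, while a request that carries only a protocol raises `ValueError` instead of producing a vague card.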

Second, the only one who can produce a valid signature is the user, because Sigil uses a Passkey as the secure entry point for approving operations and confirms the user's identity through device biometric recognition. Even if someone gains access to a device logged into Telegram and can see the confirmation message, they cannot complete the approval merely by typing a reply or clicking an ordinary button.

In other words, the Passkey is bound to the user, not to "the person currently holding the phone." Sigil also adopts a design without mnemonic phrases: users do not need to store or enter yet another set of mnemonic words, nor do they need to hand the wallet's private key to the Agent. The power to approve remains with the user's own Passkey and biometric recognition.

Third, Sigil's confirmation page is not an ordinary message drawn up on the fly by the Agent; it is a registered, independent module whose contents are fixed on-chain and rendered in a sandbox environment. This means that, after initiating a sensitive operation, the Agent cannot replace the page, modify the display logic, or forge a look-alike confirmation interface to lure the user into signing.

The party initiating the request no longer also controls the interface that displays it. With single-use signatures, short validity periods, and request parameters bound through hashing, Sigil ensures that the content of the confirmation card corresponds to the request actually pending. This prevents a signature from being reused later and prevents request parameters from being quietly altered after the user approves.

As long as the preview content is inconsistent with the actual request, the operation will be intercepted.
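The combination of hash binding, single use, and short validity can be sketched with the standard library alone. This is a simplified model under stated assumptions, not Sigil's protocol: `ApprovalGate`, its method names, and the 60-second default are hypothetical, and the Passkey signature itself is omitted (in a real system the signature would presumably cover the digest).

```python
import hashlib
import json
import secrets
import time
from typing import Optional


def request_digest(params: dict) -> str:
    """Hash canonical JSON (sorted keys) so any parameter change alters the digest."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ApprovalGate:
    """Tracks approvals that are bound to one request, usable once, and short-lived."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl_seconds = ttl_seconds
        self._pending = {}  # nonce -> (digest of the displayed request, expiry time)

    def record_approval(self, displayed: dict, now: Optional[float] = None) -> str:
        """Called when the user approves the displayed card; returns a one-time nonce."""
        now = time.time() if now is None else now
        nonce = secrets.token_hex(16)
        self._pending[nonce] = (request_digest(displayed), now + self.ttl_seconds)
        return nonce

    def may_execute(self, nonce: str, actual: dict, now: Optional[float] = None) -> bool:
        """Allow execution only if the nonce is fresh, unused, and matches the request."""
        now = time.time() if now is None else now
        entry = self._pending.pop(nonce, None)  # pop makes every approval single-use
        if entry is None:
            return False
        digest, expires_at = entry
        if now > expires_at:
            return False
        return secrets.compare_digest(digest, request_digest(actual))
```

In this model, an approval recorded for `{"amount": "0.5", "asset": "ETH", "payee": "0xabc"}` permits exactly one execution of that same request within the validity window; a changed payee, a second attempt, or an expired window are all rejected.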

Viewed in this context, Sigil is not just a new wallet feature but imToken's first productized exploration of the Sign proposition, focused on a more fundamental question: when the Agent begins to act, how can we ensure it keeps operating within the range the user has permitted?

In the Crypto scenario, this need is particularly intuitive. Future on-chain Agents may help users with recurring investments, yield management, expense payments, position adjustments, and risk monitoring, or even execute operations across multiple protocols automatically based on preset conditions. That makes it all the more necessary to ask whether an Agent's behavior can be stopped immediately when it deviates from the user's expectations.

Meanwhile, the significance of Sigil is not limited to Crypto; currently, whether it's OpenClaw, Hermes, or more Agents operating on personal devices and cloud environments in the future, they are progressively integrating email, instant messaging, calendars, files, browsers, terminals, payment tools, and various online services.

Although these operations do not necessarily occur on a blockchain, the underlying relationship is unchanged: the Agent invokes, in the user's name, a capability that belongs to the user. In the future, therefore, Sigil may also extend from on-chain transactions to data access, identity use, file modification, content publishing, service purchasing, and automated tasks.

This explains why the capabilities the wallet industry has accumulated may gain new value in the AI Agent era. Private key management, digital signatures, identity verification, permission confirmation, and asset security originally served mainly on-chain transactions, yet the more fundamental problem they address has always been how to prove that an action has received legitimate authorization from a particular entity.

As Agents begin to act on behalf of people on a large scale, this capability has the opportunity to extend beyond the Crypto world and become infrastructure for agent identity, automated tasks, and machine permissions.

Thus, as a joint exploration between imToken and OpenClaw, Sigil seeks to bring the experience accumulated by imToken over the past decade in self-custody, wallets, and digital signatures into a new phase as autonomous Agents begin to enter real execution environments.

It does not replace the Agent nor does it replace the wallet.

It stands between the two.

In Conclusion

Overall, AI is making the ability to act increasingly inexpensive.

What used to require users to switch repeatedly between multiple applications, going through searches, filling out forms, confirming and making payments, may in the future be accomplished by a single natural language command, automatically deconstructed and executed by the Agent.

However, "being able to act on behalf of users" and "having obtained effective user authorization" are always two different matters.

What truly determines whether an intelligent system is trustworthy is not just how many tasks it can accomplish, but whether users can always understand it, limit it, and stop it when necessary. From this perspective, Sign is not an unnecessary step that hinders the efficiency of Agents; rather, it may be the most vital layer of trust needed before Agents truly handle assets and real-world services.

Store allows users to possess assets, Send enables free movement of value, Stake facilitates user participation in open networks, while Sign addresses how users can retain the final decision-making power when more machines begin to act on their behalf.

The value of Sigil is also in pushing this seemingly abstract control proposition towards a product that can be verified and continuously improved through real demos.

Let us stay tuned.

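Disclaimer: This article represents only the author's personal views and does not represent the position or views of this platform. It is provided for information sharing only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If an article or image published on this page involves infringement, please send the relevant proof of rights and proof of identity to support@aicoin.com, and the platform's staff will review it.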

