Can QVAC build models powerful enough that users will accept a moderate operational burden in exchange for local, autonomous control?
Written by: Liam Akiba Wright
Translated by: Luffy, Foresight News
Tether's new project QVAC begins with a concept that is quite rare among stablecoin companies. The company describes its QVAC Psy as a series of foundational large models "rooted in the principles of psychohistory."
The concept of psychohistory originates from Isaac Asimov's classic science fiction series "Foundation." In the story, protagonist Hari Seldon uses mathematics, statistics, and social dynamics to predict the behavior trends of large groups, thereby shortening the dark age following the collapse of the Galactic Empire.
The "Encyclopedia of Science Fiction" defines Asimov's psychohistory as a fictional science; Hari Seldon's entire plan aims to predict future events and preserve human knowledge and civilization during the collapse of social systems.
Tether's description essentially packages its corporate mission in the language of science fiction.
With its reserves, liquidity, and distribution capabilities, Tether has created the largest stablecoin system in the crypto industry; now, it is replicating this underlying logic in the field of artificial intelligence.
The USDT stablecoin forms Tether's primary reserve foundation; computing power, AI models, datasets, and intelligence that can operate independently of centralized cloud services are becoming its second major reserve asset.
From Dollar Reserves to Intelligent Asset Reserves
Tether's entry into artificial intelligence continues the operational logic of its core business. USDT transforms global offshore dollar demand into a reserve asset portfolio primarily composed of short-term sovereign bonds.
According to Tether's Q1 2026 reserve attestation, the company's net profit reached $1.04 billion, with a reserve buffer of $8.23 billion, token-related liabilities of about $183 billion, and direct and indirect holdings of short-term U.S. Treasuries of approximately $141 billion.
A robust reserve base gives Tether sustainable revenue, ample balance-sheet capacity, and the ability to channel operating income into long-term infrastructure bets.
CryptoSlate has previously analyzed how Tether, given the sheer scale of its stablecoin, can deploy reserve funds strategically. Its purchase of 8,888 bitcoins in January demonstrated its ability to convert interest income and operating profit into long-term bitcoin allocations. The QVAC project extends this allocation logic into artificial intelligence.
Beyond its existing positions in bitcoin, gold, startups, energy, cryptocurrency mining, and telecommunications infrastructure, Tether is now investing heavily in artificial intelligence itself. The move repositions it from a private issuer of dollar liquidity into a builder of private digital infrastructure.
The science fiction narrative of "psychohistory" aptly aligns with this strategic direction, as Tether views artificial intelligence as a layer of civilization-grade underlying architecture, rather than just a common software track. QVAC's official information positions itself as an "infinitely stable intelligent platform," focusing on decentralized intelligent systems that prioritize local operation, aiming to benchmark against and replace centralized AI.
QVAC's vision holds that routing every intelligent interaction through centralized servers is not only slow and unreliable but also exposes users to control and restriction; QVAC aims to be the foundational base for user-owned intelligent systems at the edge.
This concept echoes Tether's stablecoin philosophy: funds circulate without permission, user data stays under user control, and artificial intelligence runs locally.
Underlying Asimov's science fiction concept is Tether's more serious judgment: only when artificial intelligence possesses infrastructure-level resilience and risk resistance can its value truly solidify.
Although cloud-based large models have stronger integrated capabilities, they inherently carry platform risks, pricing risks, policy regulatory risks, network latency risks, and data routing risks; local AI models may sacrifice some performance but gain ownership, privacy, and sustained stable availability.
This trade-off logic closely mirrors crypto-industry thinking. Self-custody is less convenient than exchange custody, and people only appreciate its value once exchange failures surface; local AI is less easy to use than cloud-hosted models, and its advantages only become apparent when network outages, API changes, account bans, or data-flow restrictions occur.
QVAC: A Unique Edge AI Architecture
The core differentiation of QVAC lies in its underlying architecture. The leading labs, OpenAI, Anthropic, Google DeepMind, and xAI, compete on general capability, coding, multimodal interaction, ultra-long-context reasoning, agent applications, and enterprise cloud deployment.
In contrast, QVAC has chosen a completely different path: deployability, privacy protection, low latency, composability, and independence from any single platform.
QVAC's official introductory documentation defines the project as an open-source, cross-platform ecosystem that prioritizes local operation and peer-to-peer AI applications, compatible with Linux, macOS, Windows, Android, and iOS systems. Users can run large language models, speech recognition, retrieval-augmented generation (RAG), and other AI tasks locally and can leverage built-in P2P functionalities to delegate inference tasks to other device nodes.
This means that QVAC's benchmark standards are entirely different from those of top cloud AI large models: leading AI pursues the strongest general model capabilities that centralized services can provide; QVAC, however, focuses on the location of inference, operational control, whether data is retained locally on devices, and whether applications can continue to operate once centralized services fail.
Tether plans to launch the QVAC software development kit (SDK) in April 2026: a unified suite that lets developers build, run, and fine-tune AI applications on any device, with the same code running across all supported platforms without modification.
The QVAC SDK sits on a unified abstraction layer compatible with multiple local inference engines, including the in-house QVAC Fabric and a fork of llama.cpp, and it integrates tools such as whisper.cpp, Parakeet, and Bergamot for speech and translation.
This already goes beyond a single model release; it resembles a base operating system for artificial intelligence. The open-source AI ecosystem has no shortage of mature components: open model families such as Llama, Qwen, Mistral, Gemma, and DeepSeek, and local tooling such as Hugging Face, llama.cpp, and Ollama, are all flourishing.
QVAC's core bet is that developers urgently need a complete edge framework that integrates the entire process of model loading, inference computations, speech recognition, OCR text-image recognition, translation, text-to-image generation, retrieval-augmented generation, P2P model distribution, delegated inference, and local fine-tuning through a unified interface.
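The pitch, in effect, is one abstraction over many engines. The actual QVAC SDK surface is unreleased, so as a purely hypothetical sketch (all names here are illustrative, not QVAC's API), such a unified interface might look like this:

```python
# Hypothetical sketch of a unified edge-AI abstraction layer.
# Illustrative only; these names do not reflect the unreleased QVAC SDK.
from abc import ABC, abstractmethod

class InferenceEngine(ABC):
    """One interface over interchangeable local backends (LLM, ASR, OCR, ...)."""
    @abstractmethod
    def run(self, task: str, payload: str) -> str: ...

class LocalLLMEngine(InferenceEngine):
    """Stand-in for an on-device llama.cpp-style backend."""
    def run(self, task: str, payload: str) -> str:
        return f"[local:{task}] {payload}"

class PeerDelegatedEngine(InferenceEngine):
    """Stand-in for delegating inference to a peer over a P2P network."""
    def __init__(self, peer_id: str):
        self.peer_id = peer_id
    def run(self, task: str, payload: str) -> str:
        return f"[peer {self.peer_id}:{task}] {payload}"

def infer(engine: InferenceEngine, task: str, payload: str) -> str:
    # Application code is written once against the abstraction;
    # the backend can be swapped without touching call sites.
    return engine.run(task, payload)

print(infer(LocalLLMEngine(), "chat", "hello"))          # [local:chat] hello
print(infer(PeerDelegatedEngine("n1"), "rag", "query"))  # [peer n1:rag] query
```

The design point is that an application written against `infer` does not care whether inference runs on-device or is delegated to a peer, which is the kind of interchangeability QVAC's unified-interface claim implies.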
QVAC aims to become the underlying distribution layer for intelligent compute, using continuously iterated mid-sized local models to capture the entry point to the edge AI ecosystem.
QVAC Fabric is the core of the technical architecture. Tether states that Fabric can fine-tune models on mainstream consumer hardware via Vulkan and Metal backends, covering Android devices with Qualcomm Adreno and ARM Mali GPUs, Apple-silicon devices, and Windows and Linux machines with AMD, Intel, and Nvidia hardware.
It also employs dynamic chunking technology to address mobile device memory limitations and supports GPU-accelerated LoRA fine-tuning processes and masked loss instruction optimization.
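Tether has not published implementation details for Fabric's dynamic chunking, but the general idea behind such techniques is to stream a model in pieces so that peak memory stays near one chunk rather than the full weight set. A minimal, hypothetical sketch (layer sizes and budget are made-up numbers; this is not Fabric's actual mechanism):

```python
# Hypothetical sketch of dynamic chunking: group model layers into chunks
# that fit a device memory budget, so only one chunk is resident at a time.
# Illustrative only; not QVAC Fabric's actual implementation.
from typing import Iterator, List

def chunk_layers(layer_sizes_mb: List[int], budget_mb: int) -> Iterator[List[int]]:
    """Greedily pack consecutive layers into chunks within the memory budget."""
    chunk: List[int] = []
    used = 0
    for size in layer_sizes_mb:
        if size > budget_mb:
            raise ValueError(f"layer of {size} MB exceeds budget of {budget_mb} MB")
        if used + size > budget_mb:  # current chunk is full: emit it, start fresh
            yield chunk
            chunk, used = [], 0
        chunk.append(size)
        used += size
    if chunk:
        yield chunk

# Example: a model split into 36 layer groups of ~75 MB, on a 512 MB budget.
chunks = list(chunk_layers([75] * 36, budget_mb=512))
# Each chunk holds 6 layers (6 * 75 = 450 MB <= 512 MB), giving 6 chunks.
```

A real implementation would also overlap loading the next chunk with computing on the current one; the packing logic above is only the memory-budget half of the problem.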
If this workflow can be verified through external developer testing, its value will far exceed that of ordinary open-source model releases: model weights are just the basic layer, while localized personalized fine-tuning adaptations are the core increments.
MedPsy: QVAC's First Hard Power Test
MedPsy is QVAC's first flagship model. The technical report published on Hugging Face on May 7 describes QVAC MedPsy as a medical-health language model designed specifically for edge deployment, available in 1.7-billion- and 4-billion-parameter versions.
The official claim is striking: a small model trained with strict medical specialization can outperform large medical benchmark models, while running on laptops, high-end mobile devices, and even smartphones.
QVAC reports that MedPsy-1.7B averaged 62.62 across seven closed medical benchmarks, well ahead of Google's MedGemma-1.5-4B-it at 51.20, with less than half the parameters; MedPsy-4B averaged 70.54, slightly ahead of MedGemma-27B-text-it at 69.95, with roughly one-seventh the parameters.
On HealthBench and the harder HealthBench Hard, the gap widens further: MedPsy-4B scored 74.00 and 58.00, versus 65.00 and 42.67 for MedGemma-27B-text-it.
If these scores can be replicated by third parties, it will directly validate QVAC's core idea: lightweight edge models can challenge oversized cloud systems in specific high-value vertical fields.
The training process also reflects QVAC's competitive thinking: MedPsy uses Qwen 3 (Tongyi Qianwen 3) as its backbone, refined through multi-stage supervised fine-tuning and reinforcement learning on medical Q&A; more than 30 million synthetic data points were generated during training, a two-phase curriculum was used, and the Baichuan M3-235B model served as the teacher for long-form reasoning supervision.
Its training corpus has not been made public, which raises a key concern: the outstanding benchmark scores come largely from QVAC's internal evaluations, so potential training-data contamination, coverage, prompt construction, and teacher-model influence all need external validation.
The quantized deployments are a clear strength. Tether has released a GGUF-quantized version compatible with llama.cpp and the QVAC SDK; its Q4_K_M quantization cuts model size by 69% at an average score loss of under 1 point. At this size-performance sweet spot, the 4-billion-parameter model is only 2.72 GB and the 1.7-billion-parameter version just 1.28 GB, easily deployable on local devices.
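As a back-of-envelope sanity check (assuming GB here means 10^9 bytes and an FP16 baseline of 2 bytes per parameter, both our assumptions rather than Tether's stated baseline), the reported file sizes imply roughly 5 to 6 stored bits per parameter, in the range expected for 4-bit K-quant GGUF files, where embeddings and some tensors stay at higher precision:

```python
# Back-of-envelope check on the reported GGUF sizes.
# Assumptions (ours, not Tether's): GB = 1e9 bytes; FP16 baseline = 2 bytes/param.

def bits_per_param(size_gb: float, params_billion: float) -> float:
    """Average storage bits per parameter for a quantized model file."""
    return size_gb * 1e9 * 8 / (params_billion * 1e9)

for name, size_gb, params_b in [("MedPsy-4B", 2.72, 4.0), ("MedPsy-1.7B", 1.28, 1.7)]:
    bpp = bits_per_param(size_gb, params_b)
    reduction = 1 - size_gb / (params_b * 2)  # vs. an FP16 file of 2 bytes/param
    print(f"{name}: {bpp:.2f} bits/param, {reduction:.0%} smaller than FP16")
# MedPsy-4B: 5.44 bits/param, 66% smaller than FP16
# MedPsy-1.7B: 6.02 bits/param, 62% smaller than FP16
```

The implied reduction versus FP16 lands near 62 to 66% under these assumptions; the exact 69% figure will depend on which baseline precision and tensors Tether counted. The smaller model's higher bits-per-parameter is expected, since higher-precision embedding and output tensors take up a larger share of a small model.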
QVAC officials have also made clear risk warnings, stating that MedPsy only supports text interaction, is limited to English use, is not suitable for clinical emergency scenarios, has inherent hallucination issues of large models, and requires developers to ensure user privacy and security within the entire application architecture.
The medical field itself has a strong demand for local inference, and the prospects for MedPsy are promising; however, only when external researchers replicate benchmark scores and test them in real clinical processes can its capabilities be fully validated.
Convenience vs Control: The Ultimate Game in the AI Industry
The debate between local AI and cloud AI is often reduced to privacy versus performance. QVAC reframes it: the real trade-off is between convenience and autonomy of control.
Cloud AI excels at extreme ease of use: users open an app, type a prompt, and receive results, never worrying about model weights, device memory, quantization parameters, vector embeddings, or environment compatibility, because the platform absorbs all of that complexity. This convenience is the core reason centralized AI platforms rose so quickly, giving users top-tier intelligent capabilities at a very low threshold.
In contrast, QVAC asks developers and users to shoulder more operational responsibility in exchange for a different security architecture: local operation, offline availability, a smaller data-leakage surface, freedom from API dependency, and direct peer-to-peer inference and model-distribution channels.
According to Tether's SDK materials, applications powered by QVAC can run reliably on weak networks and continue functioning even fully offline. QVAC's early 2025 announcements go further: AI agents can be deployed directly on local devices, devices can collaborate over a P2P network, and with the WDK suite those agents can autonomously transact in bitcoin and USDT.
This completes Tether's top-level logic: funds, compute, and intelligent agents all follow the same self-sovereign design.
Of course, the decentralization narrative is not flawless. In the sense that users can download models, run them locally, and keep sensitive data on their own devices, QVAC achieves a high degree of decentralization at the inference layer; unlike hosted APIs, no platform mediates every interaction. Built on the Holepunch network stack, QVAC also supports delegated inference, decentralized model distribution, and other P2P primitives, and the architecture shows genuine innovation.
However, there are still centralized attributes on the governance front. QVAC is fully funded by Tether, with the naming coordination, market promotion, flagship applications, model systems, SDK roadmap, and "stable intelligence" concept all being led by a single entity.
That does not conflict with the local-first value proposition; it simply means decentralization's advantages are, for now, confined to the inference layer, where the evidence is strongest. The broader ecosystem still needs to build distributed governance around default bootstrap nodes, release channels, security standards, model entry points, and long-term community stewardship.
Reproduction Tests Will Determine How Far QVAC Goes
For now, QVAC's credibility rests entirely on third-party reproduction. If MedPsy's benchmark scores hold up in external testing, Tether will have genuinely realized its concept of "intelligent asset reserves": lightweight, open-source, locally deployable vertical models that stand shoulder to shoulder with oversized cloud models in highly sensitive domains.
Even if third-party tests narrow or reverse the score gap, the value of QVAC's infrastructure still stands; only the model-performance narrative would weaken. The industry's ultimate question returns to an old law of technology: extreme convenience concentrates power, while autonomy carries operational cost.
This is precisely the value of Asimov's science fiction concept: the psychohistory in "Foundation" studies the evolutionary patterns of complex large systems under stress; while Tether imbues it with new meaning, focusing on how infrastructure can withstand centralized monopolies.
The science fiction narrative is vast, and the technology is still in its early stages, but the overall strategic logic is clear and coherent. Tether is relying on the ongoing cash flow of the world's largest stablecoin to build an AI architecture centered around local operation, peer-to-peer networks, open-source tools, and lightweight edge models, extending the concept of sovereign autonomy of stablecoins from the realm of currency to the realm of intelligence.
The industry no longer questions whether the stablecoin giant has the resources to move into AI. The answer is clear.
The real core question is whether QVAC can create powerful enough models and infrastructure that will encourage users to accept moderate operational thresholds for local autonomous control.
MedPsy is the first quantifiable test. Third-party reproduction will determine whether QVAC's psychohistorical narrative remains a science-fiction metaphor or genuinely enters the mainstream edge AI race as a coherent, operational foundation.