a16z led a $33 million seed round. How is Yupp reshaping the AI evaluation model based on blockchain and incentives?


Original Title: "A16z Leads $33 Million Seed Round, How is Yupp Reshaping AI Evaluation Models Based on Blockchain and Incentives?"

Original Author: ShenZhen, PANews

As AI applications penetrate various industries, accurately assessing model performance and enhancing user trust have become pressing challenges. Traditional evaluations often rely on centralized mechanisms, which struggle to cover diverse scenarios and fail to reflect real user preferences; meanwhile, the issue of model "hallucinations" frequently arises, leaving users trapped in information silos when making choices.

In this context, Yupp, as a new platform, is attempting to reshape the discovery, comparison, and usage of AI models through its unique crowdsourcing model and incentive mechanisms, bringing a paradigm shift to the AI evaluation field. This article will delve into Yupp's core mechanisms, technological highlights, team background, and its potential impact on the AI ecosystem.

Team Background and Funding: Backed by Tech Giants' Experience

Yupp is focused on addressing the long-standing evaluation challenges in the AI field. It aims to build a "trustless" AI feedback market in which diverse user feedback flows freely under the protection of blockchain and crypto-economic incentives, forming a scalable, fair, and transparent model evaluation layer. By rewarding users for high-quality human-annotated data, Yupp can promptly capture real user needs and preferences across different scenarios, helping AI developers improve model performance iteratively.

The project was founded in June 2024 by Pankaj Gupta (Co-founder and CEO) and Gilad Mishne (Co-founder and Head of AI), with Chief Scientist Jimmy Lin (Professor at the University of Waterloo) also part of the core team. The three had previously worked together at Twitter in 2010, where they built and optimized large-scale recommendation and search systems, later accumulating rich experience at Google and Coinbase.

Due to its vision of decentralization and data value transparency, which aligns with AI vendors' dual demands for trustworthy evaluations and user participation, along with the core team's extensive backgrounds, Yupp has gained high recognition from notable figures in the tech industry and top venture capitalists.

Last week, Yupp announced the completion of a $33 million seed round led by a16z partner Chris Dixon. Other investors include Google Chief Scientist Jeff Dean, Twitter co-founder Biz Stone, Pinterest co-founder Evan Sharp, Perplexity CEO Aravind Srinivas, Stanford's Dan Boneh, Chris Ré, Nick McKeown, and Balaji Prabhakar, Coinbase Ventures, and 45 other well-known angels and corporate executives.

Core Features and User Experience: Building the "AI Parliament"

As a one-stop AI evaluation platform, Yupp adheres to the philosophy of "Every AI for everyone," allowing users to easily discover, compare, and use the latest AI models. Unlike conventional chat interfaces that return a single response, Yupp returns answers from two (or more) models for each prompt, forming an "AI Parliament." This design not only gives users a choice of answers but also helps expose potential "hallucinations," since users can compare the responses before deciding which to trust. As Yupp CEO Pankaj Gupta stated, side-by-side outputs are particularly beneficial for users concerned about generative errors, as they can cross-verify results.

The platform currently supports over 500 AI models spanning text and image generation, including well-known models like ChatGPT, Claude, Gemini, DeepSeek, Grok, and Llama, as well as many emerging models. To further optimize the experience, Yupp has also launched the "QuickTake" feature, which distills lengthy responses into concise, tweet-length summaries.

Additionally, Yupp places a high emphasis on user privacy: all chat records are private by default unless users choose to make them public; even when shared publicly, no personal information is disclosed. Users can control the content and scope of their sharing at any time.

Economic Model and Incentive Mechanism: Valuing Data Labor

Yupp combines free usage with user feedback through a "Yupp Points" system to measure model usage. New users receive 5,000 points upon registration and can earn more points by rating model responses, selecting preferences, and explaining their reasons. The higher the quality of feedback, the greater the rewards, ensuring users can sustainably use high-end models like Claude Opus 4 or OpenAI o3 for free. The platform promises that points will only increase and that all current models can be experienced for free.

After each question, users receive two model responses and can earn "digital scratch cards" through feedback, each worth between 0 and 250 Yupp points. Every 1,000 points can be exchanged for $1, with a maximum daily withdrawal of $10 and a monthly cap of $50. Points can be exchanged for over 20 currencies, including USD and EUR, through partners like Stripe, PayPal, and Coinbase. The platform also integrates stablecoins on Base (an Ethereum L2) and Solana to provide instant, fee-free rewards to users worldwide.
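The conversion figures above can be turned into a small worked example. This is only a sketch, not Yupp's implementation: the function name `cash_out_usd` is hypothetical, and it assumes both caps apply to the dollar value withdrawn. Only the rate (1,000 points per $1) and the $10 daily and $50 monthly caps come from the figures quoted in this article.

```python
# Figures quoted in the article.
POINTS_PER_USD = 1_000
DAILY_CAP_USD = 10.0
MONTHLY_CAP_USD = 50.0


def cash_out_usd(points: int, withdrawn_this_month_usd: float = 0.0) -> float:
    """Return the USD amount that could be withdrawn today (hypothetical helper).

    Assumes the daily and monthly caps both limit the dollar value withdrawn.
    """
    if points < 0 or withdrawn_this_month_usd < 0:
        raise ValueError("points and prior withdrawals must be non-negative")
    value = points / POINTS_PER_USD
    # Whatever is left of the monthly cap after earlier withdrawals.
    monthly_room = max(0.0, MONTHLY_CAP_USD - withdrawn_this_month_usd)
    return min(value, DAILY_CAP_USD, monthly_room)


print(cash_out_usd(5_000))         # the 5,000 sign-up points
print(cash_out_usd(25_000))        # limited by the daily cap
print(cash_out_usd(25_000, 45.0))  # limited by the remaining monthly cap
```

Under these assumptions, the 5,000 sign-up points are worth $5, a single scratch card is worth at most $0.25, and reaching the monthly cap takes at least five days of maximum withdrawals.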

As Pankaj Gupta stated, the high-quality feedback generated by users is far more valuable for AI companies' model fine-tuning and reinforcement learning than the rewards themselves. While users' monthly earnings may only amount to a few cups of coffee, this paid annotated data is crucial for AI iteration.

To encourage more participation, Yupp has also established a referral reward system: referrers earn 5,000 points, and referred users receive 1,000 points; currently, new registered users can receive 5,000 points, with referred users getting an additional 2,500 points.

Yupp VIBE Score: A New Paradigm for AI Evaluation

To address issues of insufficient transparency, fairness, and uneven access to evaluation data in existing rankings, Yupp has launched a beta version of the AI leaderboard and the "Yupp VIBE (Vibe Intelligence Benchmark) Score" system. This system aggregates preference data generated by global users in natural interactions, striving to provide robust and reliable evaluation results.

Yupp's evaluation principles include:

· Robustness: Ensuring representativeness (covering diverse scenarios), authenticity (reflecting user concerns), and resistance to cheating (withstanding malicious behavior);

· Trustworthiness: Fair and neutral (impartial to models), transparent and public (detailed disclosure of ranking algorithms), and rigorous and scientific (adhering to evaluation standards).

The platform not only collects binary preferences but also encourages users to point out the strengths and weaknesses of responses (such as "to the point," "fast," "good style," etc.), and conducts cluster analysis based on users' age, education, occupation, and other information to showcase preference differences among different groups.
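A per-group breakdown of preferences can be illustrated with a simple tally. This sketch uses a plain group-by count in place of the cluster analysis mentioned above, whose actual method is not described in this article; the function name, group labels, and model names are all hypothetical.

```python
from collections import Counter, defaultdict


def preference_share_by_group(votes):
    """Return, for each user group, the share of votes each model received.

    `votes` is an iterable of (group, preferred_model) pairs.
    This is a simple tally, not a clustering algorithm.
    """
    counts = defaultdict(Counter)
    for group, model in votes:
        counts[group][model] += 1
    return {
        group: {model: n / sum(tally.values()) for model, n in tally.items()}
        for group, tally in counts.items()
    }


# Hypothetical votes: (user group, model the user preferred).
votes = [
    ("student", "model-a"),
    ("student", "model-a"),
    ("student", "model-b"),
    ("engineer", "model-b"),
]
print(preference_share_by_group(votes))
```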

On the technical side, Yupp is exploring the use of blockchain, cryptographic primitives, and zero-knowledge proofs to ensure the fairness, transparency, and verifiability of the evaluation process. At the same time, the platform has partnered with professional AI data providers to calibrate scorers through profile verification and multi-layer quality checks, eliminating malicious data.

The recently updated leaderboard shows VIBE scores alongside win rates, dislike rates, speed, latency, context window, and cost metrics for models like GPT-4.5 Preview, Claude Opus 4, and Claude Sonnet 4.
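Of the leaderboard metrics listed above, the win rate is the simplest to derive from side-by-side preference data. The sketch below shows one common way to compute it and is not the VIBE score itself, whose formula is not given in this article. The function name and model names are hypothetical, and each comparison is assumed to have exactly one winner and one loser.

```python
from collections import defaultdict


def win_rates(comparisons):
    """Return each model's win rate from pairwise preference results.

    `comparisons` is an iterable of (winner, loser) model-name pairs.
    Win rate = comparisons won / comparisons the model took part in.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return {model: wins[model] / games[model] for model in games}


# Hypothetical side-by-side results: (preferred model, other model).
comparisons = [
    ("model-a", "model-b"),
    ("model-a", "model-c"),
    ("model-b", "model-c"),
    ("model-b", "model-a"),
]
print(win_rates(comparisons))
```

A raw win rate ignores how strong each opponent was, which is one reason a leaderboard might report it next to a separate aggregate score rather than rank on it alone.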

Development History and Future Outlook

Yupp officially launched on June 13, 2025, after six months of internal testing. Since its launch, the product has continued to iterate:

· Multimodal Support: Integrating models like DALL-E, Flux, Stable Diffusion, Luma Photon, and Google Imagen 4, and supporting user-uploaded image/PDF queries;

· Expanded Interaction Methods: Adding voice input and voice reading features;

· Model Updates: Gradually introducing DeepSeek R1/V3, Mistral Small 3, OpenAI o3-pro, Hermes 3, Amazon Nova Pro v1, Microsoft Phi series, and "MAX model" categories;

· Real-time Information: Routing online query requests to Perplexity and Google Gemini Live, with accompanying hyperlink citations;

· Payment Upgrades: Adding support for US PayPal, Venmo withdrawals, and 24 currencies via PayPal;

· Share and Export: Supporting format-preserving copy, PDF/text/Markdown export, and sharing single responses or entire conversations on demand;

· Community Activities: Hosting events like the "AI Prompt Challenge," with prizes of up to tens of thousands of points; adding personal profile pages, AI auto-generated chat names, and other features.

Yupp's mission is to "empower humanity to shape the future of AI." Pankaj Gupta believes that the development of AI requires everyone's participation and contribution. Through multi-perspective AI responses and user feedback, Yupp not only helps users make better decisions but also provides a continuous driving force for AI evolution.

It is worth mentioning that one of Yupp's main competitors is the open AI model evaluation platform LMArena (website: https://lmarena.ai/), which is very popular among AI professionals. LMArena is still exploring commercialization, however, and does not offer users direct material rewards or blockchain-based point incentives for participating.

Overall, Yupp has opened a new path for AI evaluation with its crowdsourcing model, incentive mechanisms, and user preference-driven evaluation system. It not only provides users with free and diverse AI interaction experiences but also transforms user feedback into high-value training data, driving continuous model optimization. With an experienced team and top-tier capital backing, Yupp is poised to play a key role in the future AI ecosystem, realizing the vision of "AI for everyone, shaped by everyone."

However, Yupp has only just launched. As participation scales, it will need to keep working on how to ensure data quality, resist potential cheating, and balance commercialization against user incentives.
