A conversation with the founder of the AI recruitment platform Mercor: How does AI evaluate people in recruitment, and what will humans be able to do in five years?

AI will soon dominate the talent assessment process, improving efficiency and accuracy, while humans will be more involved in the "sales" aspect.

Author: MD

Produced by: Mingliang Company

Recently, two partners at the American venture capital firm Redpoint, Jacob Effron and Patrick Chase, sat down on their podcast "Unsupervised Learning" with Brendan Foody, founder and CEO of the AI recruitment platform Mercor. Beyond the changes in Mercor's core AI recruitment business, the three also explored the future relationship between AI and humans in the workplace.

Mercor was founded in 2023 by three 21-year-old Thiel Fellows, including Brendan Foody. In February of this year, the company announced a $100 million Series B round at a $2 billion valuation, led by Felicis with follow-on investment from Benchmark, General Catalyst, and DST Global. Mercor aims to improve recruitment efficiency and reduce human bias by using AI to automate resume screening, candidate matching, interviews, and compensation management.

In the interview, Brendan Foody mentioned that Mercor has in fact moved into AI model evaluation and data annotation. As AI models grow more capable, many complex problems can no longer be verified by the models themselves or by common knowledge, and instead require highly knowledgeable professionals in specialized fields. Such work, however, rarely takes the form of a long-term position, which fits the "expert network" model and makes it "natural" for the platform to source this talent for numerous AI laboratories. As Foody put it, "The data annotation market is shifting from large-scale, low-barrier crowdsourcing to high-quality, expert-level annotation."

On its core business of "AI recruitment," Brendan Foody believes AI is approaching, and in some cases surpassing, humans at assessing talent through text, especially in scenarios like resume screening and interview-transcript analysis, but it still falls short on multimodal tasks (such as judging emotion and atmosphere).

Brendan Foody also made a broader point: as recruitment and talent assessment come to rely on richer contextual data, the completeness of feedback mechanisms and data inputs will directly determine how well the model assesses candidates. For example, when hiring an investor, feeding their remarks from podcasts, day-to-day meeting notes, and similar data into the model as context clearly helps the model judge the candidate's thinking, abilities, and job preferences. In traditional recruitment, such data is either ignored entirely or costly to gather, whereas AI does it at lower cost and higher efficiency.
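To make that concrete, below is a minimal sketch of what assembling such context and asking a model for an assessment might look like. The OpenAI client call is real, but the source documents, model name, and rubric are illustrative assumptions, not Mercor's actual pipeline.

```python
# Minimal sketch: gather heterogeneous candidate materials into one context
# and ask an LLM to assess them. Filenames and rubric are hypothetical.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

candidate_docs = {
    "podcast_transcript": open("podcast_transcript.txt").read(),
    "meeting_notes": open("meeting_notes.txt").read(),
    "resume": open("resume.txt").read(),
}

context = "\n\n".join(f"## {name}\n{text}" for name, text in candidate_docs.items())

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "system", "content": (
            "You assess candidates for an investor role. Using only the "
            "provided context, rate the candidate's thinking, abilities, and "
            "job preferences, citing the passages that support each rating."
        )},
        {"role": "user", "content": context},
    ],
)
print(response.choices[0].message.content)
```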

Therefore, the division of labor between AI and humans may evolve, with AI soon dominating the talent assessment process, improving efficiency and accuracy—while humans will be more involved in the "sales" aspect, such as communicating job atmosphere, incentives, etc., enhancing the candidate experience.

"I see the trend that in the future, humans will focus on creating assessments that allow models to learn things they cannot yet do, rather than repeatedly performing the same tasks," Brendan Foody said.

The following is the interview transcript, compiled and edited by "Mingliang Company":

Jacob: Brendan Foody is the co-founder and CEO of Mercor, a company building infrastructure for the AI-native labor market. The Mercor platform has been used for data annotation, talent screening, performance prediction, and assessing both human and AI candidates. It is a very interesting company, sitting at the intersection of recruitment assessment and improving foundation models.

Brendan's team recently raised $100 million, and they are working with some of the most advanced AI companies. Our conversation today covers many interesting topics, including the future role of humans in the workforce. We discussed which types of data annotation matter most for model improvement, Brendan reflected on Mercor's rapid rise and some key decisions he made, and we talked about where AI is and isn't effective in the recruitment process. Overall, it was a very engaging conversation, and I believe you will enjoy it. Thank you, Brendan Foody, for joining our podcast.

Brendan: Thank you very much for the invitation. I am a big fan and very excited to be here.

Jacob: I'm glad you could come. I think we can start from the top. For our listeners, could you outline where we are right now? What is the current state of AI talent assessment? What works, what doesn't, and what progress has been made?

Brendan: I am surprised by how well it performs. I believe that for anything humans can assess through text, whether interview transcripts, written assessments, or signals on resumes, models are close to surpassing humans. That creates an interesting dichotomy, because the technology is still distributed very sparsely across the economy. So there is a large gap here, and closing it is one of the things we are most excited to build.

Jacob: Are there things that were unworkable before the emergence of reasoning models? For example, in the past six months, as these models have improved, what has finally started to work?

Brendan: Yes, I remember when GPT-4 was released, we built the first prototype of an AI interviewer, and it didn't work. The model would hallucinate or hit other issues every two or three questions. It has been a steady climb since then. I think the emergence of reasoning models has clearly strengthened models' capabilities, especially in processing large amounts of context, judging key points, and focusing attention.

However, models are still weaker on multimodal tasks, because labs have not focused much on them in the past and they are harder to train with reinforcement learning, but we are looking forward to progress in this area.

Jacob: What milestone functions are you most looking forward to the model achieving?

Brendan: There are some things humans excel at, such as judging "vibe": whether I would want to work with this person, whether this person is passionate, whether they are sincere. These are difficult for models; they are challenging even for the best humans. So I am looking forward to breakthroughs in this area and am developing assessments for it. But whenever I read the model's reasoning chain to interpret what we are assessing, I always feel the model is far more rational than the researchers on our team who create the assessments.

So the progress of the model is really rapid; everyone can see their performance in the coding field, but we are actually just getting started, and many other fields are also taking off at an astonishing speed.

Jacob: A large part of what you do is designing assessments for humans, to see whether they can perform a job. Many people are now deploying AI employees, such as AI agents that take over employees' tasks. Are you involved in that area?

Brendan: Of course, we have done a lot there. To give some background: we founded the company because we felt there are many talented people around the world who lack opportunities, mainly because the labor market is highly fragmented. Remote candidates apply to only a few positions, while companies in San Francisco consider only a tiny pool of candidates, because matching has to be solved manually. By applying large models, we can solve the matching problem and build a unified global labor market where every candidate can apply and every company can hire. Later we found that with the emergence of new knowledge-work roles, demand for people surged, especially for talent to evaluate large models. So now we recruit all kinds of experts for top AI laboratories. These laboratories use our technology not only to create assessments for evaluating experts but also to create assessments for the models and the AI agents you mentioned.

Patrick: For our listeners, Mercor also uses a lot of AI for screening candidates, processing resumes, and other scenarios. Can you introduce some of your AI use cases? What does your current tech stack look like?

Brendan: A good approach is to create assessments for everything humans do manually and see whether we can automate it: how humans review resumes, conduct interviews, rank candidates, and decide whom to hire. We automate all of these steps and score each one, such as the accuracy of our resume parsing, of individual resume sections, of the interview questions, and of the interview evaluations, then feed all of it, along with recommendation letters and other data, into the model's context to make a final hiring prediction.
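As a rough illustration of that staged design, here is a sketch in which each formerly manual step becomes an automated, individually scored stage whose outputs all flow into one final prediction context. The stage names mirror the steps Foody lists; the function bodies are stubs, and none of this is Mercor's actual code.

```python
# Sketch of a staged hiring pipeline: every manual step becomes an automated
# stage with its own accuracy score, and all artifacts feed the final model.
from dataclasses import dataclass, field

@dataclass
class CandidateRecord:
    resume_text: str
    parsed_resume: dict = field(default_factory=dict)
    interview_transcript: str = ""
    stage_scores: dict = field(default_factory=dict)

def parse_resume(record: CandidateRecord) -> None:
    # In a real system this would be an LLM extraction call whose accuracy
    # is itself measured against a labeled evaluation set.
    record.parsed_resume = {"skills": [], "experience": []}
    record.stage_scores["resume_parsing"] = 0.0  # placeholder accuracy

def run_ai_interview(record: CandidateRecord) -> None:
    # Placeholder for an AI interviewer; it would fill in the transcript
    # and a question-quality score.
    record.stage_scores["interview"] = 0.0

def predict_hire(record: CandidateRecord, references: list[str]) -> float:
    # Final stage: everything gathered so far, plus recommendation letters,
    # goes into one model context that returns a hire probability.
    full_context = {
        "parsed_resume": record.parsed_resume,
        "transcript": record.interview_transcript,
        "references": references,
        "stage_scores": record.stage_scores,
    }
    return 0.5  # stub: a post-trained model would score full_context
```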

Patrick: So you mainly use off-the-shelf models, and you handle the assessment and context design yourselves?

Brendan: Yes, we use many off-the-shelf models for foundational tasks, but in the most challenging final evaluation stage of candidates, we do post-training. We learn from client data, such as which individuals perform well and why, learning from these signals to make better future hiring predictions.
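One way to picture the post-training step he describes: client outcome data, such as which hires performed well and why, can be converted into supervised fine-tuning examples. The record fields below are hypothetical, and the JSONL chat format is just the common shape used by fine-tuning APIs, not a confirmed detail of Mercor's process.

```python
# Sketch: turn client outcome labels into fine-tuning examples (JSONL).
import json

outcome_records = [
    {
        "candidate_context": "<full assembled candidate context>",
        "performed_well": True,
        "reason": "ramped quickly; strong peer feedback",
    },
]

with open("post_training.jsonl", "w") as f:
    for rec in outcome_records:
        example = {"messages": [
            {"role": "system",
             "content": "Predict whether this candidate will perform well."},
            {"role": "user", "content": rec["candidate_context"]},
            {"role": "assistant", "content": json.dumps({
                "performed_well": rec["performed_well"],
                "reason": rec["reason"],
            })},
        ]}
        f.write(json.dumps(example) + "\n")
```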

Patrick: What unexpected signals have you learned? For example, what has AI discovered that humans did not think of?

Brendan: There are many such examples. I believe one of AI's key advantages is its ability to analyze all the details of candidates more deeply, discovering small signals that humans sometimes overlook, while humans may have already made decisions based on "vibe judgment." For example, if someone shows great interest in a particular field purely out of interest rather than job necessity, this can become a signal. Or if someone has studied in the target country, they may communicate more smoothly and be better suited for the team environment. These small details vary by project and client.

Patrick: What do you think are the things that definitely need to be done by humans? You just mentioned multimodal tasks, but how do you see the collaboration between AI and human interviewers? Will it all be AI assessments in the future?

Brendan: Simply put, the recruitment process is divided into assessment and sales. The assessment phase will soon become very powerful, and everyone will find that AI recommendations are significantly more accurate, and people will be more willing to trust AI results. Humans will continue to play a significant role in the sales aspect, such as helping candidates understand the team, the position, the atmosphere, etc. AI allows recruiters and HR to focus only on the candidates they truly want, without wasting time interviewing unsuitable candidates. This enables them to better assist candidates in understanding the position, the team, and the incentives.

Patrick: Do you think people will start "gaming the system"—intentionally catering to assessment signals? Have you encountered this? For example, everyone claims to have studied in the target country.

Jacob: Everyone claims to have studied in the target country.

Patrick: Yes, for example, everyone says they studied in the recruitment location.

Brendan: Yes, so sometimes we need to keep signals confidential. Like any large recruitment process, we run into this often. The key is to make assessments dynamic enough, such as rotating questions frequently or asking very deep questions grounded in the candidate's background. Because the model can prepare for an interview far more thoroughly than a human interviewer can, the depth and breadth of talent assessment become unprecedented.

For example, when I first interview an executive candidate, I might only look at a few minutes of their LinkedIn profile and some notes, but if I could listen to the podcasts they’ve been on, read their blogs or papers, and then ask questions based on those, the depth and detail would be completely different.
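A minimal sketch of that kind of background-grounded question generation follows; the prompt wording and model name are illustrative assumptions rather than Mercor's production setup.

```python
# Sketch: generate interview questions tailored to a candidate's own
# materials, so answers are hard to rehearse or game.
from openai import OpenAI

client = OpenAI()

def generate_questions(background: str, n: int = 5) -> str:
    """Return n deep, background-specific interview questions."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": (
                "Write interview questions that probe deeply into the "
                "candidate's specific past work, citing details from the "
                "background. Avoid generic questions that could be rehearsed."
            )},
            {"role": "user", "content": (
                f"Candidate background:\n{background}\n\nWrite {n} questions."
            )},
        ],
    )
    return response.choices[0].message.content

# Example: background assembled from podcasts, blogs, and papers.
print(generate_questions(open("candidate_background.txt").read()))
```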

Jacob: Your model is very good at predicting candidate performance. Does this process require interpretability, or is it sufficient for the model to provide conclusions as a black box?

Brendan: I think interpretability is important for two reasons. First, it helps clients understand and trust the model's conclusions, establishing trust and a reasoning chain. Second, it ensures that the model selects candidates based on the right reasons. So, interpretability is very valuable.

But I believe the ultimate economic form may be API-based: someone needs a task done, or needs some amount of human involvement, and as long as there is a confidence interval predicting whether a given person can perform the work, the human intermediary role in the process will shrink dramatically.
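To illustrate what such an API might return, here is a sketch of a prediction object with a confidence interval; every field name is a hypothetical illustration, not a description of any real Mercor endpoint.

```python
# Sketch: a performance prediction with a confidence interval, where a task
# is routed to a person only if even the interval's lower bound clears a bar.
from dataclasses import dataclass

@dataclass
class PerformancePrediction:
    candidate_id: str
    predicted_success: float  # point estimate in [0, 1]
    ci_low: float             # lower bound of the confidence interval
    ci_high: float            # upper bound
    rationale: str            # interpretable reasoning chain, for trust

def can_perform(pred: PerformancePrediction, threshold: float = 0.8) -> bool:
    # Act without a human intermediary only when the prediction is confident.
    return pred.ci_low >= threshold

pred = PerformancePrediction("c-001", 0.91, 0.84, 0.96,
                             "Strong track record on comparable tasks.")
print(can_perform(pred))  # True: lower bound 0.84 clears the 0.8 bar
```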

Jacob: This is a trust milestone on the way to that goal, which makes a lot of sense. Currently, there is a clear feedback loop in the data annotation process, such as multiple people annotating the same data. How do you see the challenge of applying this method to more ambiguous human work areas? You might have to wait 15 years to get feedback.

Patrick: Like VC (laughs).

Brendan: One of my views is that if 100 people are doing the same job, it’s easy to rank them. But if the work done by those 100 people is all different, like founders, where each person's work is very different, it becomes difficult to find commonalities and judge which behaviors or information are related to outcomes. Because there are too many variables. So for large-scale homogeneous positions, like hiring 20 account managers, the model can learn signals from that and optimize. But for complex positions, like when we assess a group of Thiel Fellows, this situation is more challenging and relies more on the model's reasoning ability.

Jacob: What specific challenges are there?

Brendan: The main challenge is that a lot of information doesn’t get into the model's context, so the model cannot learn, and people often forget to provide additional information. For example, if I hear from a friend that a certain company's product is great, that information isn’t input into the model. Ensuring that all recommendation letters and interpersonal details are input is a major challenge. We found that as long as the necessary data is input into the model's context, most problems are solved.

Jacob: Maybe in the future, each of us will have smart glasses recording and inputting information into the model at any time.

Brendan: Right.

Jacob: Will it reach the level of Bridgewater Associates?

Brendan: Maybe. But many companies will resist this approach, unwilling to do so for legal and compliance reasons. However, I believe there will be better processes that allow the model to better acquire context. For example, AI conducting exit interviews, interviewing managers and team members to understand more details. People have a lot of detailed information in their heads; we just need to input that information into the model to make predictions that surpass human capabilities.

Patrick: More and more founders and various people are bringing AI to meetings, so many meetings and communications are being recorded for AI to learn. Very interesting.

Jacob: We can transcribe our meetings and have AI score and rank us.

Patrick: Haha!

Jacob: The premise is that I can rank at the top.

Patrick: What do you think of the current data annotation market? How do different players differentiate themselves? ScaleAI seems to be far ahead, but now there are many new players. What do you think of this landscape?

Brendan: I think most people do not understand the key changes in the data annotation and evaluation market. The market is completely different from two years ago. Previously, the models were not good enough; they could easily be stumped and often made mistakes. High school or college students could do a lot of the annotation or evaluation through crowdsourcing for large-scale data collection, such as SFT (supervised fine-tuning) and RLHF (reinforcement learning from human feedback), choosing between different preference options.
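For readers unfamiliar with those two data types, the shapes below are representative, generic examples: SFT pairs a prompt with a demonstration answer, while RLHF preference data pairs a prompt with a chosen and a rejected response.

```python
# Generic shapes of the two crowdsourced data types mentioned above.
sft_example = {
    "prompt": "Explain photosynthesis to a 10-year-old.",
    "completion": "Plants make their own food using sunlight, water, and air...",
}

rlhf_preference_example = {
    "prompt": "Explain photosynthesis to a 10-year-old.",
    "chosen": "Plants use sunlight to turn water and air into food...",
    "rejected": "Photosynthesis is the process by which autotrophs convert...",
}
```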

But as the models have become very strong, the crowdsourcing model has failed, because you need high-quality talent to work directly with researchers to help them understand why the model performs well or poorly, designing complex data to challenge the model and reflect the real-world problems that need to be automated. Our platform can quickly recruit these high-quality talents.

This has allowed us to grow rapidly and collaborate with large laboratories. I believe this trend will continue. Those companies still stuck in large-scale crowdsourcing will encounter many troubles, while new players will focus on high-quality talent and continue to capture market share.

Patrick: Do you think the need for humans in the data annotation process will always exist? As models become stronger, even able to train smaller models, how do you see the evolution of the future?

Brendan: My view is that as long as there are things that humans can do in the economy that models cannot yet do, we need to create or simulate environments for models to learn. So some areas will be quickly conquered, like mathematics or coding, where the data volume is small and easy to verify, and models can solve them quickly. But some areas are very open, like assessing good founders or many knowledge-based jobs, which are essentially open-ended problems that are difficult to verify what is good, requiring human understanding to be input into the model. This is why I expect human data and the evaluation market to see an order-of-magnitude increase.

Jacob: If I understand correctly, your initial "arbitrage point" and company inspiration was that there are excellent programmers around the world, but they cannot access certain job opportunities, which is very important for programming data. You have clearly expanded into other areas, such as programming itself being a perfect use case for reinforcement learning and evaluation. What changes or improvements do you need to make when entering these more ambiguous areas and recruiting related talent?

Brendan: I think it's good practice to borrow the heuristics humans already use manually. For example, if you want to automate the work of consultants, how do you evaluate consultants today? You give them case studies, perhaps tailored to their background.

Jacob: The people on your team might be very good at evaluating programmers, but if you want to bring doctors onto the platform, how do you know what heuristics to use to evaluate doctors?

Brendan: The point you mentioned is very interesting; when entering areas beyond the capabilities of machine learning teams, experts are needed. We need doctors to help us design assessments and evaluation standards for doctors, and the same goes for other fields. Similarly, this is also something researchers need to do. For example, it’s easy to judge which answer is correct for high school physics problems, but if it’s a PhD-level chemistry problem, researchers without relevant qualifications find it difficult to understand and improve the evaluation. So this is also one of the major changes in evaluation you asked about—whether assessing talent or researchers evaluating models, it will become a more collaborative process that requires working with experts to help the model progress.

Jacob: I’ve heard you say that this short-term data annotation contract work is actually the perfect entry point for your initial market, with huge demand, serving as a wedge into the end-to-end labor market. Can you talk about the path and phased goals for the company to achieve this vision?

Brendan: I wrote a "Secret Master Plan" that discusses this. My view is that the market network effect is very strong, which makes it both a moat and difficult to establish. So right now, we are very focused on capturing huge demand, expanding the network effect, and developing the market.

At the same time, we also see many large tech company clients needing a large number of contract workers, such as hundreds of data scientists, software engineers, etc. Although these positions are not directly related to human data, the essence of the demand is similar; it’s just a more traditional market, previously competing with companies like Accenture and Deloitte. We will take this as a second focus and then expand into full-time recruitment. But actually, what our company did early on was help friends and ourselves recruit contract workers, many of whom later became full-time employees.

So these businesses are continuous and have many commonalities. All companies want more candidates, faster hiring speeds, and higher confidence in competence. As long as we continuously measure and improve these metrics, we can serve every stage of company development well.

Jacob: Was there a moment that made you decide to shift to the human data field, feeling that the opportunity was particularly obvious?

Brendan: Yes, I encountered it while I was still in college. The company background is that I met my partners when we were 14 in high school, and we all started a business together at 18. They won many competitions, and I wasn’t as good as them, but I was always entrepreneurial. Later, we started recruiting international talent in India, such as collaborating with the IIT Code Club, and found that many smart people couldn’t find jobs. We thought we could hire them for projects, and friends were willing to pay us to help with recruitment. We earned small service fees and grew the company to a million dollars in revenue, making $80,000 after deducting salaries.

I was very proud, but my parents were still not satisfied. It wasn’t until we raised funds that they were satisfied. Back to your question, in August 2023, a client introduced us to the co-founder of x.ai when they were still in the Tesla office. He said Mercor had super engineers from India who excelled in math and programming. The next day, the founder of x.ai had a call with us, very excited. Two days later, we were in the Tesla office, meeting almost the entire founding team of x.ai, except for Elon, just before they had a meeting with him. We were still in college; it was incredible. We were all wondering why they wanted our product so much. Because the market was changing so fast, no one realized it. Now that we have grown and captured key market share, we are just starting to talk about these things publicly. But at that time, they weren’t ready to use human data; it was about six months later that we started collaborating with leading laboratories to scale the business.

Jacob: You saw the wave coming.

Brendan: Yes, I find that many founders are too demanding when looking for PMF; they should actually observe market signals and dig where there are gold mines. If initial sales are very difficult to achieve, scaling later will be even harder. You need to find the most painful, wealthiest customers who are willing to pay anything to solve their problems and then go all in.

Jacob: You have now surpassed programming; for example, the case of doctors makes me think that the standards for assessing good doctors will ultimately be used by model companies to train models to judge whether a doctor's reasoning process is correct. What do you specifically do when collaborating with clients?

Brendan: One key point where humans are currently stronger than AI is the ability to continuously learn and improve. We look for these proxy signals, such as candidates asking the right questions, having the right thought processes, and having experiences in high-performance environments in their backgrounds, which can help them identify the model's weaknesses and enhance its capabilities.

Jacob: Are you using your own products now? How specifically do you apply them in recruitment?

Brendan: Of course, we use them for all positions except executive roles. We post executive roles too, but I usually do the first interview myself, mainly to sell the position rather than to screen. Our AI interviews are very effective and often provide the most predictive signals. Many people underestimate the "vibe judgment" bias in the recruitment process; people always think their own judgments are accurate.

Jacob: Recruitment is actually one of the earliest "vibe" industries.

Patrick: VCs definitely don't have this bias.

Brendan: So we use performance data to make decisions. For example, when we hire strategic project leaders, previously humans did case analyses, but now we conduct all interviews with AI, and the final conversion rate has even improved. AI interviews allow for more objective and standardized comparisons, eliminating the need for different interviewers to operate independently.

Patrick: In the evaluation phase, do you find people yourself, or do you use people from the market? Is a lot done internally?

Brendan: We use people from the marketplace to run our own evaluation processes, which look much like our clients'. Of course, researchers still participate, analyzing why the model errs, refining error taxonomies, and improving training data, with much the same process and staffing as on client projects.

Jacob: You mentioned using multimodal capabilities to assess traits like passion. What are your thoughts on future video and audio applications?

Brendan: I often think about the role of reinforcement learning (RL) in improving video understanding. RL excels at search problems, and video contains a vast amount of information that is challenging for models to process. We need to figure out how to find the key signals in a multimodal context, such as whether a candidate is excited or whether they are cheating, and create suitable data that lets the model focus on those signals. The leading laboratories are also working on these foundational capabilities.

Jacob: As you said, the annotation market has changed dramatically in just a few years. What do you think it will look like in two years? Will this business still exist, or will it only be left to experts?

Brendan: I believe it will be a very important area. The original intention of our startup was to aggregate labor to make labor allocation more efficient. The key is to determine the role of humans in the economy five years from now.

The trend I see is that in the future, humans will focus on creating evaluations that allow models to learn things they cannot yet do, rather than repeatedly performing the same tasks. Therefore, I am very optimistic about knowledge work transitioning toward evaluation, which may take a more dynamic form, such as conversing with AI interviewers to solve problems. I believe this is an important component of the economy, but most people are not yet aware of it because they confuse it with the SFT and RLHF markets, where the value of those two types of data is declining and budgets are shrinking.

Patrick: What skills do you think are most worth cultivating in the future? If you were to advise students on what to learn, what would you say?

Brendan: I would definitely recommend that everyone pursue rapid learning abilities because changes are happening too quickly. In many fields, people think models won't perform well for a long time, but breakthroughs happen quickly. Collaborating more with AI is essential. People in our market often say they enjoy working with models all day, thinking about what the models cannot do and what they lack. These experiences help them judge which aspects can be more efficiently handled by AI in actual work. So, it's important to use models as much as possible and become familiar with their strengths and weaknesses in their respective fields. This is very helpful, but it's hard to say whether one should become a software engineer or something else.

Jacob: Very interesting. In the future, we may all have to spend a lot of time training models. Hard skills have right and wrong answers, but subjective areas are almost limitless. Perhaps in the future, we can even work for our own dedicated models to earn money.

Brendan: I completely agree. I also suggest paying attention to fields with high demand elasticity. In software development, for example, there is latent demand for 100 or 1,000 times more output; even if new web applications aren't a 1,000x story, there are endless feature iterations, algorithm optimizations, and so on. In contrast, demand for accountants is fairly fixed. So it's best to go into areas where demand will expand significantly and where you can raise overall productivity, as that is safer.

Patrick: You're absolutely right. A few days ago, I was chatting with a founder who said everyone is talking about how software engineers will be eliminated, but I actually really need more software engineers.

Brendan: I'm excited too. If our software engineers' productivity increases tenfold, we might hire more software engineers. So the relationship between demand and price is always interesting.

Jacob: When you first started your company, there must have been temptations to create recruitment collaboration tools or software for agencies, right? Why did you decide to provide end-to-end services? Was this decision made from the beginning?

Brendan: At the beginning, we had a lot of first-principles thinking, which gave us an advantage because we hadn't seen traditional methods. We knew that the problem our friends wanted to solve was finding reliable software engineers, so we handled all aspects. But looking back now, I think more and more companies will move towards end-to-end solutions, because there’s no need to develop collaboration tools for a position that may disappear in the future; it makes more sense to automate the entire process so it can learn and optimize from feedback.

Jacob: Indeed, especially since the data labor market you are in is perfectly suited for end-to-end solutions while AI capabilities are still maturing. Without this market, you might have started with collaboration tools.

Brendan: Right, for full-time recruitment, clients definitely want employees under their own name. So we are fortunate that our operational model aligns closely with market demand shifts.

Jacob: Initially, you were helping friends find contract workers. Did you think of this as a side business at first, and then it became your main business? When did you decide to commit to full-time entrepreneurship?

Brendan: Actually, I have been entrepreneurial since high school, and the company was doing quite well, so I originally didn't want to go to college. I told my parents, and they weren't happy, but to appease them, I applied to college. However, I kept saying I would drop out, and they didn't believe me, thinking that since I promised to go to school, I wouldn't drop out. But I told them the same thing every semester, and in the end, I really did drop out without giving them advance notice because I had been saying it for two years.

Patrick: I knew you would drop out long ago.

Brendan: For me, I was very clear that I wanted to start a business and do something impactful, rather than take classes that felt useless. I was always looking for something worth committing to. My partner also initially treated it as a side project, wanting to gather enough evidence to convince his parents to let him drop out. His parents' requirement was that he must successfully raise funds; even a company with a million dollars in revenue and profit didn't count unless he secured seed funding. So for parents, VCs are the arbiters: only a successful raise counts as "credibility."

Jacob: Exactly, without parents, there are no VCs.

Brendan: That's the "authority endorsement."

Patrick: Speaking of fundraising, you recently (note: in February this year) completed a $100 million Series B funding round. Congratulations! How will this money be used? How do you determine when to raise funds?

Brendan: Actually, the only time we proactively raised funds was the seed round, to convince my parents to let me drop out. The Series A and B rounds were both "snatched up" by investors. Our approach is to keep dilution around 5% per round and build a war chest for product development, such as referral incentives, innovative consumer products, and expanding the supply side of the marketplace. We will also invest more in post-training data to improve the model's performance-prediction capabilities. The biggest bottleneck for our ML team is actually building more evaluations and training environments, which aligns perfectly with our main business.

Jacob: Your client base includes many foundational model companies. What do you think about the future of this field? Some say there will only be two or three giants left. How many players do you think will ultimately remain? How will they differentiate themselves?

Brendan: That's a good question. I firmly believe that OpenAI is, and will remain, a product company rather than an API company. Many API capabilities will be commoditized; the key is deep integration with customer scenarios, which is the source of pricing power. But the market is large enough for each company to capture significant value in niche areas. Even a laboratory focused solely on hedge funds could earn huge profits. People like to argue from historical comparables that these companies are overvalued, but if you reason from the first principle of "automating knowledge work," these top teams will certainly be able to build great companies.

Jacob: Now that models have strong cross-domain generalization, it feels like the winner takes all, but there will still be standout players in niche areas? The hedge fund example you mentioned is interesting, indicating that there is still a lot of space at the application layer.

Brendan: Yes, focusing is very valuable. I think creating a general API is not a good business; ultimately, there will only be one left. More value will be at the application layer, and each vertical field and customer scenario will require deep customization.

Jacob: Do you think these customized models will require a lot of complex annotations?

Brendan: Definitely. For example, each trading company can conduct evaluations tailored to their unique trading analyses, determining which conclusions are accurate and which are not, and whether they can be converted into profits. If there is a top-notch post-training team specifically optimizing trading analysis faster than human traders, the opportunities are enormous.

Jacob: It seems that some trading companies' optimal strategy should be to pause trading and spend nine months focusing on post-training models.

Brendan: I am actually surprised that many trading companies invest less in post-training than expected, possibly due to geographical reasons—they are mainly in New York, while laboratories and researchers are in San Francisco, and top researchers prefer to work on AI rather than just for profit. But I believe they will invest heavily, forming nine-figure and ten-figure collaborations with leading laboratories to customize their applications.

Jacob: What is the biggest unknown for you in the AI field right now? If you could know the answer, how would it impact company operations?

Brendan: It's still what you just mentioned, what humans will do in five or ten years. This is an extremely difficult question and part of the company's mission. We have various intuitions, but the world is changing too quickly. Many jobs will be automated, and we need to better understand the new opportunities and economic roles for humans in the future, which is very important.

Jacob: What do you think can still be done at the policy level? What role should other institutions play?

Brendan: Of course. Many regulators focus on issues that are actually far removed from ordinary people. I think in the next two or three years, people will genuinely worry that AI models are much better than humans at many jobs, and we need to find ways to integrate humans into the economy; this will definitely happen. This is not a low-probability, high-impact risk; it is an inevitable trend. Therefore, regulators should be more proactive in planning for the future, managing public expectations, and telling everyone what the world will look like in a few years.

Jacob: Indeed, right now, we can't even clearly define what retraining should look like.

Brendan: Exactly. But I hope there can be more discussions on this front, with more focus on the forms of work for the next generation, providing more guidance for students and job seekers.

Jacob: We like to do a quick-fire round at the end of interviews, asking some broad questions, and we’d love to hear your brief thoughts. What do you think is overestimated and what is underestimated in the AI field?

Brendan: Good question. I think evals (evaluations) are severely underestimated. They are already quite popular, but I believe they are still greatly undervalued.

Jacob: The last bastion of human capability.

Brendan: I think what is overestimated is traditional data like SFT and RLHF. Some companies have spent billions of dollars on it, which is actually unnecessary; spending should drop by an order of magnitude, and that trend will play out.

Patrick: What viewpoints have changed for you in the AI field over the past year?

Brendan: Interesting. I have pulled forward my timeline for automated software engineering considerably. I used to be skeptical when researchers said "AI can write PRs with a higher hit rate than humans," but now I believe it will happen later this year or in the first half of next year, and that will be very cool.

Jacob: Yes. In fact, two years ago, if someone said AI could have the capabilities it has now, everyone would think it would change the world, but now that it has been realized, it’s not as shocking. Do you think this will lead to a large-scale change in software engineering jobs, or just a 10% to 20% change?

Brendan: The key is still what we mentioned earlier about "demand elasticity." In the short term, I'm not worried about engineers losing their jobs because tools make them more efficient, and there will actually be more software to write. However, the nature of the positions will definitely change; those who understand products and the shortcomings of models will have a comparative advantage.

Patrick: Besides your company, which AI startup do you have the most confidence in?

Brendan: I have high hopes for OpenAI's coding capabilities, although this answer isn't very "contrarian." I also believe there will be a large number of customized agents in the future, and there is a company in France that is still in stealth mode that I find very interesting.

Jacob: Well, you definitely can't say that on the podcast; we'll pressure you to reveal it after we finish recording (laughs).
