
AI Berkshire Released: Implementing Value Investing with AI Agents

AI Berkshire, now available on GitHub, uses Claude Code to replicate the value investing methods of Warren Buffett and three other experts. Featuring parallel analysis and anti-bias mechanisms, the AI reportedly achieved an annual return exceeding 66% in live operations.

6 min read Reviewed & edited by the SINGULISM Editorial Team


A project aiming to embody the famous investment principle of Warren Buffett—“Price is what you pay, value is what you get”—through AI has been released on GitHub. Developed by xbtlin, “AI Berkshire” systematizes the value investing strategies of four legendary investors: Warren Buffett, Charlie Munger, Duan Yongping, and Li Lu. It operates as a network of AI agents built on Claude Code.

This project is more than a collection of investment analysis prompts. Multiple AI agents work in parallel, and the framework integrates analyses grounded in different investment philosophies, aiming for a depth of insight that a single large language model (LLM) is unlikely to reach on its own. The developer describes it as “a collaboration where one human and Claude become a single investment research team.”

Achieving Over 66% Annual Returns in Live Operations

AI Berkshire goes beyond theoretical research and discloses its live investment performance. The reported annual return was +69.29% for 2024 and +66.38% as of 2025. According to the project, this exceeded the S&P 500 over the same periods by approximately 46 percentage points in 2024 and 50 percentage points in 2025.

However, the project’s README file clearly states that “past performance does not guarantee future results” and emphasizes that the figures should be regarded only as reference information for investment decisions. The README also includes screenshots of the actual brokerage account, which the project presents as evidence of its commitment to transparency.

Distinctions from Conventional AI Analysis

When traditional LLMs are tasked with investment analysis, they often provide balanced, non-committal answers like “on one hand… on the other hand…”. AI Berkshire aims to overcome this “indecisive analysis.”

The first key difference lies in its mandatory output of a definitive conclusion. The framework requires a three-tier assessment—“approve/reject/gray zone”—and provides specific price ranges and investment strategies. For instance, a stringent rule known as the “Mirror Test” dictates that if an investment target’s business model cannot be summarized in five sentences, it should not be purchased.
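The forced-verdict rule and the Mirror Test can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `Verdict`, `Conclusion`, `passes_mirror_test`, and `conclude` are hypothetical, and the sentence counter is a rough heuristic rather than the project's actual check.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    GRAY_ZONE = "gray zone"


MAX_SENTENCES = 5  # limit taken from the article's description of the Mirror Test


def count_sentences(text: str) -> int:
    """Rough count: split on ., ! or ? when followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


def passes_mirror_test(business_summary: str) -> bool:
    """True when the summary is non-empty and fits in MAX_SENTENCES sentences."""
    return 0 < count_sentences(business_summary) <= MAX_SENTENCES


@dataclass(frozen=True)
class Conclusion:
    verdict: Verdict
    price_range: Optional[str]  # kept as free text; None when rejected
    reason: str


def conclude(business_summary: str, proposed: Verdict,
             price_range: Optional[str]) -> Conclusion:
    """Always return a definite verdict; a failed Mirror Test overrides the proposal."""
    if not passes_mirror_test(business_summary):
        return Conclusion(Verdict.REJECT, None,
                          "business model not explainable in five sentences")
    return Conclusion(proposed, price_range, "mirror test passed")
```

The point of the sketch is that the return type has no "it depends" option: every call ends in one of three verdicts, and the Mirror Test acts as a gate before any price range is reported.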

Second, the system integrates conflicting viewpoints from the four investors. For each company analyzed, four independent agents evaluate the target based on their distinct investment philosophies. This approach naturally generates contradictions and tensions among the evaluations. For example, one agent adopting Buffett’s perspective might deem a stock undervalued, while another, using Li Lu’s approach, might recommend abstention due to long-term uncertainties. The developer asserts that such multi-perspective clashes are key to avoiding blind spots in investment decisions.
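One way to keep such disagreements visible, rather than averaging them away, is sketched below. This is not the project's integration logic, which the article does not describe; the `Opinion` type, the helper names, and the sample rationales are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    investor: str   # perspective the agent adopts
    verdict: str    # "approve", "reject", or "gray zone"
    rationale: str


def group_by_verdict(opinions):
    """Map each verdict to the perspectives that reached it."""
    groups = defaultdict(list)
    for o in opinions:
        groups[o.verdict].append(o.investor)
    return dict(groups)


def has_conflict(opinions) -> bool:
    """True when the perspectives do not all agree."""
    return len({o.verdict for o in opinions}) > 1


# Illustrative inputs only; these are not outputs from the project.
opinions = [
    Opinion("Buffett", "approve", "price looks below intrinsic value"),
    Opinion("Munger", "approve", "durable business quality"),
    Opinion("Duan Yongping", "gray zone", "needs more understanding of the business"),
    Opinion("Li Lu", "reject", "long-term uncertainty too high"),
]

if has_conflict(opinions):
    print(group_by_verdict(opinions))
```

Reporting the groups instead of a single vote count leaves the final weighing of the conflict to the human reader or to a later integration step.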

Anti-Bias Mechanisms

The third standout feature of AI Berkshire is its structured mechanisms to mitigate decision-making biases. A significant risk with LLMs is their propensity to present incorrect information as if it were accurate. To counter this, AI Berkshire incorporates multiple safeguards:

  • Information richness evaluation (A/B/C): Prevents excessive information from being mistaken for high certainty. If data is limited, confidence labels are assigned to estimates.
  • Munger-style counterfactual testing: Forces the model to enumerate scenarios in which the investment target could fail.
  • Immediate rejection list: Establishes eight pass/fail criteria; if even one is violated, the target is excluded from consideration.
  • Anti-consensus check: Evaluates risks from perspectives contrary to general market sentiment.
  • Reservation principle: Clearly marks data as “unknown” if insufficient information is available, avoiding presenting assumptions as facts.

These mechanisms are intended to keep the AI from taking cognitive shortcuts, and they reflect a design philosophy that goes beyond simple prompt engineering.
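Two of the safeguards above, the immediate rejection list and the reservation principle, can be sketched together in Python. The article does not list the project's eight criteria, so the two checks below are hypothetical placeholders, and the `screen` function is an illustrative assumption rather than the project's code.

```python
from typing import Callable, Dict, Optional

# A check returns True (pass), False (violated), or None (not enough data).
Check = Callable[[dict], Optional[bool]]


def screen(company: dict, checks: Dict[str, Check]) -> dict:
    """Apply every pass/fail check; a single violation excludes the target."""
    violated, unknown = [], []
    for name, check in checks.items():
        result = check(company)
        if result is False:
            violated.append(name)
        elif result is None:
            unknown.append(name)  # reservation principle: record, do not guess
    return {"excluded": bool(violated), "violated": violated, "unknown": unknown}


# Hypothetical placeholder criteria; the project's actual list is not shown here.
def positive_free_cash_flow(c: dict) -> Optional[bool]:
    fcf = c.get("free_cash_flow")
    return None if fcf is None else fcf > 0


def audited_financials(c: dict) -> Optional[bool]:
    return c.get("audited")  # None when the field is missing
```

Using a three-valued result keeps "unknown" distinct from both "pass" and "fail", so missing data is reported as missing instead of silently counting as a pass.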

Accuracy in Financial Data

LLMs are widely known to struggle with numerical calculations. Errors in metrics like price-to-earnings ratio (PER) or currency unit confusion (e.g., Hong Kong dollars vs. Chinese yuan) could lead to catastrophic investment decisions.

To address this, AI Berkshire uses Python’s decimal.Decimal type for precise numerical calculations, avoiding floating-point errors. The framework incorporates a validation process where market capitalization, derived from stock price and shares outstanding, is manually cross-verified with reported values to ensure discrepancies remain below 0.1%.
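A minimal sketch of this kind of check, using only Python's standard `decimal` module, might look as follows. The function names, the string-based inputs, and the currency guard are assumptions for illustration; only the use of `decimal.Decimal` and the 0.1% threshold come from the description above.

```python
from decimal import Decimal

TOLERANCE = Decimal("0.001")  # 0.1%, the threshold quoted in the article


def market_cap(price: str, shares_outstanding: str) -> Decimal:
    """Multiply exactly; inputs are strings so no binary float is ever created."""
    return Decimal(price) * Decimal(shares_outstanding)


def relative_gap(computed: Decimal, reported: Decimal) -> Decimal:
    """Absolute difference as a fraction of the reported value."""
    if reported == 0:
        raise ValueError("reported market cap must be non-zero")
    return abs(computed - reported) / abs(reported)


def cross_check(price: str, shares: str, reported_cap: str,
                price_ccy: str, reported_ccy: str) -> bool:
    """True when computed and reported market cap agree within TOLERANCE."""
    if price_ccy != reported_ccy:
        # Guard against unit mix-ups such as HKD vs. CNY before comparing.
        raise ValueError(f"currency mismatch: {price_ccy} vs {reported_ccy}")
    computed = market_cap(price, shares)
    return relative_gap(computed, Decimal(reported_cap)) < TOLERANCE


# Example with made-up numbers: 123.45 * 1,000,000,000 vs. a reported figure.
print(cross_check("123.45", "1000000000", "123500000000", "HKD", "HKD"))
```

Passing values as strings matters: `Decimal(0.1)` would inherit the binary float's rounding error, while `Decimal("0.1")` is exact.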

Depth Through Parallel Agents

One of the most distinctive features is the parallel analysis conducted by four independent AI agents. Each is responsible for a specific domain—business model, financial evaluation, industry competition, and risks & management—and conducts its own web searches and data validation to draw independent conclusions. Ultimately, a “team leader” agent compiles these into an integrated report.

This approach lets the framework process a volume of information that a single LLM context window could not handle. Because the four agents run simultaneously, each with its own context, the system can draw on up to four times as many data sources and perspectives. The developer compares this to the difference between one individual consulting an AI and four analysts conducting independent research before pooling their results.
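The fan-out and merge pattern described here can be sketched with Python's `asyncio`. This is an assumption-laden illustration, not how Claude Code dispatches agents: `run_agent` is a stub standing in for a real research agent, and `research` plays the role of the "team leader" step.

```python
import asyncio

DOMAINS = [
    "business model",
    "financial evaluation",
    "industry competition",
    "risks & management",
]


async def run_agent(domain: str, company: str) -> dict:
    """Stub for one research agent; a real one would search, validate, and write."""
    await asyncio.sleep(0)  # placeholder for network and model latency
    return {"domain": domain, "findings": f"[{domain} notes for {company}]"}


async def research(company: str) -> dict:
    """Run the four domain agents concurrently, then merge their reports."""
    reports = await asyncio.gather(*(run_agent(d, company) for d in DOMAINS))
    # gather() returns results in the order the awaitables were passed in.
    return {
        "company": company,
        "sections": {r["domain"]: r["findings"] for r in reports},
    }
```

Calling `asyncio.run(research("ExampleCo"))` returns one dictionary with a section per domain. Each stub receives only its own domain and the company name, which mirrors the idea that every agent works from an independent context.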

Consistent and Reproducible Analysis

With traditional LLMs, the quality and format of outputs can vary from query to query, even when the input is the same. AI Berkshire is designed to produce structurally consistent outputs for identical inputs, enabling cross-company comparisons and longitudinal analysis of the same company.

For example, a published checklist evaluates seven companies against the same criteria. Among them, Kweichow Moutai and Tencent were rated “approved,” NVIDIA and Meituan “conditionally approved,” and Pinduoduo and Pop Mart “in the gray zone.” Each stock was assessed on five dimensions (circle of competence, business quality, economic moat, management, and margin of safety), and an integrated score was calculated for each.
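A fixed schema is one simple way to make such checklists comparable across companies. The sketch below assumes each of the five criteria is scored from 1 to 5 and that the integrated score is an unweighted average; the article states neither, so both are labeled assumptions, and the company names and scores are made up.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Checklist:
    circle_of_competence: int
    business_quality: int
    economic_moat: int
    management: int
    margin_of_safety: int

    def __post_init__(self):
        # Assumed 1-5 range; reject anything outside it at construction time.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be between 1 and 5, got {value}")

    def integrated_score(self) -> float:
        """Unweighted mean of the five criteria (an assumption, not the project's formula)."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)


# Made-up scores for two placeholder companies.
scores = {
    "Company A": Checklist(5, 4, 5, 4, 3),
    "Company B": Checklist(3, 3, 4, 3, 2),
}
ranking = sorted(scores, key=lambda name: scores[name].integrated_score(), reverse=True)
print(ranking)
```

Because every company passes through the same five fields and the same validation, results from different runs can be sorted or compared over time without reformatting.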

Editorial Opinion

The emergence of AI Berkshire marks a new frontier in utilizing AI agents for the systematization of expertise and decision-making support. What stands out most is its design philosophy that goes beyond mere information summarization to enforce decisive judgment. While traditional AI applications have largely aimed to “support human decision-making,” this project takes a bold step by endowing the framework itself with decision-making criteria.

In the short term, AI Berkshire may serve as a concrete example of AI agents applied in the financial industry, potentially inspiring similar projects. In the long term, introducing structured AI decision-making into the highly subjective realm of investment could have profound implications. However, there is no guarantee that backtesting using historical data published in 2018 will align with actual performance, and the framework’s resilience to changing market conditions remains uncertain. Additionally, automating investment decisions could introduce new patterns of market volatility, which warrants careful consideration.

Frequently Asked Questions

How can AI Berkshire be used?
Users can download the code from the GitHub repository and set it up in an environment where Claude Code operates. By inputting the name of a target company for investment analysis, four AI agents will conduct parallel research and generate a consolidated report. Access to Claude Code and a Python runtime environment is required for use.

Can this project be used for actual investments?
The README file for the project makes it clear that past performance does not guarantee future results. While the structured analysis framework is valuable, it is not recommended to rely solely on AI for investment decisions. It should be regarded as a tool for supporting decision-making rather than a complete solution.

How does this project differ from conventional AI investment analysis tools?
The key distinction lies in its use of multiple independent agents working in parallel to conduct analyses based on differing investment philosophies, rather than relying on a single LLM for answers. Additional features include anti-bias mechanisms and rigorous numerical accuracy checks, specifically designed for investment decision-making.

Source: GitHub Trending
