What is the Kelly criterion and how does it determine optimal bet size?

The Kelly criterion is a formula derived from information theory by John Kelly (1956) that determines the optimal fraction of capital to wager on a favorable bet. For simple bets, the formula is f* = (bp - q) / b, where b is the net odds received on a win (b-to-1), p is the win probability, and q = 1 - p. For investments, the continuous version is f* = (mu - r) / sigma squared, where mu is the expected return, r is the risk-free rate, and sigma squared is the variance of returns. It maximizes the geometric growth rate of wealth over repeated bets, meaning it produces the fastest long-run compounding.

Why do practitioners use fractional Kelly instead of full Kelly?

Practitioners use fractional Kelly (typically half-Kelly) for three key reasons identified by Thorp (2006). First, parameter uncertainty: the Kelly formula requires exact knowledge of expected returns and volatility, but these are estimated with error. Over-estimating your edge leads to catastrophic over-betting. Second, variance reduction: half-Kelly achieves 75% of the full Kelly growth rate but with only half the volatility. Third, drawdown management: full Kelly produces enormous portfolio swings, while fractional Kelly dramatically reduces maximum drawdowns.

What happens if you bet more than the Kelly-optimal amount?

Over-betting beyond Kelly is catastrophically asymmetric. Betting double the Kelly amount produces zero geometric growth, equivalent to not betting at all. Betting more than double Kelly produces negative geometric growth, meaning ruin becomes certain over enough time. This is far worse than under-betting: at half-Kelly you still capture 75% of the optimal growth rate. The asymmetry means estimation errors on the aggressive side are far more costly than errors on the conservative side, which is why Thorp and other practitioners strongly advocate fractional Kelly.

The Kelly Criterion: Optimal Position Sizing from First Principles

Key Takeaway

The Kelly criterion provides a mathematically optimal rule for sizing bets and portfolio positions to maximize long-run wealth. Derived from information theory, it tells you exactly what fraction of your capital to risk on any opportunity with positive expected value. Full Kelly maximizes the geometric growth rate of wealth, but practitioners almost universally use fractional Kelly (typically half-Kelly) because estimation errors and fat tails make full Kelly dangerously aggressive in real markets.

From Information Theory to Optimal Betting

In 1956, John Larry Kelly Jr., a physicist at Bell Labs, published a paper that would quietly reshape how serious gamblers and investors think about position sizing. Kelly was not studying finance; he was working on information theory, building on Claude Shannon's foundational work on communication channels. His insight was elegant: the problem of a gambler with an edge is mathematically equivalent to the problem of transmitting information over a noisy channel.

Kelly (1956) posed a simple question: if you have an edge in a repeated bet, what fraction of your bankroll should you wager each time to maximize the long-run growth rate of your wealth? The answer, now called the Kelly criterion, is surprisingly precise.

For a simple binary bet where you win with probability p and lose with probability q = 1 - p, and the odds pay b-to-1 on a win, the optimal fraction to bet is:

f* = (bp - q) / b

This formula has a beautiful interpretation. The numerator bp - q is your expected edge per dollar wagered. Dividing by b scales the bet by the payoff odds: for a given edge, longer odds call for a smaller fractional bet, because the same edge then arrives through rarer, larger wins and so carries more variance.

A Coin-Flip Walkthrough

Consider a coin that lands heads 60% of the time, paying even money (b = 1). Your edge is real but modest. What fraction of your bankroll should you bet?

Applying the Kelly formula: f* = (1 x 0.60 - 0.40) / 1 = 0.20

Kelly says to bet 20% of your current bankroll on each flip. Not 50%. Not 5%. Exactly 20%.
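The arithmetic above can be checked with a few lines of Python. This is a minimal sketch rather than a reference implementation; the function names `kelly_fraction` and `kelly_bet` are illustrative choices, not taken from Kelly's paper.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Kelly fraction f* = (b*p - q) / b for a bet paying b-to-1 with win probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if b <= 0.0:
        raise ValueError("b must be positive")
    q = 1.0 - p
    return (b * p - q) / b


def kelly_bet(p: float, b: float) -> float:
    """Fraction to actually wager: zero whenever the edge is not positive."""
    return max(0.0, kelly_fraction(p, b))


if __name__ == "__main__":
    print(f"60/40 coin at even money: f* = {kelly_fraction(0.60, 1.0):.2f}")
    print(f"40/60 coin at even money: bet = {kelly_bet(0.40, 1.0):.2f}")
```

A negative result from `kelly_fraction` signals a bet with no edge, which is why `kelly_bet` floors the wager at zero.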

Why not bet more? Because over-betting destroys wealth through the mathematics of geometric compounding. If you bet 50% of your bankroll on a 60/40 coin, you will eventually go broke despite having a positive edge. The variance overwhelms the edge. After a sequence of wins and losses, your bankroll follows a path where the geometric mean of returns determines your long-run fate, not the arithmetic mean.

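With Kelly-optimal 20% bets, the expected log growth rate is about 2% per bet (0.6 x ln 1.2 + 0.4 x ln 0.8 is roughly 0.0201). After 100 flips, a typical outcome multiplies the bankroll about 7.5 times, turning an initial $1,000 into roughly $7,500. After 1,000 flips the typical multiple is about e^20, in the hundreds of millions, although practical limits on bet size would bind long before then. With 50% bets on the same coin, the growth rate is about -3.4% per bet, so after 100 flips the typical bankroll is only about 3% of its starting value.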
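The growth rates behind this comparison can be computed directly. The sketch below assumes independent flips and treats exp(n x g) as an approximation to the median wealth multiple after n bets; the helper names are illustrative.

```python
import math


def log_growth(f: float, p: float, b: float) -> float:
    """Expected log growth per bet when wagering fraction f at b-to-1 odds."""
    if not 0.0 <= f < 1.0:
        raise ValueError("f must be in [0, 1) so a single loss cannot wipe out the bankroll")
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def typical_multiple(f: float, p: float, b: float, n_bets: int) -> float:
    """Approximate median wealth multiple after n_bets: exp(n * g)."""
    return math.exp(n_bets * log_growth(f, p, b))


if __name__ == "__main__":
    for f in (0.05, 0.20, 0.50):
        g = log_growth(f, p=0.60, b=1.0)
        m = typical_multiple(f, p=0.60, b=1.0, n_bets=100)
        print(f"f = {f:.2f}: growth per bet = {g:+.4f}, typical multiple after 100 bets = {m:.3g}")
```

The growth rate is positive at 5% and 20% bets, highest at 20%, and negative at 50%, which is the sense in which over-betting turns a winning game into a losing one.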

Why Geometric Growth Matters

The Kelly criterion maximizes the expected logarithm of wealth, which is equivalent to maximizing the geometric growth rate. This distinction between arithmetic and geometric returns is fundamental to understanding why Kelly works.

Latané (1959) independently arrived at the same principle from a portfolio theory perspective, arguing that investors should maximize the geometric mean of portfolio returns. His reasoning was straightforward: over long horizons, the portfolio with the highest geometric growth rate will almost surely dominate all others.

The arithmetic mean of returns can be misleading. A portfolio that gains 100% and then loses 50% has an arithmetic average return of 25% per period, but the investor ends up exactly where they started. The geometric mean of the two gross returns is sqrt(2.0 x 0.5) = 1.0, which correctly reflects zero growth.

This asymmetry between gains and losses is called variance drag. For any given arithmetic mean return, higher variance reduces the geometric mean. The relationship is approximately:

geometric mean = arithmetic mean - variance / 2

The Kelly criterion implicitly accounts for this drag. It finds the bet size that maximizes the arithmetic edge minus the variance penalty, yielding the highest geometric growth rate.
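Variance drag is easy to see numerically. The sketch below compares the arithmetic and geometric mean of a return series and checks the approximation against a small hypothetical two-period example (+10%, then -6%); the return figures are made up for illustration, and the approximation is only accurate when returns are small.

```python
import math
import statistics


def geometric_mean_return(returns):
    """Per-period compound return: (product of (1 + r)) ** (1 / n) - 1."""
    log_sum = sum(math.log1p(r) for r in returns)
    return math.expm1(log_sum / len(returns))


def variance_drag_estimate(returns):
    """Approximate geometric mean as arithmetic mean minus half the variance."""
    return statistics.fmean(returns) - statistics.pvariance(returns) / 2.0


if __name__ == "__main__":
    for series in ([1.00, -0.50], [0.10, -0.06]):
        print(series,
              f"arithmetic = {statistics.fmean(series):.4f}",
              f"geometric = {geometric_mean_return(series):.4f}",
              f"approximation = {variance_drag_estimate(series):.4f}")
```

For the +100% / -50% series the approximation is poor because the returns are far from small, but the exact geometric mean still comes out at zero, matching the example in the text.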

From Bets to Portfolios

For a single investment with expected return mu, risk-free rate r, and volatility sigma, the Kelly criterion takes a continuous form:

f* = (mu - r) / sigma^2

This formula has an intuitive structure. You invest more when the expected excess return is higher and less when volatility is higher. The optimal position size scales linearly with expected return but inversely with the square of volatility. Doubling volatility cuts the optimal position to one quarter, not one half.

Consider a stock with an expected return of 12%, a risk-free rate of 4%, and annual volatility of 20%. The Kelly-optimal allocation is:

f* = (0.12 - 0.04) / (0.20)^2 = 0.08 / 0.04 = 2.0

Kelly says to lever up to 200% of your capital in this stock. This result immediately reveals both the power and the danger of full Kelly: the theoretical optimum often demands aggressive leverage that most investors find terrifying, and for good reason.
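As a quick check of the stock example, the continuous formula can be written as a one-line function. This is a sketch under the usual continuous-time assumptions (constant mu, r, and sigma); the name `continuous_kelly` is illustrative.

```python
def continuous_kelly(mu: float, r: float, sigma: float) -> float:
    """Continuous-time Kelly fraction f* = (mu - r) / sigma**2."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return (mu - r) / sigma ** 2


if __name__ == "__main__":
    base = continuous_kelly(mu=0.12, r=0.04, sigma=0.20)
    doubled_vol = continuous_kelly(mu=0.12, r=0.04, sigma=0.40)
    print(f"sigma = 20%: f* = {base:.2f}")
    print(f"sigma = 40%: f* = {doubled_vol:.2f}")
```

Doubling the volatility input cuts the result to a quarter of its previous value, which is the inverse-square scaling described above.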

The Case for Fractional Kelly

Edward Thorp, the mathematician who famously beat blackjack using card counting and later ran the enormously successful hedge fund Princeton Newport Partners, became the most influential advocate of the Kelly criterion in practice. But Thorp was equally emphatic about a critical modification: never use full Kelly.

Thorp (2006) argued that fractional Kelly, typically betting half the Kelly-optimal amount, is far superior in practice for several reasons.

First, parameter uncertainty. The Kelly formula assumes you know the exact probability of winning and the exact payoff odds. In reality, these parameters are estimated with error. Over-estimating your edge leads to over-betting, which is catastrophic. If your true edge is half of what you estimated, full Kelly based on the wrong estimate puts you at double the true Kelly fraction, exactly where geometric growth falls to zero; any larger overestimate pushes growth negative.

Second, variance reduction. Full Kelly produces enormous swings in portfolio value. The standard deviation of the log-wealth path under full Kelly is surprisingly large. Half-Kelly achieves 75% of the growth rate of full Kelly but with only half the volatility (one quarter of the variance). For most investors, this trade-off is overwhelmingly favorable.

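Third, drawdown management. Under full Kelly, arbitrarily deep drawdowns occur with meaningful probability: in the continuous-time model, the chance that wealth ever falls to a fraction x of its starting value is x, so there is a 50% chance of at some point losing half. Under half-Kelly, expected drawdowns are dramatically smaller. Thorp documented that his own trading used Kelly fractions ranging from 0.1 to 0.5, depending on his confidence in the edge estimate.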

The general fractional Kelly approach scales the optimal bet by a factor c between 0 and 1:

f_actual = c x f*

At c = 0.5 (half-Kelly), you sacrifice only about 25% of the long-run growth rate while cutting volatility by 50%. At c = 0.25 (quarter-Kelly), you sacrifice about 56% of the growth rate (keeping about 44%) but reduce volatility by 75%. The growth rate in excess of the risk-free rate under fractional Kelly is:

g(c) = (c - c^2 / 2) x (mu - r)^2 / sigma^2

This is a quadratic function that peaks at c = 1 (full Kelly) and equals zero at c = 2 (double Kelly). Betting more than twice the Kelly amount produces negative geometric growth; you will go broke with certainty over time.
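The shape of this curve can be checked numerically. The sketch below assumes the continuous-time approximation, in which holding a fraction f of capital in the risky asset earns excess growth of f x (mu - r) - f^2 x sigma^2 / 2, and evaluates it at f = c x f*. The inputs (8% excess return, 20% volatility) are borrowed from the stock example purely for illustration.

```python
def excess_growth(f: float, mu_excess: float, sigma: float) -> float:
    """Growth above the risk-free rate for a position of size f (continuous approximation)."""
    return f * mu_excess - 0.5 * f ** 2 * sigma ** 2


def growth_retained(c: float, mu_excess: float, sigma: float) -> float:
    """Share of full-Kelly growth kept when betting c times the Kelly fraction."""
    f_star = mu_excess / sigma ** 2
    return excess_growth(c * f_star, mu_excess, sigma) / excess_growth(f_star, mu_excess, sigma)


if __name__ == "__main__":
    for c in (0.5, 1.0, 2.0, 2.5):
        share = growth_retained(c, mu_excess=0.08, sigma=0.20)
        print(f"c = {c:.1f}: share of full-Kelly growth = {share:+.3f}")
```

The ratio simplifies to 2c - c^2, so it does not depend on the particular mu and sigma chosen: it is 0.75 at half-Kelly, zero at double Kelly, and negative beyond that.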

The Danger of Over-Betting

The most important practical lesson from Kelly theory is the catastrophic asymmetry between under-betting and over-betting.

If you bet half the Kelly amount, you get 75% of the optimal growth rate. If you bet double the Kelly amount, you get zero growth rate, equivalent to not betting at all. If you bet more than double Kelly, your growth rate turns negative, and ruin becomes certain.

This asymmetry has profound implications. Estimation errors that cause you to under-bet are relatively harmless; you leave some growth on the table but your wealth still compounds positively. Errors that cause you to over-bet are potentially ruinous; the penalty for over-sizing is much steeper than the penalty for under-sizing.

This is why experienced Kelly practitioners invariably err on the side of caution. The cost of being too conservative is modest. The cost of being too aggressive is bankruptcy.
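The asymmetry can be made concrete by asking what happens when the edge estimate is wrong. The sketch below is a simplified illustration that assumes volatility is known exactly and only the edge is misestimated, and it uses the continuous-time result that betting m times the true Kelly fraction captures a share 2m - m^2 of the maximum growth rate. The edge values are hypothetical.

```python
def share_of_max_growth(multiple: float) -> float:
    """Share of maximum growth captured when betting `multiple` times the true Kelly fraction."""
    return 2.0 * multiple - multiple ** 2


def retained_with_misestimate(c: float, estimated_edge: float, true_edge: float) -> float:
    """Growth share when sizing at c times the Kelly fraction implied by a wrong edge estimate."""
    if true_edge <= 0.0:
        raise ValueError("true_edge must be positive for a Kelly bet to exist")
    multiple = c * estimated_edge / true_edge
    return share_of_max_growth(multiple)


if __name__ == "__main__":
    scenarios = [
        ("edge overestimated 2x, full Kelly", 1.0, 0.08, 0.04),
        ("edge overestimated 2x, half Kelly", 0.5, 0.08, 0.04),
        ("edge underestimated 2x, full Kelly", 1.0, 0.04, 0.08),
    ]
    for label, c, est, true in scenarios:
        print(f"{label}: share of max growth = {retained_with_misestimate(c, est, true):.2f}")
```

Under these assumptions, a twofold overestimate wipes out all growth at full Kelly, while the same error at half-Kelly lands on the true optimum; a twofold underestimate still keeps three quarters of the achievable growth.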

Multi-Asset Kelly: The Portfolio Version

For portfolios with multiple assets, the Kelly criterion extends using the covariance matrix. Thorp (2006) presented the multi-asset formulation, and MacLean, Thorp, and Ziemba (2011) provided the definitive textbook treatment.

If mu is the vector of expected excess returns and Sigma is the covariance matrix, the Kelly-optimal portfolio weights are:

f* = Sigma^(-1) x mu

This is identical to the mean-variance optimal portfolio with a risk-aversion coefficient of 1 (corresponding to log utility). The connection is not coincidental: maximizing the expected log of wealth is equivalent to mean-variance optimization when returns are normally distributed, with the specific risk aversion parameter that corresponds to Kelly.

The multi-asset formulation reveals that Kelly naturally diversifies. Assets with high expected returns receive large weights, but the covariance matrix ensures that highly correlated assets are not over-weighted. The portfolio version of Kelly is, in effect, an optimally leveraged version of the tangency portfolio from mean-variance theory.
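A two-asset case shows how correlation shrinks the weights. The sketch below solves Sigma x f = mu using the closed-form inverse of a 2x2 matrix, so it needs no external libraries; a real implementation would more likely use a linear-algebra package. The inputs (6% excess return and 20% volatility for both assets) are hypothetical.

```python
def kelly_weights_two_assets(mu1, mu2, sigma1, sigma2, rho):
    """Kelly weights f = Sigma^(-1) mu for two assets, via the explicit 2x2 inverse."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    var1, var2 = sigma1 ** 2, sigma2 ** 2
    cov = rho * sigma1 * sigma2
    det = var1 * var2 - cov ** 2  # determinant of the covariance matrix
    f1 = (var2 * mu1 - cov * mu2) / det
    f2 = (var1 * mu2 - cov * mu1) / det
    return f1, f2


if __name__ == "__main__":
    for rho in (0.0, 0.8):
        f1, f2 = kelly_weights_two_assets(0.06, 0.06, 0.20, 0.20, rho)
        print(f"rho = {rho:.1f}: weights = ({f1:.3f}, {f2:.3f}), total = {f1 + f2:.3f}")
```

With zero correlation each asset receives its stand-alone Kelly weight; as correlation rises, both weights fall because the two positions increasingly represent the same bet.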

Connection to Log Utility

The Kelly criterion is equivalent to maximizing expected log utility of wealth. An investor with a logarithmic utility function U(W) = ln(W) will, when optimizing a single-period portfolio problem, arrive at exactly the Kelly formula.

This connection provides theoretical grounding. Log utility has several appealing properties: its optimal strategy is myopic (independent of the investment horizon) even when returns are not independent across periods, and it generates a growth-optimal portfolio that will almost surely outperform any essentially different strategy in the long run.

However, log utility also implies a specific level of risk aversion. An investor with greater risk aversion than log utility implies should bet less than Kelly, which brings us back to fractional Kelly as the practical default.
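The equivalence with log utility can be checked by brute force for the binary bet: search over bet fractions for the one that maximizes expected log wealth and compare it with the closed-form Kelly fraction. This is an illustrative sketch using a simple grid search, not an efficient optimizer; the function names are hypothetical.

```python
import math


def expected_log_wealth(f: float, p: float, b: float) -> float:
    """Expected log of wealth after one bet of fraction f, starting from wealth 1."""
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def best_fraction_on_grid(p: float, b: float, steps: int = 1000) -> float:
    """Bet fraction in [0, 1) that maximizes expected log wealth on an evenly spaced grid."""
    grid = [i / steps for i in range(steps)]  # stops short of 1.0, where log(0) is undefined
    return max(grid, key=lambda f: expected_log_wealth(f, p, b))


if __name__ == "__main__":
    p, b = 0.60, 1.0
    closed_form = (b * p - (1.0 - p)) / b
    print(f"grid search: {best_fraction_on_grid(p, b):.3f}, closed form: {closed_form:.3f}")
```

For the 60/40 coin at even money, the grid search lands on the same 20% fraction as the formula.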

Practical Portfolio Sizing Example

Consider an investor evaluating a systematic equity momentum strategy with the following estimated characteristics: expected annual excess return of 6%, annual volatility of 15%, and a risk-free rate of 4%.

The full Kelly fraction for this strategy is:

f* = 0.06 / (0.15)^2 = 0.06 / 0.0225 = 2.67

Full Kelly says to lever up to 267% of capital. This is aggressive. At half-Kelly (c = 0.5), the allocation becomes 133%. At quarter-Kelly (c = 0.25), it is 67%, which most institutional investors would consider reasonable.

The expected geometric growth rates are:

Full Kelly (f = 2.67): g = 2.67 x 0.06 - (2.67)^2 x (0.15)^2 / 2 = 0.16 - 0.08 = 8.00% annually above the risk-free rate

Half-Kelly (f = 1.33): g = 1.33 x 0.06 - (1.33)^2 x (0.15)^2 / 2 = 0.08 - 0.02 = 6.00% above risk-free

Quarter-Kelly (f = 0.67): g = 0.67 x 0.06 - (0.67)^2 x (0.15)^2 / 2 = 0.04 - 0.005 = 3.50% above risk-free

The difference between full and quarter Kelly in growth rate is 4.5 percentage points annually. But the volatility of the full Kelly portfolio is 40% (2.67 x 15%), while the quarter-Kelly portfolio has volatility of 10% (0.67 x 15%). For most investors, the risk-adjusted trade-off of fractional Kelly is clearly superior.
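The allocations and volatilities in this example follow mechanically from the two inputs, so they are easy to tabulate. The sketch below is illustrative only; `sizing_table` is a hypothetical helper, and it ignores financing costs, leverage limits, and estimation error.

```python
def sizing_table(mu_excess: float, sigma: float, fractions=(1.0, 0.5, 0.25)):
    """Return (c, allocation, portfolio volatility) for each Kelly multiplier c."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    f_star = mu_excess / sigma ** 2  # full Kelly allocation
    rows = []
    for c in fractions:
        allocation = c * f_star
        rows.append((c, allocation, allocation * sigma))
    return rows


if __name__ == "__main__":
    for c, allocation, vol in sizing_table(mu_excess=0.06, sigma=0.15):
        print(f"c = {c:.2f}: allocation = {allocation:.0%}, portfolio volatility = {vol:.0%}")
```

Because portfolio volatility is simply the allocation times the strategy volatility, every halving of the Kelly multiplier halves the risk taken.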

Limitations and Critiques

The Kelly criterion rests on assumptions that are imperfectly met in real markets.

Parameter uncertainty is the most fundamental problem. The formula requires precise knowledge of expected returns and volatility. In practice, expected returns are estimated with enormous uncertainty. A stock's expected excess return might be 6% plus or minus 8%. With such wide confidence intervals, the Kelly fraction itself is highly uncertain, and full Kelly becomes reckless.

Fat tails invalidate the continuous Gaussian approximation. Real market returns exhibit kurtosis far beyond what the normal distribution predicts. Extreme events occur more frequently than Kelly's mathematical framework anticipates. This makes over-betting even more dangerous than the standard theory suggests.

The non-ergodicity argument, advanced by Ole Peters (2019), provides a deeper critique. Peters argued that the standard expected utility framework conflates time averages with ensemble averages. For multiplicative processes like wealth growth, the time average (what a single investor experiences) differs from the ensemble average (what the average across many investors experiences). The Kelly criterion resolves this issue correctly by maximizing the time average (geometric growth rate), but Peters' work highlights that many conventional finance models implicitly optimize the wrong quantity.

Serial correlation in returns, transaction costs, and constraints on leverage and short-selling further complicate practical application. The Kelly fraction for a strategy with mean-reverting returns differs from the i.i.d. case, and ignoring this can lead to suboptimal sizing.

When Kelly Works Best

The Kelly criterion is most powerful in settings where the edge is well-characterized and the game is repeated many times. Card counting in blackjack, where Thorp first applied it, is the canonical example: the edge can be precisely calculated, the game is repeated thousands of times, and the distribution of outcomes is well-understood.

In financial markets, Kelly is most applicable to strategies with short holding periods and well-estimated edges: high-frequency market making, statistical arbitrage with large sample sizes, and systematic strategies with long track records. It is least applicable to concentrated long-term investments where parameter uncertainty dominates.

The enduring contribution of the Kelly criterion is not the formula itself but the framework of thinking it provides. Position sizing is not an afterthought; it is as important as the signal itself. The optimal bet size depends on the ratio of edge to variance, not on edge alone. Over-betting is far more dangerous than under-betting. And the geometric growth rate, not the arithmetic expected return, determines long-run wealth.

This analysis was synthesised from Quant Decoded Research by the QD Research Engine, Quant Decoded's automated research platform, and reviewed by our editorial team for accuracy.

References

Kelly, J. L. (1956). "A New Interpretation of Information Rate." Bell System Technical Journal, 35(4), 917-926. https://doi.org/10.1002/j.1538-7305.1956.tb03809.x
Latané, H. A. (1959). "Criteria for Choice Among Risky Ventures." Journal of Political Economy, 67(2), 144-155. https://doi.org/10.1086/257819
Thorp, E. O. (2006). "The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market." In Handbook of Asset and Liability Management. https://doi.org/10.1142/9789812773548_0029
MacLean, L. C., Thorp, E. O., & Ziemba, W. T. (2011). The Kelly Capital Growth Investment Criterion: Theory and Practice. World Scientific. https://doi.org/10.1142/8042
Peters, O. (2019). "The Ergodicity Problem in Economics." Nature Physics, 15, 1216-1221. https://doi.org/10.1038/s41567-019-0732-0