Monte Carlo simulation is one of those techniques that looks simple enough to implement in an afternoon and then eats six months of your life once you realise how many subtle mistakes are hiding in the default approach. Most retail financial planning tools run a version of Monte Carlo that assumes normally-distributed annual returns, IID draws, and constant volatility. Every one of those assumptions is wrong in ways that matter for the output. This guide walks through how to do it properly for a real portfolio — retirement, endowment, or anything in between — and what to do differently once you care about the tails.
A Monte Carlo simulation is only as good as its return-generating process. If that process says "returns are normal with historical mean and variance," the simulation will understate the probability of portfolio failure by roughly 50% in the tails. You are answering the wrong question very precisely.
What the Simulation Is Actually For
The purpose of Monte Carlo is to estimate the distribution of outcomes when the underlying path matters. For a buy-and-hold investor with no contributions or withdrawals, the distribution of final wealth is fully determined by the distribution of total returns — a simulation is overkill. A closed-form calculation works.
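To make the closed-form point concrete, here is a minimal sketch. It assumes, purely for illustration, that annual log returns are i.i.d. normal; the function name and the `mu_log` and `sigma_log` parameters are hypothetical, and the inputs are placeholders rather than estimates.

```python
from math import exp, sqrt
from statistics import NormalDist

def buy_and_hold_quantile(initial, years, mu_log, sigma_log, q):
    """Quantile q of terminal wealth for a buy-and-hold investor.

    Illustrative assumption: log(W_T / W_0) ~ Normal(mu_log * T, sigma_log**2 * T).
    """
    z = NormalDist().inv_cdf(q)
    return initial * exp(mu_log * years + z * sigma_log * sqrt(years))

# Placeholder inputs, not estimates of any real portfolio
median = buy_and_hold_quantile(1_000_000, 30, mu_log=0.05, sigma_log=0.15, q=0.50)
p05 = buy_and_hold_quantile(1_000_000, 30, mu_log=0.05, sigma_log=0.15, q=0.05)
print(f"median: {median:,.0f}, 5th percentile: {p05:,.0f}")
```

No paths are needed here because nothing depends on the order in which the returns arrive.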
The moment you introduce cash flows — contributions during accumulation, withdrawals during retirement, endowment spending rules, rebalancing — the path matters. Two identical portfolios with identical average returns can end up in completely different places depending on when the good and bad years fall. That is sequence-of-returns risk, and it is the single biggest reason to run a simulation at all.
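A small sketch of that effect, using a made-up ten-year return sequence and its mirror image. The `terminal_wealth` helper and all the numbers are hypothetical and chosen only to show the mechanism.

```python
def terminal_wealth(initial, returns, withdrawal):
    """Apply each year's return, then take a fixed withdrawal at year end."""
    wealth = initial
    for r in returns:
        wealth = wealth * (1 + r) - withdrawal
    return wealth

# Hypothetical sequence: three bad years followed by seven good ones
bad_first = [-0.20, -0.15, -0.10] + [0.12] * 7
good_first = bad_first[::-1]  # the same returns in reverse order

no_flows_a = terminal_wealth(1_000_000, bad_first, 0)
no_flows_b = terminal_wealth(1_000_000, good_first, 0)
with_flows_a = terminal_wealth(1_000_000, bad_first, 50_000)
with_flows_b = terminal_wealth(1_000_000, good_first, 50_000)

print(f"no withdrawals:   {no_flows_a:,.0f} vs {no_flows_b:,.0f}")
print(f"with withdrawals: {with_flows_a:,.0f} vs {with_flows_b:,.0f}")
```

Without cash flows the two orderings end in the same place, because multiplication does not care about order. With withdrawals, the bad-years-first path ends lower.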
The Default Tool Is Broken
Almost every retail-facing Monte Carlo tool does some version of this:
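import numpy as np

def broken_mc(initial, years, mean=0.07, std=0.15, n_sims=10000):
    # Deliberately naive: i.i.d. normal annual returns with constant volatility
    returns = np.random.normal(mean, std, (n_sims, years))
    # Compound each simulated path; result has shape (n_sims, years)
    paths = initial * np.cumprod(1 + returns, axis=1)
    return paths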
This is the "60/40 grows at 7% with 15% vol" model. It will produce clean, reassuring output that is wrong in at least three ways:
- Equity returns are not normally distributed. They have fat tails, especially on the downside, and negative skew. Using a normal distribution dramatically underweights the probability of a 30% drawdown in any given year.
- Returns are not IID. Volatility clusters: bad months come in groups, not spread evenly across time. This affects drawdown distributions even when the average return is unchanged.
- Mean reversion matters over long horizons. Real historical equity returns exhibit slight negative autocorrelation at multi-year horizons, which tends to narrow the extreme tails. A naive IID model will overstate the range of long-horizon outcomes.
These three issues do not all pull in the same direction. Fat tails make the downside worse. Mean reversion makes the long-horizon spread narrower. Volatility clustering makes drawdowns worse. The net effect depends on the horizon and the statistic you care about.
Fix One: Use the Right Distribution
For equity-like returns, a Student's t-distribution with 5-7 degrees of freedom is a dramatic improvement over a normal distribution and essentially costs nothing to implement. The t-distribution has fat tails that match empirical market data far better than the normal.
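def t_dist_returns(years, mean, std, dof=6):
    # Fat-tailed draws from a Student's t-distribution (requires dof > 2)
    raw = np.random.standard_t(dof, years)
    # Divide by the theoretical std, sqrt(dof / (dof - 2)), so draws have unit variance.
    # Standardising with the sample mean and std would force every simulated path
    # to have exactly the same average return and volatility.
    raw = raw / np.sqrt(dof / (dof - 2))
    return mean + std * raw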
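A rough numerical check of the fat-tail claim, as a sketch only. It reuses the 7% mean and 15% volatility from the naive model and assumes 6 degrees of freedom; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
mean, std, dof, n = 0.07, 0.15, 6, 1_000_000

normal_draws = rng.normal(mean, std, n)
# Scale by the theoretical std of the t-distribution so both samples share the same variance
t_draws = mean + std * rng.standard_t(dof, n) / np.sqrt(dof / (dof - 2))

p_normal = np.mean(normal_draws < -0.30)
p_t = np.mean(t_draws < -0.30)
print(f"P(annual return < -30%): normal={p_normal:.4f}, t(dof={dof})={p_t:.4f}")
```

Both samples have the same mean and variance, so any difference in the probability of a 30% loss comes from the shape of the tails alone.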
Even better is to draw directly from empirical historical returns — known as bootstrapping. Take the historical record of your asset classes, sample with replacement, and use those as the simulated returns. This automatically captures the empirical distribution including skew and fat tails. The tradeoff is that bootstrapping assumes the future looks statistically like the past at the monthly or annual level, which may or may not be a reasonable assumption depending on the horizon.
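A minimal sketch of the simple bootstrap. The `simple_bootstrap` name is hypothetical and the `history` array holds placeholder numbers, not real market data; substitute your own return series.

```python
import numpy as np

def simple_bootstrap(historical, n_periods, rng=None):
    """Draw n_periods returns i.i.d. with replacement from the historical sample."""
    rng = np.random.default_rng() if rng is None else rng
    historical = np.asarray(historical, dtype=float)
    return rng.choice(historical, size=n_periods, replace=True)

# Placeholder annual returns for illustration only
history = np.array([0.12, -0.08, 0.21, 0.03, -0.31, 0.17, 0.09, -0.02])
path = simple_bootstrap(history, n_periods=30, rng=np.random.default_rng(1))
```

Every simulated return is one that actually appears in the sample, so skew and fat tails come along automatically. The order of the draws carries no information.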
Fix Two: Block Bootstrapping for Volatility Clustering
Simple bootstrapping fixes the distribution but destroys the time-series structure of volatility. If September 2008 was a bad month, October 2008 was probably also a bad month, and the simple bootstrap breaks that link.
Block bootstrapping fixes this. Instead of sampling individual months, sample contiguous blocks of length k (typically 12 or 24 months). This preserves the short-range autocorrelation structure of volatility clustering while still generating fresh paths.
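def block_bootstrap(historical, n_years, block_size=12):
    # historical: 1-D NumPy array of monthly returns, at least block_size long
    n_months = n_years * 12
    n_blocks = n_months // block_size + 1
    # Valid block starts run from 0 to len(historical) - block_size inclusive
    starts = np.random.choice(len(historical) - block_size + 1, n_blocks)
    path = np.concatenate([historical[s:s+block_size] for s in starts])
    return path[:n_months]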
Block bootstrapping is my default approach for any Monte Carlo simulation where the horizon is longer than one year. It is simple, honest about what it assumes, and produces drawdown distributions that look empirically realistic instead of suspiciously smooth.
[Figure: Tail risk, normal vs t vs block bootstrap. Modelled probability of >30% loss, by simulation horizon (years).]
Fix Three: Handle Correlated Assets
A real portfolio has more than one asset. Simulating each asset independently and then combining them will dramatically understate correlated downside risk — when equities crash, corporate bonds and REITs typically crash too, even if their long-run correlation with equities is moderate.
The cleanest fix is to block-bootstrap joint historical observations. Sample a historical month once, and take the returns of all assets in that month together. This preserves the full historical correlation structure automatically, including the fact that correlations rise in crises.
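One way this could look in code, as a sketch. It assumes the history is stored as a 2-D array with one row per month and one column per asset; `joint_block_bootstrap`, the synthetic `fake_history`, and the 60/40 weights are all hypothetical.

```python
import numpy as np

def joint_block_bootstrap(historical, n_months, block_size=12, rng=None):
    """Resample blocks of whole rows so every asset shares the same sampled months.

    historical: array of shape (n_obs, n_assets), one row per month.
    """
    rng = np.random.default_rng() if rng is None else rng
    historical = np.asarray(historical, dtype=float)
    n_blocks = -(-n_months // block_size)  # ceiling division
    # The upper bound of integers() is exclusive, so the last valid start is included
    starts = rng.integers(0, len(historical) - block_size + 1, size=n_blocks)
    blocks = [historical[s:s + block_size] for s in starts]
    return np.concatenate(blocks, axis=0)[:n_months]

# Synthetic two-asset history for illustration only (not real market data)
rng = np.random.default_rng(2)
fake_history = rng.normal(0.005, 0.04, size=(600, 2))
weights = np.array([0.6, 0.4])  # hypothetical mix, rebalanced every month

asset_paths = joint_block_bootstrap(fake_history, n_months=360, rng=rng)
portfolio_returns = asset_paths @ weights
```

Because whole rows are sampled, no correlation matrix is ever estimated. Whatever co-movement the history contains, including crisis months, is carried into the simulated paths as-is.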
Modelling Withdrawals and Contributions
Now apply cash flows. For a retirement portfolio, the standard framing is: given initial wealth W₀, a withdrawal schedule W(t), and a portfolio return path R(t), does the portfolio last through the horizon?
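def retirement_sim(initial, returns, withdrawals, inflation=0.025):
    # returns: nominal annual returns; withdrawals: first-year withdrawal amount
    wealth = initial
    history = [wealth]
    for t, r in enumerate(returns):
        # Inflate the withdrawal to year-t nominal dollars
        w = withdrawals * (1 + inflation) ** t
        # Apply the year's return first, then withdraw at year end.
        # Withdrawing at the start of the year instead would be the more
        # conservative choice when returns are positive on average.
        wealth = wealth * (1 + r) - w
        if wealth < 0:
            return history, False
        history.append(wealth)
    return history, True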
Run this 10,000 times with bootstrapped return paths and you get a distribution of outcomes. The headline statistic is usually the "success rate" — the fraction of simulations where wealth remains positive through the horizon. A 95% success rate on a 30-year retirement simulation is generally considered safe. 90% is borderline. Below 90% and you are gambling.
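A sketch of the full loop, under stated assumptions: the return history is synthetic, the data is annual, and blocks are three years long (the `block_size=3` choice for annual data is my assumption, not a recommendation from the text). The `survives` and `success_rate` names are hypothetical.

```python
import numpy as np

def survives(initial, annual_returns, first_withdrawal, inflation=0.025):
    """Return True if wealth stays non-negative through every year of the path."""
    wealth = initial
    for t, r in enumerate(annual_returns):
        wealth = wealth * (1 + r) - first_withdrawal * (1 + inflation) ** t
        if wealth < 0:
            return False
    return True

def success_rate(history, initial, first_withdrawal, years=30,
                 block_size=3, n_sims=10_000, seed=0):
    """Fraction of block-bootstrapped annual return paths on which the portfolio survives."""
    rng = np.random.default_rng(seed)
    history = np.asarray(history, dtype=float)
    n_blocks = -(-years // block_size)  # ceiling division
    wins = 0
    for _ in range(n_sims):
        starts = rng.integers(0, len(history) - block_size + 1, size=n_blocks)
        path = np.concatenate([history[s:s + block_size] for s in starts])[:years]
        wins += survives(initial, path, first_withdrawal)
    return wins / n_sims

# Synthetic nominal annual returns for illustration only (not real market data)
fake_history = np.random.default_rng(42).normal(0.07, 0.15, size=90)
rate_4pct = success_rate(fake_history, 1_000_000, 40_000)
rate_8pct = success_rate(fake_history, 1_000_000, 80_000)
print(f"4% initial withdrawal: {rate_4pct:.1%}, 8%: {rate_8pct:.1%}")
```

Using the same seed for both withdrawal levels means both are tested on identical paths, so the difference between them is not sampling noise.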
Sequence-of-Returns Risk
This is the statistic that matters most for retirees and is absent from every summary output in a broken Monte Carlo. Sequence-of-returns risk captures the fact that bad returns early in retirement are dramatically more damaging than the same bad returns later, because you are drawing down from a shrinking base.
Two paths with identical average returns can produce completely different outcomes. A path with bad years in years 1-3 and good years in years 20-30 can fail entirely, while the reverse path leaves you comfortably ahead. A good simulation reports not just the median and success rate, but also the conditional distribution of outcomes given bad first-five-years returns. That is the number that should drive your spending decisions in the first few years of retirement.
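A sketch of how that conditional number could be computed. The paths here are synthetic i.i.d. normal draws, used only to keep the example short; the function names, the 5% initial withdrawal, and the "worst 20% of early outcomes" cutoff are all illustrative assumptions.

```python
import numpy as np

def simulate_survival(paths, initial, first_withdrawal, inflation=0.025):
    """Vectorised survival flag for each row of annual returns in `paths`."""
    wealth = np.full(paths.shape[0], float(initial))
    alive = np.ones(paths.shape[0], dtype=bool)
    for t in range(paths.shape[1]):
        wealth = wealth * (1 + paths[:, t]) - first_withdrawal * (1 + inflation) ** t
        alive &= wealth >= 0  # once a path fails it stays failed
    return alive

def conditional_success(paths, survived, n_early=5, worst_fraction=0.2):
    """Success rate among paths with the worst cumulative return over the first n_early years."""
    early_growth = np.prod(1 + paths[:, :n_early], axis=1)
    cutoff = np.quantile(early_growth, worst_fraction)
    bad_start = early_growth <= cutoff
    return survived[bad_start].mean(), survived.mean()

rng = np.random.default_rng(3)
paths = rng.normal(0.07, 0.15, size=(20_000, 30))  # synthetic paths, illustration only
survived = simulate_survival(paths, 1_000_000, 50_000)
bad_start_rate, overall_rate = conditional_success(paths, survived)
print(f"overall: {overall_rate:.1%}, after a bad first five years: {bad_start_rate:.1%}")
```

The gap between the two rates is a direct measure of how much the first few years matter for a given spending plan.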
The Four Headline Statistics
A real Monte Carlo output should report at minimum:
- Success rate — the fraction of simulations where wealth survives the horizon without going negative.
- Median terminal wealth — the middle outcome, in today's dollars.
- 5th percentile terminal wealth — the bad-but-not-worst outcome. This is the number you plan around.
- Maximum in-simulation drawdown — the worst peak-to-trough decline across all paths. This is the number that tells you whether you can psychologically survive the strategy.
A naive tool will give you "median terminal wealth" and "success rate" and stop there. Those are not enough to make real decisions.
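The four statistics can be computed from an array of simulated wealth paths along these lines. This is a sketch: it assumes wealth is already in today's dollars, that column 0 holds initial wealth, and that failed paths are floored at zero. The `headline_stats` name and the toy input are hypothetical.

```python
import numpy as np

def headline_stats(wealth_paths):
    """Summarise simulated wealth paths of shape (n_sims, n_periods + 1)."""
    wealth_paths = np.asarray(wealth_paths, dtype=float)
    terminal = wealth_paths[:, -1]
    running_peak = np.maximum.accumulate(wealth_paths, axis=1)
    drawdowns = 1 - wealth_paths / running_peak  # fraction below the running peak
    max_dd_per_path = drawdowns.max(axis=1)
    return {
        "success_rate": float(np.mean(wealth_paths.min(axis=1) > 0)),
        "median_terminal": float(np.median(terminal)),
        "p05_terminal": float(np.percentile(terminal, 5)),
        "worst_drawdown": float(max_dd_per_path.max()),
        "median_drawdown": float(np.median(max_dd_per_path)),
    }

# Toy input: three paths, one of which runs out of money
toy = np.array([
    [100.0, 110.0, 90.0, 120.0],
    [100.0, 80.0, 60.0, 0.0],
    [100.0, 105.0, 115.0, 130.0],
])
stats = headline_stats(toy)
```

The median of the per-path maximum drawdown is included alongside the worst case as an extra, since a single worst path out of thousands can be dominated by one extreme draw. With withdrawals in the paths, these drawdowns reflect spending as well as market losses.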
[Figure: Terminal wealth distribution, $M, 30y horizon, 4% withdrawal.]
Success rate is not the same as probability of success. Success rate is a Monte Carlo output conditional on the assumptions you gave it. If your assumptions are wrong (if the future real return of global equities is 3% instead of 6%, for example), your success rate is meaningless no matter how many simulations you ran.
Common Pitfalls
Using nominal rather than real returns. A 7% nominal return with 2.5% inflation is roughly a 4.4% real return (1.07 / 1.025 - 1), slightly less than the 4.5% you get by simple subtraction. If you are comparing to real-world withdrawal needs that grow with inflation, model real returns throughout.
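The nominal-to-real conversion is a one-liner. This sketch uses the same 7% and 2.5% figures; the `real_return` helper name is hypothetical.

```python
def real_return(nominal, inflation):
    """Exact conversion from a nominal return to a real return."""
    return (1 + nominal) / (1 + inflation) - 1

exact = real_return(0.07, 0.025)
shortcut = 0.07 - 0.025  # simple subtraction, a close approximation at low inflation
print(f"exact: {exact:.4%}, subtraction shortcut: {shortcut:.4%}")
```

The subtraction shortcut gives 4.5%; the exact figure is a little lower, about 4.39%. The gap widens as inflation rises, which matters if the sample includes high-inflation periods.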
Ignoring taxes. The difference between pre-tax and after-tax withdrawal needs can easily be 20-30% for a high-bracket retiree. A Monte Carlo that models gross returns and gross withdrawals will produce systematically optimistic results.
Too short a historical sample. If you are bootstrapping from the last 20 years of data, you are sampling from a low-inflation, high-equity-return regime. Extend the history to at least 50 years, preferably including the 1970s stagflation period, before trusting the output.
Over-optimising the spending rule. If you tune a dynamic spending rule to maximise simulated wealth, you will overfit to the historical sample. Check robustness to alternative return assumptions before committing to a rule.
Reporting only the headline. A single number (success rate) cannot capture a distributional output. Always report the bands.
The Endowment Variant
For an endowment or perpetual portfolio, the simulation framing flips. Instead of "does the portfolio last the horizon," the question becomes "what is the distribution of real spending the portfolio can support forever?" The standard approach is to simulate the portfolio under a proposed spending rule (e.g., 4% of the three-year trailing average of wealth) and report the distribution of real spending over time, the probability of spending declining by more than 20% in any five-year window, and the probability of real wealth erosion over the full horizon.
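A sketch of the endowment version, assuming annual real returns and spending taken at the start of each year. The function names are hypothetical, and the return paths are synthetic normal draws used only to exercise the code.

```python
import numpy as np

def endowment_sim(real_returns, initial=100.0, rate=0.04, window=3):
    """Spend `rate` times the trailing `window`-year average of start-of-year wealth.

    real_returns: 1-D array of annual real returns. Returns (spending, wealth) arrays.
    """
    wealth = [float(initial)]
    spending = []
    for r in real_returns:
        trailing = np.mean(wealth[-window:])
        spend = min(rate * trailing, wealth[-1])  # cannot spend more than is left
        spending.append(spend)
        wealth.append((wealth[-1] - spend) * (1 + r))
    return np.array(spending), np.array(wealth)

def has_spending_drop(spending, years=5, threshold=0.20):
    """True if spending falls by more than `threshold` over any `years`-year span."""
    ratio = spending[years:] / spending[:-years]
    return bool(np.any(ratio < 1 - threshold))

rng = np.random.default_rng(4)
n_sims, horizon = 1_000, 50
drops, eroded = 0, 0
for _ in range(n_sims):
    path = rng.normal(0.045, 0.12, size=horizon)  # synthetic real returns, illustration only
    spending, wealth = endowment_sim(path)
    drops += int(has_spending_drop(spending))
    eroded += int(wealth[-1] < wealth[0])
p_drop, p_eroded = drops / n_sims, eroded / n_sims
print(f"P(spending falls >20% over some 5y span): {p_drop:.1%}, P(real wealth eroded): {p_eroded:.1%}")
```

In practice the synthetic draws would be replaced with block-bootstrapped real returns, and the full distribution of `spending` by year would be reported alongside the two probabilities.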
These are all decision-relevant statistics. They are the reason endowment committees meet. A Monte Carlo that cannot answer them is not a real tool.
The Honest Summary
Monte Carlo is a powerful technique that is easy to abuse. Done well, it produces a probabilistic answer to a question that has no deterministic answer. Done poorly, it produces a false sense of precision that can cause real people to make bad decisions about retirement, endowment spending, or long-horizon investing.
The techniques above — Student's t or bootstrapped distributions, block resampling, joint-asset resampling, and reporting the full set of decision-relevant statistics — are the bare minimum for producing a simulation that someone should actually rely on. Anything less is theatre.