Thursday, June 25, 2026

Bayesian Sample Size Calculator

Estimate per-arm sample size for a two-arm binary outcome trial using Beta-Binomial conjugate priors and a target posterior probability of treatment superiority.

Quick Answer

Bayesian sample size planning targets a posterior probability — such as P(treatment success rate > control | data) ≥ 0.95 — rather than frequentist power at fixed alpha. This calculator uses Beta-Binomial conjugate priors to estimate per-arm enrollment for two-arm binary trials, aligned with FDA adaptive design guidance. Priors must be pre-specified in the statistical analysis plan; final sizing requires simulation by a qualified biostatistician.

Planning criterion — posterior superiority
P(pT > pC | data, priors) ≥ γ
pC, pT = control and treatment success rates
γ = posterior threshold (e.g. 0.95)
Prior: p ~ Beta(α, β)
Posterior after s successes in n subjects: Beta(α + s, β + n − s)
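
The criterion above can be illustrated with a minimal Python sketch. The function names, the Beta(1, 1) priors, and the success counts are hypothetical values chosen for illustration; they are not taken from the calculator. The probability is estimated by sampling from the two independent Beta posteriors.

```python
import random

def posterior(alpha, beta, successes, n):
    """Conjugate update: Beta(alpha, beta) prior plus s successes in n subjects."""
    return alpha + successes, beta + n - successes

def prob_superiority(post_c, post_t, draws=20000, seed=1):
    """Monte Carlo estimate of P(pT > pC) under independent Beta posteriors."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*post_t) > rng.betavariate(*post_c)
        for _ in range(draws)
    )
    return wins / draws

# Hypothetical data: 30/100 control and 45/100 treatment successes,
# with Beta(1, 1) priors on both arms (an assumption for this example).
post_c = posterior(1, 1, 30, 100)   # Beta(31, 71)
post_t = posterior(1, 1, 45, 100)   # Beta(46, 56)
p_sup = prob_superiority(post_c, post_t)
meets_threshold = p_sup >= 0.95     # gamma = 0.95
```

Sampling is used here because it needs no special functions; a production analysis would more likely use numerical integration or a validated statistics library.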

Simple mode — prior mean rates

Enter expected control and treatment success rates. The calculator builds Beta priors from prior strength (equivalent sample size) and finds the smallest n per arm where the approximate posterior probability of superiority meets your threshold at expected event counts.

Prior mean success rates

Prior strength: the weight of the prior relative to new data, expressed as an equivalent sample size. A value of 2 is weakly informative; 10 or more borrows more historical information.
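
One common way to turn a prior mean and a prior strength into Beta parameters is to set α + β equal to the strength. The sketch below assumes that mapping; the calculator's exact construction is not documented here, and the helper names and data are hypothetical.

```python
def beta_from_mean(mean, strength):
    """Beta(alpha, beta) with the given mean and alpha + beta = strength."""
    if not 0 < mean < 1 or strength <= 0:
        raise ValueError("mean must be in (0, 1) and strength must be positive")
    return mean * strength, (1 - mean) * strength

def posterior_mean(alpha, beta, successes, n):
    """Posterior mean after a conjugate Beta-Binomial update."""
    return (alpha + successes) / (alpha + beta + n)

weak = beta_from_mean(0.30, 2)      # roughly Beta(0.6, 1.4)
strong = beta_from_mean(0.30, 10)   # roughly Beta(3, 7)

# Same hypothetical data under each prior: 12 successes in 20 subjects.
weak_mean = posterior_mean(*weak, 12, 20)
strong_mean = posterior_mean(*strong, 12, 20)
```

With the observed rate at 0.60 and the prior mean at 0.30, the stronger prior pulls the posterior mean further toward 0.30 than the weaker one does.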

Decision threshold and planning assumptions
Outputs: n per arm (subjects), total N (subjects), P(superiority) at expected data, and frequentist n per arm (80% power).

Prior parameters — Beta(α, β) per arm

Specify conjugate Beta priors directly for control and treatment arms. Posterior updating follows Beta(α + successes, β + failures).

Control arm prior
Treatment arm prior
Threshold and expected data
Outputs: n per arm (subjects), total N (subjects), P(superiority) at expected data, and prior means (p_C / p_T).

Bayesian vs frequentist sample size

The standard frequentist sample size calculator sizes trials on power — the probability of detecting a prespecified effect at alpha = 0.05. Bayesian planning instead asks: after collecting n patients per arm, what is the probability that the treatment success rate exceeds control, given the data and pre-specified priors?
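
For comparison, a textbook frequentist calculation for two proportions can be sketched as follows. It assumes a two-sided test at alpha = 0.05, 80% power, and the unpooled-variance normal approximation; whether the page's frequentist figure uses exactly this formula is an assumption, and the rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def frequentist_n_per_arm(p_c, p_t, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided test of two proportions (unpooled variance)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for two-sided alpha
    z_power = z.inv_cdf(power)           # quantile matching the target power
    variance = p_c * (1 - p_c) + p_t * (1 - p_t)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_t - p_c) ** 2)

# Hypothetical planning rates: 30% control, 45% treatment.
n_freq = frequentist_n_per_arm(0.30, 0.45)
```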

Informative priors (for example, historical control data) can reduce required enrollment when the borrowed information is credible. Vague priors contribute little information, so the required n is larger and tends toward the frequentist result. Both frameworks require pre-specification; switching frameworks after seeing unblinded data is not acceptable.
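
The effect of prior strength can be illustrated with a small sensitivity sweep. This sketch holds hypothetical data fixed (12/40 control and 18/40 treatment successes) and varies the prior strength, using a normal approximation to the Beta posteriors. The priors are centered on the same rates as the data, so they agree with it by construction; a prior in conflict with the data would pull the result the other way.

```python
from statistics import NormalDist

def beta_moments(a, b):
    """Mean and variance of Beta(a, b)."""
    mean = a / (a + b)
    return mean, mean * (1 - mean) / (a + b + 1)

def approx_prob_superiority(post_c, post_t):
    """Normal approximation to P(pT > pC) for independent Beta posteriors."""
    mc, vc = beta_moments(*post_c)
    mt, vt = beta_moments(*post_t)
    return NormalDist().cdf((mt - mc) / (vc + vt) ** 0.5)

def prob_at_strength(strength, n=40, s_c=12, s_t=18, m_c=0.30, m_t=0.45):
    """P(superiority) at fixed data for priors with alpha + beta = strength."""
    post_c = (m_c * strength + s_c, (1 - m_c) * strength + n - s_c)
    post_t = (m_t * strength + s_t, (1 - m_t) * strength + n - s_t)
    return approx_prob_superiority(post_c, post_t)

# Hypothetical prior strengths: weak, moderate, and strongly informative.
sensitivity = {ess: prob_at_strength(ess) for ess in (2, 10, 40)}
```

In this example only the strongest prior lifts the posterior probability above 0.95 at n = 40 per arm, which is why any enrollment saving rests entirely on the credibility of the borrowed information.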

FDA adaptive and Bayesian guidance

FDA's 2019 adaptive design guidance describes pre-planned modifications — sample size re-estimation, arm dropping, enrichment — based on interim data. Bayesian posterior probabilities are commonly used as interim decision metrics in Phase II and adaptive Phase III designs.

FDA also provides guidance on Bayesian statistics in medical device trials. Drug and biologic sponsors should document prior elicitation, simulation of operating characteristics, handling of multiplicity, and sensitivity to prior choice in the statistical analysis plan before regulatory interaction.

Interim analysis and adaptive enrichment for pharma BD

Business development and portfolio teams evaluate whether a trial can stop early for success, expand in a biomarker-positive subgroup, or fail fast when posterior evidence is weak. Bayesian interim metrics support go/no-go decisions at data safety monitoring board (DSMB) reviews without waiting for fixed frequentist boundaries — provided the design was simulated upfront.
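
A rough sketch of simulating such a design upfront is shown below. It assumes one interim look at 50 per arm and a final look at 100 per arm, Beta(1, 1) priors, a 0.95 threshold at both looks, and a normal approximation to the posterior probability. All of these are hypothetical choices, not the calculator's settings. Running it under a no-effect scenario shows how repeated looks at a fixed threshold raise the false-positive rate relative to a single look.

```python
import random
from statistics import NormalDist

def approx_prob_superiority(s_c, s_t, n, prior=(1.0, 1.0)):
    """Normal approximation to P(pT > pC | data), same Beta prior on each arm."""
    def moments(s):
        a, b = prior[0] + s, prior[1] + n - s
        mean = a / (a + b)
        return mean, mean * (1 - mean) / (a + b + 1)
    mc, vc = moments(s_c)
    mt, vt = moments(s_t)
    return NormalDist().cdf((mt - mc) / (vc + vt) ** 0.5)

def simulate(p_c, p_t, looks=(50, 100), gamma=0.95, trials=2000, seed=7):
    """Return (P(declare success), P(stop at the first look))."""
    rng = random.Random(seed)
    successes = early = 0
    for _ in range(trials):
        s_c = s_t = enrolled = 0
        for i, n in enumerate(looks):
            # Enroll the additional patients per arm needed to reach this look.
            for _patient in range(n - enrolled):
                s_c += rng.random() < p_c
                s_t += rng.random() < p_t
            enrolled = n
            if approx_prob_superiority(s_c, s_t, n) >= gamma:
                successes += 1
                early += i == 0
                break
    return successes / trials, early / trials

under_null = simulate(0.30, 0.30)   # no true effect: success is a false positive
under_alt = simulate(0.30, 0.45)    # hypothetical true effect
```

A full design simulation would also cover futility rules, dropout, and a grid of true rates; this sketch only shows the basic loop.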

Adaptive enrichment (restricting later enrollment to a subgroup, such as biomarker-positive patients, that interim data suggest is more likely to benefit) changes effective sample size and prior assumptions. BD models should align enrollment caps, posterior thresholds, and probability-of-success estimates with the biostatistics group's simulation output, not a single-point calculator result.

Method note

This calculator uses a normal approximation to independent Beta posteriors at expected success counts (rounded from n × expected rate). A simulation-lite check with 5,000 Monte Carlo draws from the posteriors is reported when n is found. Results are planning approximations only; confirm with full trial simulation.
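
The procedure described in this note can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the calculator's source code: the function names, the enrollment cap, the example priors, and the use of Python's `round` (which rounds ties to the nearest even integer) are all assumptions.

```python
import random
from statistics import NormalDist

def beta_moments(a, b):
    """Mean and variance of Beta(a, b)."""
    mean = a / (a + b)
    return mean, mean * (1 - mean) / (a + b + 1)

def approx_prob_superiority(post_c, post_t):
    """Normal approximation to P(pT > pC) for independent Beta posteriors."""
    mc, vc = beta_moments(*post_c)
    mt, vt = beta_moments(*post_t)
    return NormalDist().cdf((mt - mc) / (vc + vt) ** 0.5)

def posteriors_at_expected(n, prior_c, prior_t, rate_c, rate_t):
    """Posteriors when each arm observes its expected (rounded) success count."""
    s_c, s_t = round(n * rate_c), round(n * rate_t)
    return ((prior_c[0] + s_c, prior_c[1] + n - s_c),
            (prior_t[0] + s_t, prior_t[1] + n - s_t))

def smallest_n(prior_c, prior_t, rate_c, rate_t, gamma=0.95, cap=2000):
    """Smallest n per arm meeting the threshold, or None if the cap is hit."""
    for n in range(1, cap + 1):
        posts = posteriors_at_expected(n, prior_c, prior_t, rate_c, rate_t)
        if approx_prob_superiority(*posts) >= gamma:
            return n
    return None

def monte_carlo_check(post_c, post_t, draws=5000, seed=1):
    """Cross-check the approximation by sampling from both posteriors."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(*post_t) > rng.betavariate(*post_c)
               for _ in range(draws))
    return wins / draws

# Hypothetical inputs: prior means 0.30 and 0.45, prior strength 2 per arm.
prior_c, prior_t = (0.6, 1.4), (0.9, 1.1)
n = smallest_n(prior_c, prior_t, 0.30, 0.45)
if n is not None:
    posts = posteriors_at_expected(n, prior_c, prior_t, 0.30, 0.45)
    approx = approx_prob_superiority(*posts)
    simulated = monte_carlo_check(*posts)

# With a very small cap the search fails and returns None.
infeasible = smallest_n(prior_c, prior_t, 0.30, 0.45, cap=10)
```

Because success counts are rounded, the approximate probability is not strictly increasing in n, so the first n that meets the threshold can be followed by slightly larger values that do not. That is one reason to treat the result as a planning approximation.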

Educational purpose only: Bayesian sample size depends on priors, decision rules, interim schedules, and multiplicity. Final protocols require qualified biostatistician review aligned with ICH E9, ICH E9(R1), and FDA adaptive/Bayesian guidance.

Frequently Asked Questions

Frequentist planning targets power — the probability of rejecting the null hypothesis when a prespecified effect is true — at a fixed alpha. Bayesian planning often targets a posterior probability, such as P(treatment rate > control rate | data) ≥ 0.95. Priors can shrink required sample size when historical data exist, but conclusions become sensitive to prior choice and must be pre-specified in the statistical analysis plan.
For a two-arm binary trial, posterior probability of superiority is P(p_treatment > p_control | observed data and prior). Values near 1.0 indicate strong Bayesian evidence that the treatment success rate exceeds control; values near 0.5 indicate that the data favor neither arm. It is a direct probability statement about the parameter, unlike a p-value, which measures evidence against a null hypothesis.
Bayesian designs are recognized in FDA guidance: FDA has issued guidance on the use of Bayesian statistics in medical device clinical trials and discusses Bayesian approaches in its adaptive design guidance. Sponsors must pre-specify priors, decision rules, operating characteristics, and handling of multiplicity. Bayesian designs are more common in early-phase and adaptive settings but require rigorous simulation of frequentist and Bayesian operating characteristics.
FDA's 2019 guidance on adaptive designs for clinical trials of drugs and biologics describes pre-planned modifications based on accumulating data, including sample size re-estimation, arm dropping, and enrichment. Bayesian posterior probabilities are often used as decision metrics at interim looks, but the guidance emphasizes controlling Type I error or documenting Bayesian operating characteristics when adaptations are data-driven.
Bayesian sample size is very sensitive to the choice of prior. Informative priors centered on historical control or treatment rates can substantially reduce planned n, while vague Beta(1,1) priors require more data and give results close to frequentist approaches. Sensitivity analyses across plausible priors should accompany any Bayesian sample size recommendation.
Use Bayesian sizing when the protocol or statistical analysis plan specifies a posterior probability threshold, when borrowing historical control data is scientifically justified, or when planning adaptive or platform trials with Bayesian interim decision rules. For standard confirmatory superiority trials with fixed alpha and power, the frequentist sample size calculator is usually the primary planning tool.
With a Beta(α, β) prior on a binomial success probability and s observed successes out of n trials, the posterior is Beta(α + s, β + n − s). This conjugacy allows closed-form posterior updating and efficient computation of P(treatment > control) when each arm has an independent Beta prior.
Prior strength (sometimes called prior effective sample size) expresses how much weight the prior carries relative to new data. A Beta prior with equivalent sample size 10 contributes information similar to 10 hypothetical patients. Higher prior strength pulls posterior estimates toward the prior mean and can reduce required new enrollment when the prior is accurate.
Interim looks compute posterior probabilities with partial data. A trial may stop early for efficacy when P(superiority) exceeds a threshold, or continue enrollment when uncertainty remains. Planned sample size must reflect the interim schedule, alpha spending or its Bayesian analogue, and the possibility of adaptation, not the final analysis alone.
Historical control data can be borrowed when the historical population, endpoint definition, and standard of care are exchangeable with the current control arm. Methods range from power priors to meta-analytic priors. Regulators expect justification of comparability, sensitivity analyses without borrowing, and documentation of the effective sample size borrowed.
An enrollment cap reflects budget, recruitment feasibility, or protocol limits. If the posterior threshold cannot be reached within the cap at expected event rates, the calculator reports that the design goal is infeasible under current assumptions, which should prompt revision of thresholds, priors, or expected effect size before protocol finalization.
This tool does not produce a final protocol sample size. It provides an approximate planning estimate using a normal approximation to Beta posteriors and expected event counts. Final sample size requires simulation of full operating characteristics, multiplicity adjustments, dropout, stratification, and review by a qualified biostatistician aligned with ICH E9 and the statistical analysis plan.