Against Untuned Priors

A skeptical look at default priors in practical inference

#bayesian #statistics #priors

We are liable to love the prior that flatters our ignorance.

When defaults are misaligned with domain constraints, posterior mass collects in implausible regions. The problem becomes apparent when we examine the posterior predictive distribution, as illustrated in Figure 1.

    Density
      │     ╱╲
   5  │    ╱  ╲      ╱‾‾╲
      │   ╱    ╲    ╱    ╲
   3  │  ╱      ╲  ╱      ╲
      │ ╱        ╲╱        ╲
   1  │╱──────────┼──────────╲
      └───────────┼───────────────→ θ
                  0

   Blue: Well-tuned prior
   Red:  Default uniform prior
Figure 1: Comparison of posterior distributions with well-tuned (blue) versus default uninformative (red) priors. The default prior allows implausible parameter values.
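
To make the failure mode concrete, here is a minimal conjugate sketch; the numbers, the positivity constraint, and the two priors are invented for illustration, not taken from any real analysis. The parameter θ is a positive quantity observed with known Gaussian noise, and a diffuse "default" prior leaves a sizeable chunk of posterior mass below zero:

    import numpy as np
    from scipy import stats

    # Hypothetical example: theta is a positive rate, observed twice with
    # known Gaussian noise sd sigma = 2.0 (values chosen for illustration).
    sigma = 2.0
    y = np.array([0.8, 1.3])

    def normal_posterior(mu0, tau0):
        """Conjugate normal-normal update; returns posterior mean and sd."""
        prec = 1 / tau0**2 + len(y) / sigma**2
        mean = (mu0 / tau0**2 + y.sum() / sigma**2) / prec
        return mean, np.sqrt(1 / prec)

    priors = {"diffuse N(0, 100^2)": (0.0, 100.0),   # 'default' flat-ish prior
              "tuned   N(1, 0.5^2)": (1.0, 0.5)}     # respects domain knowledge

    for label, (mu0, tau0) in priors.items():
        mean, sd = normal_posterior(mu0, tau0)
        p_neg = stats.norm.cdf(0, loc=mean, scale=sd)  # mass on impossible theta < 0
        print(f"{label}: mean {mean:.2f}, sd {sd:.2f}, P(theta < 0) = {p_neg:.2f}")

With these made-up numbers the diffuse prior leaves roughly a fifth of the posterior below zero, while the tuned prior keeps that mass under two percent. The data are identical; only the prior changed.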

Hierarchical shrinkage, empirical Bayes, and prior predictive checks improve calibration when thoughtfully applied. The relationship between these approaches is shown in Figure 2.

    Informative    ─┐
                    │  Empirical Bayes
    Weakly          │      ↑
    Informative     │  Hierarchical
                    │      ↑
    Uninformative   │  Default (uniform/Jeffreys)
                    │
                    └─ Increasing Prior Knowledge
Figure 2: Hierarchy of prior specification methods, from least to most informative.
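
As a sketch of the middle rungs of Figure 2, here is a toy empirical-Bayes shrinkage calculation; the group effects and standard errors are invented for illustration. The between-group variance is estimated from the data itself, and each group estimate is pulled toward the grand mean accordingly:

    import numpy as np

    # Toy empirical-Bayes shrinkage for J group effects (numbers invented).
    # Each group j reports an estimate y[j] with known standard error se[j].
    y = np.array([2.8, 0.7, -0.4, 1.9, 1.1, 0.3])
    se = np.full_like(y, 1.0)

    # Hyperparameters estimated from the data: the grand mean, plus a
    # method-of-moments estimate of the between-group variance tau^2,
    # using Var(y) ~ tau^2 + mean(se^2), truncated at zero.
    mu_hat = y.mean()
    tau2_hat = max(y.var(ddof=1) - np.mean(se**2), 0.0)

    # Posterior mean for each group under a N(mu_hat, tau2_hat) prior:
    # a precision-weighted average of the raw estimate and the grand mean.
    weight_on_mean = se**2 / (se**2 + tau2_hat)
    theta_hat = weight_on_mean * mu_hat + (1 - weight_on_mean) * y

    for j, (raw, shrunk) in enumerate(zip(y, theta_hat)):
        print(f"group {j}: raw {raw:+.2f} -> shrunk {shrunk:+.2f}")

A full hierarchical model would put a prior on tau and propagate its uncertainty rather than plugging in a point estimate; the empirical-Bayes version trades that honesty for simplicity.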

As Figure 1 shows, badly chosen priors fail even simple posterior sanity checks. The key insight from Figure 2 is that we can build up prior information systematically instead of defaulting to ignorance.
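
A prior predictive check is the cheapest such sanity check: simulate data from the prior alone and ask whether it resembles anything we could plausibly observe. A minimal sketch, assuming a made-up Poisson count model with a normal prior on the log-rate:

    import numpy as np

    # Prior predictive check (illustrative model, not from the post):
    # daily counts ~ Poisson(exp(log_rate)), log_rate ~ Normal(0, sd).
    rng = np.random.default_rng(0)
    n_sims = 5_000

    def prior_predictive(sd):
        log_rate = rng.normal(0.0, sd, n_sims)   # draw rates from the prior
        return rng.poisson(np.exp(log_rate))     # simulate the data they imply

    for sd in (5.0, 1.0):  # a very wide 'default' prior vs. a weakly informative one
        sims = prior_predictive(sd)
        print(f"prior sd {sd}: median count {np.median(sims):.0f}, "
              f"99th percentile {np.percentile(sims, 99):.0f}")

If the 99th percentile of simulated daily counts lands in the tens of thousands, the "uninformative" prior is in fact making a very strong and very wrong claim about the world, which is exactly what Figure 1 depicts in posterior form.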