Bayesian model comparison
| Key concepts | |
|---|---|
| Bayes factors | A ratio of marginal likelihoods that quantifies the evidence in favor of one model compared to another. |
| Marginal likelihood (evidence) | The probability of the observed data given a model, integrated over all possible parameter values. |
| Posterior model probability | The probability that a model is true given the observed data and prior information. |
| Information criteria | Approximations to Bayes factors, e.g. BIC, AIC, DIC. |
| Predictive accuracy | How well a model predicts new or unseen data, often assessed through cross-validation or WAIC. |
| Model averaging | Combining predictions from multiple models, weighted by their posterior probabilities or predictive performance. |

| Methods | |
|---|---|
| Separate estimation | Comparing models based on posterior predictive distributions, Bayes factors, and information criteria. |
| Comparative estimation | Assessing the "distance" between posterior distributions using measures like Kullback–Leibler divergence. |
| Simultaneous estimation | Exploring the model space using techniques like reversible jump MCMC (RJMCMC) or birth-and-death MCMC (BDMCMC). |
Bayesian model comparison is the comparison of statistical models' fit to data using Bayesian statistics. It is used for diverse tasks such as variable selection in regression, determining the number of components in a mixture model, and choosing among parametric families. The goal of model comparison may be to select a single "best" model, or to improve estimation through model ensemble averaging, in which expectation values from different models are weighted by their posterior probabilities.
Common methods for Bayesian model comparison include:
- Separate estimation: Comparing models through posterior predictive distributions, Bayes factors, and approximations like BIC and DIC.
- Comparative estimation: Assessing the "distance" between posterior distributions using measures like Kullback-Leibler divergence.
- Simultaneous estimation: Exploring the model space using techniques like RJMCMC or BDMCMC.
Setup
Bayesian evidence, or marginal likelihood, for a model $M$ is the average likelihood of observing the data $D$ under the prior distribution of the model parameters $\theta$:

$$p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, \mathrm{d}\theta.$$

When comparing two models, $M_1$ and $M_2$, the Bayes factor is the ratio of their evidences:

$$B_{12} = \frac{p(D \mid M_1)}{p(D \mid M_2)}.$$

A Bayes factor greater than 1 favors $M_1$, while a value less than 1 favors $M_2$. The magnitude of the Bayes factor reflects the strength of evidence, often interpreted using Jeffreys' scale.
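As a concrete illustration of these definitions, the evidence and Bayes factor can be computed for a toy coin-flip comparison. The models, priors, and data below are invented for the example; this is a minimal sketch, not a general-purpose procedure.

```python
import numpy as np
from scipy import stats
from scipy.integrate import trapezoid

# Toy data: 7 heads in 10 coin flips (invented for the example).
heads, n = 7, 10

# M1: fair coin. Its evidence is just the likelihood at p = 0.5,
# since the model has no free parameters.
evidence_m1 = stats.binom.pmf(heads, n, 0.5)

# M2: unknown bias p with a uniform Beta(1, 1) prior. The evidence
# integrates the likelihood over the prior; for this prior the
# integral is analytic (equal to 1 / (n + 1)), which lets the grid
# result be checked.
grid = np.linspace(0.0, 1.0, 10_001)
evidence_m2 = trapezoid(stats.binom.pmf(heads, n, grid), grid)

# Bayes factor B21: evidence for M2 relative to M1.
bayes_factor = evidence_m2 / evidence_m1  # < 1 here: the fair coin wins
```

Despite the biased-coin model fitting the data more flexibly, the Bayes factor here comes out below 1, favoring the simpler fair-coin model; averaging the likelihood over the prior penalizes the extra flexibility automatically.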
Generally, the prior probability is chosen to quantify Occam's razor. A model with many free parameters will generally fit the data better, but it may overfit and perform poorly on new, unseen data. This can be quantified by choosing a prior distribution that decreases with model parameter count.
Bayesian complexity measures the effective number of parameters that the data can support, accounting for parameters that are unconstrained by the data.[1]
Instead of choosing a single "best" model, Bayesian model averaging (BMA) combines predictions from multiple models, weighted by their posterior probabilities. This approach acknowledges uncertainty about the true model, incorporating it into the final inference.
Bayesian stacking, a more recent technique, weights models based on their out-of-sample predictive performance, using the entire dataset for model fitting. This method relaxes the assumption that the true model is within the set of candidate models.
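A minimal sketch of Bayesian model averaging for a toy coin-flip comparison follows. The two models, the data, the equal prior model odds, and the analytic evidence expressions are all assumptions invented for the illustration.

```python
import numpy as np
from scipy import stats

# Toy comparison (invented): M1 is a fair coin; M2 has an unknown
# bias p with a uniform Beta(1, 1) prior.
heads, n = 7, 10
evidence_m1 = stats.binom.pmf(heads, n, 0.5)  # likelihood at p = 0.5
evidence_m2 = 1.0 / (n + 1)                   # analytic evidence under Beta(1, 1)

# Posterior model probabilities, assuming equal prior odds.
z = np.array([evidence_m1, evidence_m2])
post_model = z / z.sum()

# Each model's predictive probability of heads on the next flip.
pred_m1 = 0.5
pred_m2 = (heads + 1) / (n + 2)  # mean of the Beta(8, 4) posterior

# BMA prediction: per-model predictions weighted by posterior probability.
bma_pred = post_model @ np.array([pred_m1, pred_m2])
```

The averaged prediction lands between the two per-model predictions, reflecting the remaining uncertainty about which model is correct.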
Approximations
Calculating the Bayesian evidence involves a multi-dimensional integral that is often computationally demanding. Several approximation methods exist, including:
- Laplace approximation: Approximates the posterior by a Gaussian centered at its mode, reducing the evidence integral to a closed form.
- Thermodynamic integration (simulated annealing): A numerical integration technique for complex likelihoods.
- Nested sampling: Recasts the multi-dimensional integral into a simpler one-dimensional form.
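Under the assumption of a well-peaked, roughly Gaussian posterior, the Laplace approximation can be sketched as follows for an invented one-parameter Gaussian model (for which the approximation happens to be exact up to numerical error):

```python
import numpy as np
from scipy import stats, optimize

# Invented example: y_i ~ Normal(mu, 1) with prior mu ~ Normal(0, 10^2).
rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.0, size=50)

def neg_log_joint(mu):
    # Negative log of likelihood times prior, as a function of mu.
    return -(stats.norm.logpdf(y, mu, 1.0).sum()
             + stats.norm.logpdf(mu, 0.0, 10.0))

# Posterior mode, and curvature of the negative log joint at the mode.
mu_hat = optimize.minimize_scalar(neg_log_joint).x
eps = 1e-4
hess = (neg_log_joint(mu_hat + eps) - 2.0 * neg_log_joint(mu_hat)
        + neg_log_joint(mu_hat - eps)) / eps**2

# One-dimensional Laplace approximation:
#   Z ≈ p(y, mu_hat) * sqrt(2 * pi / hess)
log_evidence = -neg_log_joint(mu_hat) + 0.5 * np.log(2.0 * np.pi / hess)
```

In higher dimensions the scalar curvature is replaced by the Hessian matrix and its determinant enters the correction term.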
Information criteria
A family of approximations to the Bayes factor, collectively named "information criteria", has been derived from information theory. These rely on simplifying assumptions that may not be satisfied in practice.[2] The most popular ones are:
- Akaike information criterion (AIC): Penalizes models based on the number of parameters.
- Bayesian information criterion (BIC): Similar to AIC, but with a stronger penalty for complexity.
- Deviance information criterion (DIC): Generalizes AIC to hierarchical modeling, using the effective number of parameters.
- Widely Applicable Information Criterion (WAIC): Generalizes AIC to singular statistical models, based on pointwise predictive densities.
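AIC and BIC can be computed directly from the maximized log-likelihood using the standard formulas AIC = −2 log L + 2k and BIC = −2 log L + k log n. The following sketch applies them to polynomial regression models on invented data; the data-generating process and candidate degrees are assumptions of the example.

```python
import numpy as np
from scipy import stats

# Invented data: linear truth plus Gaussian noise; candidate models
# are polynomials of degree 1 through 5 with Gaussian errors.
rng = np.random.default_rng(1)
n = 60
x = np.linspace(-1.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)

def criteria(degree):
    # Maximum-likelihood fit and the resulting log-likelihood.
    coef = np.polyfit(x, y, degree)
    resid = y - np.polyval(coef, x)
    sigma2 = resid @ resid / n  # MLE of the noise variance
    loglik = stats.norm.logpdf(resid, 0.0, np.sqrt(sigma2)).sum()
    k = degree + 2              # polynomial coefficients + noise variance
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * np.log(n)
    return aic, bic

aic = {d: criteria(d)[0] for d in range(1, 6)}
bic = {d: criteria(d)[1] for d in range(1, 6)}
best = min(bic, key=bic.get)  # BIC's preferred degree
```

Since log n > 2 for n ≥ 8, the BIC penalty is stronger than the AIC penalty here, so BIC tends to prefer lower-degree fits than AIC on the same data.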
Predictive accuracy
Model evaluation focuses on a model's predictive capacity rather than its fit to the observed data. Techniques like cross-validation and leave-one-out cross-validation (LOO-CV) partition the data to assess a model's performance on unseen data, mitigating overfitting.
Pareto smoothed importance sampling LOO-CV (PSIS-LOO-CV) enhances computational efficiency and stability of LOO-CV, particularly for complex models.
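For a model simple enough that the leave-one-out predictive distribution is available in closed form, LOO-CV can be performed exactly. The Normal model with known variance and a flat prior below is an invented example; PSIS-LOO-CV exists precisely to avoid this kind of n-fold refitting for models without such closed forms.

```python
import numpy as np
from scipy import stats

# Invented example: y_i ~ Normal(mu, 1) with known variance and a flat
# prior on mu. Leaving out point i, the posterior predictive for it is
# Normal(mean(rest), 1 + 1/m), where m is the number of remaining points.
rng = np.random.default_rng(2)
y = rng.normal(0.0, 1.0, size=30)

lppd_loo = 0.0
for i in range(len(y)):
    rest = np.delete(y, i)
    pred_sd = np.sqrt(1.0 + 1.0 / len(rest))
    # Log predictive density of the held-out point given the others.
    lppd_loo += stats.norm.logpdf(y[i], rest.mean(), pred_sd)
```

The accumulated quantity is the LOO expected log pointwise predictive density; higher (less negative) values indicate better out-of-sample prediction.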
Separate estimation
Consider two models, $M_1$ and $M_2$. For prediction, a natural Bayesian approach compares models based on their posterior predictive distributions. Another approach compares models by their posterior probabilities given the data. Using Bayes' rule, the choice between models can be made using the ratio

$$\frac{p(M_1 \mid D)}{p(M_2 \mid D)} = \frac{p(M_1)}{p(M_2)} \times \frac{p(D \mid M_1)}{p(D \mid M_2)}.$$

The second term in this ratio, the ratio of marginal likelihoods, is the Bayes factor (BF). It is obtained by integrating over all parameter values, not by maximizing as in likelihood ratios. While theoretically attractive, Bayes factors can be difficult to calculate, especially for complex models, and are sensitive to prior choices.
Approximations to the Bayes factor, such as BIC and DIC, provide computationally efficient alternatives. These criteria penalize models with greater complexity, favoring parsimonious models that adequately explain the data. However, these approximations rely on specific assumptions and may not be appropriate for all model types.
Other examples
Models can be compared by assessing the "distance" between their posterior (or posterior predictive) distributions. If the distance is small, the more parsimonious model might be preferred. Examples include the Kullback-Leibler divergence and entropy distance measures.
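When the two posteriors are well approximated by Gaussians, the Kullback-Leibler divergence has a closed form. The sketch below uses the standard formula for univariate Gaussians; the parameter values are arbitrary stand-ins for two models' posterior approximations.

```python
import numpy as np

def kl_gauss(m1, s1, m2, s2):
    # Closed-form KL(N(m1, s1^2) || N(m2, s2^2)) for univariate Gaussians.
    return np.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2.0 * s2**2) - 0.5

# Arbitrary example: Gaussian posterior approximations for the same
# parameter under two candidate models.
d = kl_gauss(0.0, 1.0, 0.5, 1.2)
```

Note that the KL divergence is asymmetric: kl_gauss(m1, s1, m2, s2) generally differs from kl_gauss(m2, s2, m1, s1), so the direction of the comparison must be chosen deliberately.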
MCMC methods
Markov chain Monte Carlo (MCMC) can be used to perform Bayesian model selection. The idea is to construct an MCMC chain in the space of possible models $\{M_1, \dots, M_K\}$, so that the chain samples models according to the model posterior distribution, or some other chosen distribution.
Reversible jump MCMC (or trans-dimensional MCMC)[3] allows "jumps" between models of different dimensions. Birth-and-death MCMC[4][5] is an alternative that models the time between jumps as a random variable, with model probabilities determined by the time spent in each model.
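The following toy sketch is not a reversible jump sampler (there are no trans-dimensional moves); it runs a Metropolis chain on the model index alone, which is feasible here only because each model's evidence is available in closed form. It illustrates the core idea that the chain should visit each model in proportion to its posterior probability. The coin-flip models and data are invented for the example, and equal prior model odds are assumed.

```python
import numpy as np
from scipy import stats

# Invented models: M1 is a fair coin; M2 has an unknown bias with a
# uniform Beta(1, 1) prior, whose evidence is analytic.
rng = np.random.default_rng(3)
heads, n = 7, 10
evidence = {1: stats.binom.pmf(heads, n, 0.5),  # M1
            2: 1.0 / (n + 1)}                   # M2 (analytic)

current = 1
visits = {1: 0, 2: 0}
for _ in range(20_000):
    proposal = 2 if current == 1 else 1  # always propose the other model
    # Metropolis acceptance: ratio of evidences (equal prior model odds).
    if rng.random() < min(1.0, evidence[proposal] / evidence[current]):
        current = proposal
    visits[current] += 1

# Fraction of time spent in M1 estimates its posterior probability.
freq_m1 = visits[1] / sum(visits.values())
```

RJMCMC handles the realistic case where evidences are not available: it jointly proposes a new model and new parameter values, with a Jacobian correction for the dimension change.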
Applications
Mixture models
Mixture models are widely used for data exhibiting heterogeneity. Several techniques exist for comparing mixture models. For instance, the DIC can be used when the mixture model is well defined; in other cases, alternative DIC estimators tailored to mixture models can be employed. Bayes factors, posterior predictive checks, and visual inspection of model fits also aid in selecting an appropriate mixture model.
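As a sketch of information-criterion-based comparison for mixtures, the following invented example fits a single Gaussian and a two-component Gaussian mixture (via a small hand-rolled EM routine) and compares them by BIC. The data, initialization, and iteration count are all assumptions of the example.

```python
import numpy as np
from scipy import stats

# Invented bimodal data; candidates are a single Gaussian and a
# two-component Gaussian mixture.
rng = np.random.default_rng(4)
y = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
n = len(y)

# Model 1: single Gaussian, MLE in closed form (k = 2 parameters).
loglik1 = stats.norm.logpdf(y, y.mean(), y.std()).sum()
bic1 = -2.0 * loglik1 + 2.0 * np.log(n)

# Model 2: two-component mixture fitted by EM (k = 5 parameters:
# one weight, two means, two standard deviations).
w, mu, sd = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(200):
    # E-step: responsibility of each component for each point.
    dens = np.stack([w * stats.norm.pdf(y, mu[0], sd[0]),
                     (1.0 - w) * stats.norm.pdf(y, mu[1], sd[1])])
    resp = dens / dens.sum(axis=0)
    # M-step: re-estimate weight, means, and standard deviations.
    nk = resp.sum(axis=1)
    w = nk[0] / n
    mu = (resp * y).sum(axis=1) / nk
    sd = np.sqrt((resp * (y - mu[:, None])**2).sum(axis=1) / nk)

loglik2 = np.log(w * stats.norm.pdf(y, mu[0], sd[0])
                 + (1.0 - w) * stats.norm.pdf(y, mu[1], sd[1])).sum()
bic2 = -2.0 * loglik2 + 5.0 * np.log(n)
# With well-separated components, the mixture wins on BIC (bic2 < bic1).
```

For overlapping components or many candidate component counts, the text's caveats apply: plain BIC can be unreliable for mixtures, which motivates the tailored DIC estimators mentioned above.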
References
General references
- Gelman, Andrew (2014). Bayesian data analysis. Chapman & Hall/CRC Texts in Statistical Science (Third ed.). Boca Raton: CRC Press. ISBN 978-1-4398-4095-5.
- Congdon, P. (2007). Bayesian Statistical Modelling. Wiley Series in Probability and Statistics. Wiley. ISBN 978-0-470-03593-1.
- Robert, Christian P.; Casella, George (2004). "Monte Carlo Statistical Methods". Springer Texts in Statistics. New York, NY: Springer New York. doi:10.1007/978-1-4757-4145-2. ISBN 978-1-4419-1939-7. ISSN 1431-875X.
- Kruschke, John K. (2015). "Model Comparison and Hierarchical Modeling". Doing Bayesian Data Analysis. Elsevier. pp. 265–296. doi:10.1016/b978-0-12-405888-0.00010-6. ISBN 978-0-12-405888-0.
- K. P. Burnham and D. R. Anderson, Model Selection and Multi-model Inference: A Practical Information-theoretic Approach, 2nd edn (Springer, New York, 2002).
- D. MacKay, Information theory, inference, and learning algorithms (Cambridge University Press, Cambridge, UK, 2003).
- Aitkin, M. (1997). The calibration of P-values, posterior Bayes factors and the AIC from the posterior distribution of the likelihood (with discussion). Statistics and Computing 7, 253-272.
- Celeux, G., Forbes, F., Robert, C.P. and Titterington, D.M. (2003). Deviance information criteria for missing data models. Cahiers du Ceremade 0325.
- Congdon, P. (2001). Bayesian Statistical Modelling. Wiley, England.
- Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (1995). Bayesian Data Analysis. Chapman and Hall, London.
- George, E. and McCulloch, R. (1993). Variable selection via Gibbs sampling. J. American Statist. Association 88(423), 881-889.
- Kass, R. and Raftery, A. (1995). Bayes factors. J. American Statist. Assoc. 90, 773-795.
- Perez, J.M. and Berger, J. (2002). Expected posterior prior distributions for model selection. Biometrika 89, 491-512.
See also
- Bayes factor
- Bayesian information criterion
- Deviance information criterion
- Reversible-jump Markov chain Monte Carlo
- Markov chain Monte Carlo
- ^ Spiegelhalter, David J.; Best, Nicola G.; Carlin, Bradley P.; Van Der Linde, Angelika (2002-10-01). "Bayesian Measures of Model Complexity and Fit". Journal of the Royal Statistical Society Series B: Statistical Methodology. 64 (4): 583–639. doi:10.1111/1467-9868.00353. ISSN 1369-7412.
- ^ Konishi, Sadanori; Kitagawa, Genshiro (2008). Information Criteria and Statistical Modeling. Springer Series in Statistics. New York, NY: Springer New York. doi:10.1007/978-0-387-71887-3. ISBN 978-0-387-71886-6.
- ^ Green, Peter J. (1995). "Reversible jump Markov chain Monte Carlo computation and Bayesian model determination". Biometrika. 82 (4): 711–732. doi:10.1093/biomet/82.4.711. ISSN 0006-3444.
- ^ Stephens, Matthew (2000). "Bayesian Analysis of Mixture Models with an Unknown Number of Components- An Alternative to Reversible Jump Methods". The Annals of Statistics. 28 (1): 40–74. doi:10.1214/aos/1016120364. ISSN 0090-5364. JSTOR 2673981.
- ^ Richardson, Sylvia.; Green, Peter J. (1997-11-01). "On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)". Journal of the Royal Statistical Society Series B: Statistical Methodology. 59 (4): 731–792. doi:10.1111/1467-9868.00095. ISSN 1369-7412.