Title: Model selection for probabilistic clustering using cross-validated likelihood
Authors: Smyth, P # ×
Issue Date: Jan-2000
Publisher: Kluwer Academic Publishers
Series Title: Statistics and Computing vol:10 issue:1 pages:63-72
Abstract: Cross-validated likelihood is investigated as a tool for automatically determining the appropriate number of components (given the data) in finite mixture modeling, particularly in the context of model-based probabilistic clustering. The conceptual framework for the cross-validation approach to model selection is straightforward in the sense that models are judged directly on their estimated out-of-sample predictive performance. The cross-validation approach, as well as penalized likelihood and McLachlan's bootstrap method, are applied to two data sets and the results from all three methods are in close agreement. The second data set involves a well-known clustering problem from the atmospheric science literature using historical records of upper atmosphere geopotential height in the Northern hemisphere. Cross-validated likelihood provides an interpretable and objective solution to the atmospheric clustering problem. The clusters found are in agreement with prior analyses of the same data based on non-probabilistic clustering techniques.
ISSN: 0960-3174
Publication status: published
KU Leuven publication type: IT
Appears in Collections: Theoretical Physics Section
× corresponding author
# (joint) last author
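
Illustrative sketch (not part of the original record): the abstract describes judging mixture models by their estimated out-of-sample likelihood. The Python below shows one way that idea could be set up for a one-dimensional Gaussian mixture. Everything in it is an assumption made for illustration: the function names, the plain EM fit, the repeated random half/half train/test splits, the number of splits, and the synthetic two-cluster data. It is not the paper's code and does not reproduce its experiments.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_terms(x, weights, means, variances):
    """log(weight_j * Normal(x | mean_j, var_j)) for every component j."""
    return [
        math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for w, m, v in zip(weights, means, variances)
    ]


def fit_mixture(data, k, rng, n_iter=50, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    n = len(data)
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n
    means = rng.sample(data, k)              # random data points as initial means
    variances = [max(spread, min_var)] * k   # start from the overall variance
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            terms = log_terms(x, weights, means, variances)
            norm = log_sum_exp(terms)
            resp.append([math.exp(t - norm) for t in terms])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)    # floor avoids degenerate spikes
    return weights, means, variances


def mean_log_likelihood(data, params):
    """Average per-point log-likelihood of data under fitted mixture parameters."""
    weights, means, variances = params
    total = sum(log_sum_exp(log_terms(x, weights, means, variances)) for x in data)
    return total / len(data)


def cross_validated_log_likelihood(data, k, n_splits=10, test_fraction=0.5, seed=0):
    """Average held-out log-likelihood over repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = int(len(data) * test_fraction)
    scores = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        params = fit_mixture(train, k, rng)                # fit on the training part only
        scores.append(mean_log_likelihood(test, params))   # score on unseen points
    return sum(scores) / len(scores)


# Synthetic example data (an assumption): two well-separated clusters.
data_rng = random.Random(1)
data = [data_rng.gauss(-4.0, 1.0) for _ in range(150)]
data += [data_rng.gauss(4.0, 1.0) for _ in range(150)]

scores = {k: cross_validated_log_likelihood(data, k) for k in (1, 2, 3)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In this sketch the candidate with the highest average held-out log-likelihood is taken as the preferred number of components. Averaging over several random splits reduces the dependence of that choice on any single partition of the data.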
