Analysis of time series data with predictive clustering trees
Dzeroski, Saso; Slavkov, Ivica; Gjorgjioski, Valentin; Struyf, Jan
Proceedings of the Fifth International Workshop on Knowledge Discovery in Inductive Databases pages:47-58
International Workshop on Knowledge Discovery in Inductive Databases edition:5 location:Berlin, Germany date:September 18, 2006
Predictive clustering is a general framework that unifies clustering and prediction. This paper investigates how to apply this framework to cluster time series data. The resulting system, Clus-TS, constructs predictive clustering trees (PCTs) that partition a given set of time series into homogeneous clusters. In addition, PCTs provide a symbolic description of the clusters. The paper considers several distance metrics, both quantitative and qualitative, for measuring cluster homogeneity. We evaluate Clus-TS on time series data from microarray experiments. Each data set records the change over time in the expression level of yeast genes in response to a change in environmental conditions. Our evaluation shows that Clus-TS is able to identify interesting clusters of genes with similar responses. Clus-TS is part of a larger project whose goal is to investigate how global models can be combined with inductive databases.
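As an illustration of how a distance metric can measure cluster homogeneity when a tree split is chosen, the following is a minimal sketch and not the Clus-TS implementation. It assumes that homogeneity is scored by the mean squared pairwise distance within a cluster and that a candidate split is ranked by the resulting reduction in that score. The function names (euclidean, qualitative, pairwise_variance, variance_reduction) are hypothetical, and the qualitative distance shown here (the fraction of time-point pairs on which two series disagree about the direction of change) is only one possible choice.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Quantitative distance: Euclidean distance between equal-length series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _trend(x, i, j):
    """Direction of change between time points i and j: -1, 0, or +1."""
    d = x[j] - x[i]
    return (d > 0) - (d < 0)


def qualitative(a, b):
    """Qualitative distance (assumed form): fraction of time-point pairs
    on which the two series disagree about the direction of change."""
    pairs = list(combinations(range(len(a)), 2))
    disagreements = sum(_trend(a, i, j) != _trend(b, i, j) for i, j in pairs)
    return disagreements / len(pairs)


def pairwise_variance(cluster, dist):
    """Mean squared pairwise distance, usable without computing a centroid.
    Equals sum over ordered pairs of d^2 divided by 2 * n^2."""
    n = len(cluster)
    if n < 2:
        return 0.0
    total = sum(dist(a, b) ** 2 for a, b in combinations(cluster, 2))
    return total / (n * n)


def variance_reduction(series, mask, dist):
    """Score a binary split: mask[i] is True if series[i] goes to the left branch."""
    left = [s for s, m in zip(series, mask) if m]
    right = [s for s, m in zip(series, mask) if not m]
    n = len(series)
    return (pairwise_variance(series, dist)
            - len(left) / n * pairwise_variance(left, dist)
            - len(right) / n * pairwise_variance(right, dist))


# Toy data (made up): two rising and two falling expression profiles.
series = [[1, 2, 3], [2, 3, 4], [3, 2, 1], [4, 3, 2]]
good_split = variance_reduction(series, [True, True, False, False], qualitative)
bad_split = variance_reduction(series, [True, False, True, False], qualitative)
print(good_split, bad_split)
```

On this toy data, the split that separates rising from falling profiles yields a reduction of 0.25, while the split that mixes them yields 0.0, so a tree learner using this score would prefer the first test. Swapping qualitative for euclidean changes only the notion of similarity, not the split-scoring logic.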