Model-based clustering is the most widely used statistical clustering method, in which heterogeneous data are divided into homogeneous groups using inference based on mixture models. The presence of measurement error in the data can reduce the quality of clustering and, for example, cause overfitting and produce spurious clusters. To solve this problem, model-based clustering assuming a normal distribution for measurement errors has been introduced. However, too large or too small (outlier) values of measurement errors cause poor performance of existing clustering methods. To tackle this problem {and build a stable model against the presence of outlier measurement errors in the data}, in this article, a symmetric $alpha$-stable distribution is proposed as a replacement for the normal distribution for measurement errors, and the model parameters are estimated using the EM algorithm and numerical methods. Through simulation and real data analysis, the new model is compared with the MCLUST-based model, considering cases with and without measurement errors, and the performance of the proposed model for data clustering in the presence of various outlier measurement errors is shown.
Moradi M, Zarei S. Robust Model-Based Clustering Using the Symmetric alpha-Stable Distribution for Measurement Error. JSS 2024; 18 (1) URL: http://jss.irstat.ir/article-1-888-en.html