:: Volume 18, Issue 1 (8-2024) ::
JSS 2024, 18(1): 0-0
Robust Model-Based Clustering Using the Symmetric alpha-Stable Distribution for Measurement Error
Mozhgan Moradi, Shaho Zarei *
Abstract:
Model-based clustering is the most widely used statistical clustering method; it divides heterogeneous data into homogeneous groups using inference based on mixture models. Measurement error in the data can reduce clustering quality, for example by causing overfitting and producing spurious clusters. To address this, model-based clustering that assumes normally distributed measurement errors has been introduced. However, outlying measurement errors (values that are too large or too small) degrade the performance of existing clustering methods. To build a model that remains stable in the presence of outlying measurement errors, this article proposes a symmetric $\alpha$-stable distribution as a replacement for the normal distribution of the measurement errors; the model parameters are estimated using the EM algorithm and numerical methods. Using simulation and real data analysis, the new model is compared with the MCLUST-based model in cases with and without measurement errors, and its clustering performance in the presence of various outlying measurement errors is demonstrated.
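The abstract does not describe how stable errors are generated or fitted, so the following is only an illustrative sketch of why a symmetric $\alpha$-stable error model can accommodate outlying measurement errors. It draws symmetric $\alpha$-stable variates with the Chambers-Mallows-Stuck method and compares how often large errors occur for $\alpha = 1.5$ versus the normal case $\alpha = 2$. The function names `sample_sas` and `tail_fraction`, the threshold, and the sample size are assumptions made for this example and are not taken from the article.

```python
import math
import random


def sample_sas(alpha, scale=1.0, rng=random):
    """Draw one symmetric alpha-stable variate with location 0.

    Uses the Chambers-Mallows-Stuck construction. With alpha = 2 the
    result is normal with variance 2 * scale**2; with alpha = 1 it is
    Cauchy. (Hypothetical helper, not the article's code.)
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    while w == 0.0:                             # guard against division by zero
        w = rng.expovariate(1.0)
    x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
         * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return scale * x


def tail_fraction(alpha, threshold, n=20000, seed=0):
    """Fraction of n simulated errors whose absolute value exceeds threshold."""
    rng = random.Random(seed)
    draws = [sample_sas(alpha, rng=rng) for _ in range(n)]
    return sum(abs(x) > threshold for x in draws) / n


if __name__ == "__main__":
    # Heavier tails (alpha < 2) should yield many more extreme errors.
    print("alpha = 1.5:", tail_fraction(1.5, 6.0))
    print("alpha = 2.0:", tail_fraction(2.0, 6.0))
```

Under these assumptions, the share of errors beyond the threshold should be noticeably larger for $\alpha = 1.5$ than for $\alpha = 2$, which is the kind of outlying measurement error the proposed model is meant to absorb.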
Keywords: Model-based clustering, $\alpha$-stable distribution, Measurement error, EM algorithm
Type of Study: Applied | Subject: Applied Statistics
Received: 2024/02/25 | Accepted: 2024/08/31 | Published: 2024/06/4
Moradi M, Zarei S. Robust Model-Based Clustering Using the Symmetric alpha-Stable Distribution for Measurement Error. JSS 2024; 18 (1)
URL: http://jss.irstat.ir/article-1-888-en.html


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Journal of Statistical Sciences, a scientific research journal of the Iranian Statistical Society
