A New Algorithm for Identifying and Quantifying Latent Classes
UDC 519.259
Abstract
Processing large amounts of data can be greatly simplified if the data are divided into approximately homogeneous groups. Finding such groups is the task of cluster analysis. However, the question of how to construct an objective, natural partition into clusters remains open. The paper considers a modern approach to the search for such an objective cluster structure: from the set of characteristics that define the objects (we call them the forming characteristics), an indicator of their common essential part is extracted. When this indicator is fixed, the residuals of the forming characteristics become independent or nearly so. The resulting independent residuals are interpreted as a kind of information noise, and the latent cluster variable, that is, the common fixed part that produces this transformation, can serve as the basis for an objective grouping of objects into clusters. A new algorithm is proposed that forms a cluster partition based on the proximity or coincidence of the values of the latent cluster variable and simultaneously quantifies those values. The algorithm is based on a directed search over partitions, moving from a starting partition to one closer to the objective partition. The proposed algorithm can easily be adapted to the case of non-numeric, categorical characteristics.
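The abstract does not specify the criterion or the search moves, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' algorithm. It assumes that "independence of the residuals" is approximated by the absence of correlation between within-cluster residuals, that the directed search moves one object at a time between a fixed number of clusters, and that each cluster is quantified by the mean of its standardized characteristics. The function names `residual_dependence`, `directed_search`, and `quantify` are hypothetical.

```python
import numpy as np

def residual_dependence(X, labels):
    """Sum of squared off-diagonal correlations of within-cluster residuals.

    Assumption: zero correlation is used as a proxy for independence.
    """
    R = X.astype(float).copy()
    for c in np.unique(labels):
        mask = labels == c
        R[mask] -= R[mask].mean(axis=0)  # fix the latent value: remove the cluster mean
    cov = R.T @ R / len(R)
    sd = np.sqrt(np.diag(cov)) + 1e-12   # guard against zero residual variance
    corr = cov / np.outer(sd, sd)
    off = corr - np.diag(np.diag(corr))
    return float((off ** 2).sum())

def directed_search(X, start_labels, min_size=2, max_sweeps=50):
    """Greedy search: move single objects while the criterion strictly decreases."""
    labels = np.asarray(start_labels).copy()
    clusters = np.unique(labels)
    best = residual_dependence(X, labels)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(labels)):
            home = labels[i]
            if np.sum(labels == home) <= min_size:
                continue  # keep every cluster non-degenerate
            for c in clusters:
                if c == home:
                    continue
                labels[i] = c
                score = residual_dependence(X, labels)
                if score < best - 1e-12:
                    best, home, improved = score, c, True
            labels[i] = home  # stay in the best cluster found for object i
        if not improved:
            break
    return labels, best

def quantify(X, labels):
    """Hypothetical quantification: mean standardized characteristic per cluster."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return {int(c): float(Z[labels == c].mean()) for c in np.unique(labels)}

# Synthetic data: four forming characteristics driven by one latent variable z.
rng = np.random.default_rng(0)
z = np.repeat([0.0, 1.0, 2.0], 20)
X = np.outer(z, [1.0, -0.8, 0.6, 1.2]) + 0.3 * rng.normal(size=(60, 4))

start = rng.permutation(np.arange(60) % 3)   # arbitrary starting partition
start_score = residual_dependence(X, start)
labels, final_score = directed_search(X, start)
values = quantify(X, labels)
print(start_score, final_score, values)
```

Because a move is accepted only when the criterion strictly decreases, the search terminates and never ends with a worse partition than the starting one. It can still stop at a local optimum, so in practice several starting partitions would probably be tried.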
Copyright (c) 2020 Сергей Вадимович Дронов, Антон Юрьевич Шеларь

This work is licensed under a Creative Commons Attribution 4.0 International License.
Izvestiya of Altai State University is a golden publisher, as we allow self-archiving, but most importantly we are fully transparent about your rights.