Abstract
In this paper, we revisit the problem of clustering 1318 new variable stars found in the Milky Way. Our recent work distinguished these stars based on their light curves, which are univariate series of stellar brightness observed at discrete time points. Here we propose a new approach that treats these discrete series as continuous curves over time by transforming them into functional data. Functional principal component analysis is then performed on these functional light curves. Clustering based on the significant functional principal components reveals two distinct groups of eclipsing binaries, with results that are consistent with, and superior to, our previous ones. The method is thus established as a powerful new light curve-based classifier, in which a simple clustering algorithm is effective enough to uncover the true clusters based merely on the first few relevant functional principal components. At the same time, we discard the noise in the data carried by the higher-order functional principal components. The suggested method is therefore very useful for clustering big light curve data sets, which is also verified by our simulation study.
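The pipeline summarized in the abstract (discrete light curves, then functional data, then functional principal components, then a simple clustering step) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that do not come from the paper: the light curves are taken to be phase-folded, a Fourier basis fitted by least squares stands in for whatever smoothing basis the paper uses, functional principal component analysis is approximated by an SVD of the smoothed curves on an equispaced grid, k-means stands in for the "simple clustering algorithm", and the two groups are synthetic toy shapes. All function names and parameter values are hypothetical.

```python
import numpy as np

def fourier_design(phase, n_harmonics):
    """Fourier basis evaluated at the given phases (one row per phase)."""
    cols = [np.ones_like(phase)]
    for k in range(1, n_harmonics + 1):
        cols += [np.sin(2 * np.pi * k * phase), np.cos(2 * np.pi * k * phase)]
    return np.column_stack(cols)

def to_functional(phase, brightness, grid, n_harmonics=6):
    """Least-squares basis fit of one discrete light curve, evaluated on a common grid."""
    design = fourier_design(phase, n_harmonics)
    coef, *_ = np.linalg.lstsq(design, brightness, rcond=None)
    return fourier_design(grid, n_harmonics) @ coef

def fpca_scores(curves, var_explained=0.95):
    """Scores on the leading components of curves sampled on an equispaced grid.

    On such a grid, PCA of the discretized curves approximates functional PCA.
    Components beyond the requested share of variance are dropped as noise.
    """
    centered = curves - curves.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_keep = int(np.searchsorted(ratio, var_explained)) + 1
    return u[:, :n_keep] * s[:n_keep]

def kmeans(x, k, n_restarts=10, n_iter=100, seed=0):
    """Plain k-means with random restarts; returns the lowest-cost labelling."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _restart in range(n_restarts):
        centers = x[rng.choice(len(x), size=k, replace=False)]
        for _step in range(n_iter):
            dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            centers = np.array([
                x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
        cost = ((x - centers[labels]) ** 2).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels

# Toy data (assumption): two synthetic groups of phase-folded light curves,
# each observed at its own irregular set of phases.
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 200, endpoint=False)
curves, truth = [], []
for group in (0, 1):
    for _star in range(40):
        phase = np.sort(rng.uniform(0.0, 1.0, 120))
        depth = rng.uniform(0.4, 0.5)
        if group == 0:
            # Narrow primary and secondary dips at phases 0 and 0.5.
            d0 = np.minimum(phase, 1.0 - phase)
            shape = (np.exp(-0.5 * (d0 / 0.04) ** 2)
                     + 0.5 * np.exp(-0.5 * ((phase - 0.5) / 0.04) ** 2))
        else:
            # Smooth, continuous variation with two equal minima per cycle.
            shape = 0.5 * (1.0 + np.cos(4.0 * np.pi * phase))
        brightness = 1.0 - depth * shape + rng.normal(0.0, 0.02, phase.size)
        curves.append(to_functional(phase, brightness, grid))
        truth.append(group)
curves, truth = np.array(curves), np.array(truth)

scores = fpca_scores(curves)
labels = kmeans(scores, k=2)
# Cluster labels are arbitrary, so compare against both labelings.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"components kept: {scores.shape[1]}, agreement with toy labels: {agreement:.2f}")
```

The design point carried over from the abstract is that the clustering step sees only the few leading component scores, so a very simple algorithm suffices. The variance threshold of 0.95 and the number of harmonics are arbitrary illustrative choices.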
Data Availability
All data analyzed and generated during this study are referenced in this published article.
Acknowledgements
The authors would like to thank the editors for encouraging the present work on astrostatistics, and one anonymous reviewer for their intriguing inquiries, which helped the authors to present the results in a more convincing way.
Ethics declarations
Conflict of Interest
No potential conflict of interest was reported by the authors.
About this article
Cite this article
Modak, S., Chattopadhyay, T. & Chattopadhyay, A.K. Clustering of eclipsing binary light curves through functional principal component analysis. Astrophys Space Sci 367, 19 (2022). https://doi.org/10.1007/s10509-022-04050-9
DOI: https://doi.org/10.1007/s10509-022-04050-9