
Feature selection based on fuzzy joint mutual information maximization. (English) Zbl 1472.94036

Summary: Nowadays, real-world applications handle huge amounts of data, often with high-dimensional feature spaces. Such datasets pose a significant challenge for classification systems. Unfortunately, many of the features present are irrelevant or redundant, making these systems inefficient and inaccurate. For this reason, many feature selection (FS) methods based on information theory have been introduced to improve classification performance. However, current methods have limitations, such as handling continuous features, estimating redundancy relations, and accounting for outer-class information. To overcome these limitations, this paper presents a new FS method called Fuzzy Joint Mutual Information Maximization (FJMIM). The effectiveness of the proposed method is verified through an experimental comparison with nine conventional and state-of-the-art feature selection methods. On 13 benchmark datasets, the experimental results confirm that the proposed method yields promising improvements in classification performance and feature selection stability.
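The paper's fuzzy criterion itself is not reproduced in this summary. As a rough illustration of the non-fuzzy baseline it builds on (joint mutual information maximization, cf. reference [8]), the sketch below implements a greedy forward search over discrete features: at each step it adds the candidate feature whose worst-case joint mutual information with the already-selected features and the class label is largest. Function names (`jmim_select`, `joint_mi`) are illustrative, not from the paper, and the estimator is a plain empirical plug-in, not the fuzzy variant FJMIM proposes.

```python
import numpy as np
from collections import Counter


def mutual_info(x, y):
    """Empirical mutual information I(X; Y) between two discrete
    sequences, in nats, using plug-in probability estimates."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), with counts folded in
        mi += (c / n) * np.log(c * n / (px[a] * py[b]))
    return mi


def joint_mi(x1, x2, y):
    """I((X1, X2); Y): pair the two features into a single discrete
    variable and reuse the bivariate estimator."""
    paired = list(zip(x1, x2))
    return mutual_info(paired, y)


def jmim_select(X, y, k):
    """Greedy forward selection with the (non-fuzzy) JMIM criterion:
    start from the single most informative feature, then repeatedly
    add the feature f maximizing  min_{s in S} I((f, s); Y)."""
    n_feat = X.shape[1]
    selected = [max(range(n_feat), key=lambda j: mutual_info(X[:, j], y))]
    while len(selected) < k:
        remaining = [j for j in range(n_feat) if j not in selected]
        best = max(
            remaining,
            key=lambda j: min(joint_mi(X[:, j], X[:, s], y) for s in selected),
        )
        selected.append(best)
    return selected
```

Taking the minimum over already-selected features (rather than a sum) penalizes candidates that are redundant with even one chosen feature; the fuzzy extension in the paper replaces the crisp probability estimates with fuzzy ones so that continuous features need not be discretized first.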

MSC:

94A15 Information theory (general)
94A16 Informational aspects of data analysis and big data

Software:

UCI-ml

References:

[1] L, A novel feature selection method based on normalized mutual information, Appl. Intell., 37, 100-120, 2012 · doi:10.1007/s10489-011-0315-y
[2] J, A review of feature selection methods based on mutual information, Neural Comput. Appl., 24, 175-186, 2014 · doi:10.1007/s00521-013-1368-0
[3] I. K. Fodor, A survey of dimension reduction techniques, Lawrence Livermore National Lab, CA (US), 2002.
[4] H, Feature space theory—a mathematical foundation for data mining, Knowl. Based Syst., 14, 253-257, 2001 · doi:10.1016/S0950-7051(01)00103-4
[5] R, A novel approach to feature selection based on analysis of class regions, IEEE Trans. Syst. Man Cybern. Syst., 27, 196-207, 1997 · doi:10.1109/3477.558798
[6] Y, A review of feature selection techniques in bioinformatics, Bioinformatics, 23, 2507-2517, 2007 · doi:10.1093/bioinformatics/btm344
[7] I, An introduction to variable and feature selection, J. Mach. Learn. Res., 3, 1157-1182, 2003 · Zbl 1102.68556
[8] M, Feature selection using joint mutual information maximisation, Expert Syst. Appl., 42, 8520-8532, 2015 · doi:10.1016/j.eswa.2015.07.007
[9] Q, Information-preserving hybrid data reduction based on fuzzy-rough techniques, Pattern Recognit. Lett., 27, 414-423, 2006 · doi:10.1016/j.patrec.2005.09.004
[10] C. Lazar, J. Taminau, S. Meganck, D. Steenhoff, A. Coletta, C. Molter, et al., A survey on filter techniques for feature selection in gene expression microarray analysis, IEEE/ACM Trans. Comput. Biol. Bioinf., 9 (2012), 1106-1119.
[11] G, A survey on feature selection methods, Comput. Electr. Eng., 40, 16-28, 2014 · doi:10.1016/j.compeleceng.2013.11.024
[12] O, Fuzzy mutual information feature selection based on representative samples, Int. J. Software Innovation, 6, 58-72, 2018 · doi:10.4018/IJSI.2018010105
[13] D, Feature selection based on inference correlation, Intell. Data Anal., 15, 375-398, 2011 · doi:10.3233/IDA-2010-0473
[14] R, The mutual information: detecting and evaluating dependencies between variables, Bioinformatics, 18, S231-S240, 2002 · doi:10.1093/bioinformatics/18.suppl_2.S231
[15] J, Feature selection by maximizing independent classification information, IEEE Trans. Knowl. Data Eng., 29, 828-841, 2017 · doi:10.1109/TKDE.2017.2650906
[16] F, Theoretical foundations of forward feature selection methods based on mutual information, Neurocomputing, 325, 67-89, 2019 · doi:10.1016/j.neucom.2018.09.077
[17] D, Fuzzy mutual information based min-redundancy and max-relevance heterogeneous feature selection, Int. J. Comput. Intell. Syst., 4, 619-633, 2011 · doi:10.1080/18756891.2011.9727817
[18] J, A new method for measuring uncertainty and fuzziness in rough set theory, Int. J. Gen. Syst., 31, 331-342, 2002 · Zbl 1010.94004 · doi:10.1080/0308107021000013635
[19] Z, Uncertainty measurement for a fuzzy relation information system, IEEE Trans. Fuzzy Syst., 27, 2338-2352, 2019
[20] C, Uncertainty measures for general fuzzy relations, Fuzzy Sets Syst., 360, 82-96, 2019 · Zbl 1423.68516 · doi:10.1016/j.fss.2018.07.006
[21] Y, Some new approaches to constructing similarity measures, Fuzzy Sets Syst., 234, 46-60, 2014 · Zbl 1315.03096 · doi:10.1016/j.fss.2013.03.008
[22] G, A new perspective for information theoretic feature selection, Artif. Intell. Stat., 49-56, 2009
[23] D. D. Lewis, Feature selection and feature extraction for text categorization, Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, 1992, 23-26.
[24] R, Using mutual information for selecting features in supervised neural net learning, IEEE Trans. Neural Netw. Learn. Syst., 5, 537-550, 1994 · doi:10.1109/72.298224
[25] H, Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 27, 1226-1238, 2005 · doi:10.1109/TPAMI.2005.159
[26] H, Feature selection based on joint mutual information, Proc. Int. ICSC Symp. Adv. Intell. Data Anal., 22-25, 1999
[27] F, Fast binary feature selection with conditional mutual information, J. Mach. Learn. Res., 5, 1531-1555, 2004 · Zbl 1222.68200
[28] P. E. Meyer, G. Bontempi, On the use of variable complementarity for feature selection in cancer classification, Workshops on applications of evolutionary computation, Springer, Berlin, Heidelberg, 2006, 91-102.
[29] A, A powerful feature selection approach based on mutual information, Int. J. Comput. Sci. Network Secur., 8, 116, 2008
[30] P, Normalized mutual information feature selection, IEEE Trans. Neural Networks, 20, 189-201, 2009 · doi:10.1109/TNN.2008.2005601
[31] N, Mifs-nd: a mutual information-based feature selection method, Expert Syst. Appl., 41, 6371-6385, 2014 · doi:10.1016/j.eswa.2014.04.019
[32] G, Mutual information-based method for selecting informative feature sets, Pattern Recognit., 46, 3315-3327, 2013 · doi:10.1016/j.patcog.2013.04.021
[33] J, Class-dependent discretization for inductive learning from continuous and mixed-mode data, IEEE Trans. Pattern Anal. Mach. Intell., 17, 641-651, 1995 · doi:10.1109/34.391407
[34] Q, Selecting informative features with fuzzy-rough sets and its application for complex systems monitoring, Pattern Recognit., 37, 1351-1363, 2004 · Zbl 1070.68600 · doi:10.1016/j.patcog.2003.10.016
[35] J, Complement information entropy for uncertainty measure in fuzzy rough set and its applications, Soft Comput., 19, 1997-2010, 2015 · Zbl 1359.68277 · doi:10.1007/s00500-014-1387-5
[36] H, An efficient fuzzy classifier with feature selection based on fuzzy entropy, IEEE Trans. Syst. Man Cybern. Syst., 31, 426-432, 2001 · doi:10.1109/3477.931536
[37] I, Quadratic programming feature selection, J. Mach. Learn. Res., 11, 1491-1516, 2010 · Zbl 1242.68245
[38] K, The feature selection problem: Traditional methods and a new algorithm, AAAI, 2, 129-134, 1992
[39] K, Efficient feature selection using shrinkage estimators, Mach. Learn., 108, 1261-1286, 2019 · Zbl 1472.68156 · doi:10.1007/s10994-019-05795-1
[40] X, Input feature selection method based on feature set equivalence and mutual information gain maximization, IEEE Access, 7, 151525-151538, 2019 · doi:10.1109/ACCESS.2019.2948095
[41] P, Feature selection considering weighted relevancy, Appl. Intell., 1-11
[42] S, A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning, IEEE Trans. Knowl. Data Eng., 25, 734-750, 2012
[43] A, Classification assessment methods, Appl. Comput. Inform., 2020
[44] M. Allahyari, S. Pouriyeh, M. Assefi, S. Safaei, E. D. Trippe, J. B. Gutierrez, et al., A brief survey of text mining: Classification, clustering and extraction techniques, preprint, arXiv: 1707.02919.
[45] R, A study of cross-validation and bootstrap for accuracy estimation and model selection, IJCAI, 14, 1137-1145, 1995
[46] S. Nogueira, G. Brown, Measuring the stability of feature selection, Joint European conference on machine learning and knowledge discovery in databases, Springer, Cham, 2016, 442-457.
[47] Y. S. Tsai, U. C. Yang, I. F. Chung, C. D. Huang, A comparison of mutual and fuzzy-mutual information-based feature selection strategies, 2013 IEEE international conference on fuzzy systems (FUZZ-IEEE), IEEE, 2013, 1-6.
[48] L, A stability index for feature selection, Artificial intelligence and applications, 421-427, 2007
[49] D. Dua, C. Graff, UCI machine learning repository, 2017. Available from: http://archive.ics.uci.edu/ml.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.