
Selection and optimization of cut-points for numeric attribute values. (English) Zbl 1186.68391

Summary: Data discretization is the process of choosing cut-points that partition the range of a continuous numeric attribute into intervals, so that attribute values can be represented by symbols or integer values. A hybrid method based on a neural network and a genetic algorithm is proposed to select and optimize the cut-points for numeric attribute values. The cut values are trained through a four-layer neural network, and the number of cut-points is optimized by the genetic algorithm. The intervals produced by the presented method can be more precise. The experimental results show that the cut-points obtained compare well with those of the other method.
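
To illustrate what discretization by cut-points means, here is a minimal Python sketch that maps continuous values to integer interval codes. It is not the hybrid method of the paper: the cut-points are simply given rather than trained or optimized, and the function name `discretize`, the sample values, and the cut-points are all assumptions made for illustration.

```python
from bisect import bisect_right


def discretize(values, cut_points):
    """Map each numeric value to the index of the interval it falls in.

    With k cut-points there are k + 1 intervals, coded 0..k.
    A value equal to a cut-point is assigned to the interval on its right.
    """
    cuts = sorted(cut_points)
    return [bisect_right(cuts, v) for v in values]


# Hypothetical attribute values and cut-points, chosen only for illustration.
values = [4.9, 5.8, 6.4, 7.1]
cut_points = [5.5, 6.5]

print(discretize(values, cut_points))  # prints [0, 1, 1, 2]
```

Selecting the positions and the number of cut-points is the part the summary attributes to the neural network and the genetic algorithm; the sketch covers only the final mapping step.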

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml
Full Text: DOI

References:

[1] Liu, H.; Hussain, F.; Tan, C. L.; Dash, M., Discretization: An enabling technique, J. Data mining and knowledge discovery, 6, 393-423 (2002)
[2] U.M. Fayyad, B.K. Irani, Multi-interval discretization of continuous-valued attributes for classification learning, in: Proceedings of the Thirteenth International Joint Conference on AI, Chambery, France, 1993, pp. 1022-1027
[3] S.H. Nguyen, A. Skowron, Quantization of real-valued attributes, rough set and boolean reasoning approaches, in: Proceedings of the Second Joint Annual Conference on Information Sciences, Wrightsville Beach, North Carolina, USA, 1995, pp. 34-37
[4] Ghosh, J.; Taha, I., A neuro-symbolic hybrid intelligent architecture with applications, (Jain, L.; Fanelli, A. M., Recent Advances in Artificial Neural Networks Design and Applications (2000), CRC Press), 2-37
[5] C.L. Blake, P.M. Murphy, The UCI machine learning repository, in: Irvine, CA: University of California, Department of Information and Computer Science, 1998. http://www.cs.uci.edu/mlearn/MLRepository.html
[6] Yu, J.; Li, X.; Sun, L., Global discretization of continuously valued attributes, J. Harbin Inst. Technol., 32, 48-53 (2000)
[7] Lee, C. H., A Hellinger-based discretization method for numeric attributes in classification learning, J. Knowledge-Based Systems, 20, 419-425 (2007)
[8] E.H. Wu, M.K. Ng, A.M. Yip, T.F. Chan, Discretization of Multidimensional Web Data for Informative Dense Regions Discovery, in: Proceedings of the First International Symposium, CIS 2004, Shanghai, China, 2007, pp. 718-724