Enhancing density-based data reduction using entropy. (English) Zbl 1101.68490

Summary: Data reduction algorithms determine a small data subset from a given large data set. In this article, new types of data reduction criteria, based on the concept of entropy, are first presented. These criteria evaluate data reduction performance in a sophisticated and comprehensive way. Building on them, new data reduction procedures are developed. Under the newly introduced criteria, the proposed data reduction scheme is shown to be efficient and effective. In addition, an outlier-filtering strategy with negligible computational cost is developed. In some instances, this strategy can substantially improve the performance of supervised data analysis. The proposed procedures are compared with related techniques in two types of application: density estimation and classification. Extensive comparative results are included to corroborate the contributions of the proposed algorithms.

MSC:

68P05 Data structures
