Abstract
Biomedical data is a domain of interest for data mining applications; combined with image processing, its analysis can uncover hidden patterns or behavior. In this direction, the present paper deals with the detection of breast cancer in digital mammography images. Identifying breast cancer poses several challenges to traditional data mining applications, particularly because of the high dimensionality and class imbalance of the training data. In the current approach, genetic algorithms are used to reduce the feature set to the informative features, and class imbalance is addressed with a hybrid boosting and genetic sub-sampling approach. For feature extraction, the idea of trainable segmentation is borrowed, using Decision Trees as the base learner. Results show that the best precision and recall rates are achieved by a combination of AdaBoost and k-Nearest Neighbor.
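The genetic feature-reduction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data generator, the leave-one-out k-NN accuracy used as the fitness function, and all names and parameters (`make_toy_data`, `fitness`, `select_features`, `p_mut`, and so on) are assumptions made for illustration only. The boosting and genetic sub-sampling steps of the paper are not covered, and plain accuracy is a simplistic fitness for imbalanced data.

```python
import random

def make_toy_data(n_neg=30, n_pos=10, n_noise=4, seed=0):
    """Hypothetical imbalanced data: 2 informative features plus noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for label, count in ((0, n_neg), (1, n_pos)):
        for _ in range(count):
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y

def knn_predict(train, labels, x, mask, k=3):
    """Majority vote of the k nearest points, using only features where mask is 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b, m in zip(row, x, mask) if m), lab)
        for row, lab in zip(train, labels)
    )
    votes = sum(lab for _, lab in dists[:k])
    return 1 if votes * 2 > k else 0

def fitness(mask, X, y, k=3):
    """Leave-one-out accuracy of k-NN restricted to the selected features."""
    if not any(mask):
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for i in range(len(X)):
        train, labels = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        correct += knn_predict(train, labels, X[i], mask, k) == y[i]
    return correct / len(X)

def select_features(X, y, generations=15, pop_size=12, p_mut=0.1, seed=0):
    """Evolve binary feature masks with truncation selection, one-point
    crossover, bit-flip mutation and elitism; return the best mask seen."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, X, y))
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        best = max([best, scored[0]], key=lambda m: fitness(m, X, y))
        parents = scored[: pop_size // 2]
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            # flip each bit with probability p_mut
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = children
    return best

if __name__ == "__main__":
    X, y = make_toy_data()
    mask = select_features(X, y)
    print("selected mask:", mask, "fitness:", fitness(mask, X, y))
```

Each candidate solution is a bit mask over the features, so the search space grows as 2^n in the number of features; the genetic search samples that space instead of enumerating it. A fitness based on precision and recall, the measures reported in the abstract, would be a natural substitute for accuracy here.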
Copyright information
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Kontos, K., Maragoudakis, M. (2013). Breast Cancer Detection in Mammogram Medical Images with Data Mining Techniques. In: Papadopoulos, H., Andreou, A.S., Iliadis, L., Maglogiannis, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2013. IFIP Advances in Information and Communication Technology, vol 412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41142-7_35
DOI: https://doi.org/10.1007/978-3-642-41142-7_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41141-0
Online ISBN: 978-3-642-41142-7
eBook Packages: Computer Science, Computer Science (R0)