
A mathematical foundation of big data. (English) Zbl 1442.68216

Summary: Recent research on big data has attracted mathematicians, computer scientists and business professionals alike. However, the lack of a sound mathematical foundation remains a real challenge amid the swarm of big data marketing activities. This paper proposes a possible mathematical theory as a foundation for big data research. Specifically, we treat the adjective “big” as a mathematical operator; moreover, “big” fits logically and naturally the concept of a “linguistic variable”, which the fuzzy logic research community has used for decades. Such a mathematical model can be regarded as an abstraction of the technologies, systems and tools for data management and processing that transform data into big data. In addition, the concept of infinity of big data is based on calculus and set theory, while the concept of relativity of big data, as we find, is based on the operations of fuzzy subset theory. We hope that the proposed approach can facilitate and open up more opportunities for research and development on big data analytics, business analytics, big data intelligence, big data computing and big data science.
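
The idea of “big” as a linguistic variable whose meaning is relative can be illustrated with a minimal sketch. This is not the paper's construction: the ramp-shaped membership function, the helper names (`big`, `fuzzy_and`, `fuzzy_or`, `fuzzy_not`) and the terabyte thresholds are all assumptions chosen for illustration. Only the min/max/complement operations are the standard ones from Zadeh's fuzzy set theory.

```python
from typing import Callable

# A fuzzy subset is represented by its membership function: value -> degree in [0, 1].
Membership = Callable[[float], float]


def big(lower: float, upper: float) -> Membership:
    """Hypothetical ramp membership for "big".

    Returns 0 at or below `lower`, 1 at or above `upper`, and rises
    linearly in between. Both thresholds are illustrative assumptions.
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")

    def mu(x: float) -> float:
        if x <= lower:
            return 0.0
        if x >= upper:
            return 1.0
        return (x - lower) / (upper - lower)

    return mu


def fuzzy_and(a: Membership, b: Membership) -> Membership:
    """Standard (Zadeh) intersection: pointwise minimum."""
    return lambda x: min(a(x), b(x))


def fuzzy_or(a: Membership, b: Membership) -> Membership:
    """Standard (Zadeh) union: pointwise maximum."""
    return lambda x: max(a(x), b(x))


def fuzzy_not(a: Membership) -> Membership:
    """Standard (Zadeh) complement: one minus the membership degree."""
    return lambda x: 1.0 - a(x)


# Relativity of "big": the same data volume (in terabytes, an assumed unit)
# can be big for one context and not for another.
big_for_startup = big(1.0, 10.0)
big_for_enterprise = big(100.0, 1000.0)

if __name__ == "__main__":
    volume_tb = 5.5
    print(big_for_startup(volume_tb))
    print(big_for_enterprise(volume_tb))
    print(fuzzy_or(big_for_startup, big_for_enterprise)(volume_tb))
```

Under these assumed thresholds, a single volume receives different degrees of “bigness” in the two contexts, and combining the two fuzzy subsets with union or intersection gives one way to read the relativity that the summary attributes to fuzzy subset operations.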

MSC:

68T09 Computational aspects of data analysis and big data
68T37 Reasoning under uncertainty in the context of artificial intelligence
