Data Virtualization Layer Key Role in Recent Analytical Data Architectures

  • Conference paper
Intelligent Systems Design and Applications (ISDA 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 716)

Abstract

The volume, heterogeneity, and velocity of data keep growing, and current systems cannot handle on-demand, real-time data access. In traditional data integration approaches such as ETL, physically loading data into data stores built on different technologies is becoming costly, time-consuming, and inefficient, and it creates a bottleneck. Recently, data virtualization has been used to accelerate the data integration process. It addresses these challenges by delivering a unified, integrated, and holistic view of trusted data, on demand and in real time. This paper gives an overview of traditional data integration and its limits. We then discuss data virtualization, its core capabilities and features, how it can complement other data integration approaches, and how it improves traditional data architecture paradigms.
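To make the contrast with ETL concrete, the following is a minimal sketch of the general idea of a virtual view, not the architecture described in the paper. It assumes two hypothetical sources (an in-memory SQLite table of customers and a CSV feed of orders); the function names `read_customers`, `read_orders`, and `total_spend_view` are illustrative only. The view joins both sources at query time and stores nothing, so a change in a source shows up on the next query without a reload step.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational store holding customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Linus")])

# Source 2 (hypothetical): a CSV feed holding orders.
orders_csv = io.StringIO("customer_id,amount\n1,30.0\n2,12.5\n1,7.5\n")


def read_customers():
    """Read customers directly from the relational source."""
    return {row[0]: row[1] for row in db.execute("SELECT id, name FROM customers")}


def read_orders():
    """Read orders directly from the CSV source."""
    orders_csv.seek(0)
    return [(int(r["customer_id"]), float(r["amount"])) for r in csv.DictReader(orders_csv)]


def total_spend_view():
    """Virtual view: joins both sources at query time; nothing is persisted."""
    names = read_customers()
    totals = {}
    for customer_id, amount in read_orders():
        name = names.get(customer_id)
        if name is not None:
            totals[name] = totals.get(name, 0.0) + amount
    return totals


first = total_spend_view()

# Both sources change; no load job runs between the two queries.
db.execute("INSERT INTO customers VALUES (3, 'Grace')")
orders_csv.seek(0, io.SEEK_END)
orders_csv.write("3,99.0\n")

second = total_spend_view()
print(first)
print(second)
```

An ETL pipeline would instead copy both sources into a separate store on a schedule, so the second query would only see the new customer after the next load. Real data virtualization platforms add query pushdown, caching, and governance on top of this basic pattern; none of that is modeled here.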

Notes

  1. The European Union's General Data Protection Regulation (GDPR).

  2. Open Data Protocol (OData).

Author information

Correspondence to Montasser Akermi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Akermi, M., Hadj Taieb, M.A., Ben Aouicha, M. (2023). Data Virtualization Layer Key Role in Recent Analytical Data Architectures. In: Abraham, A., Pllana, S., Casalino, G., Ma, K., Bajaj, A. (eds) Intelligent Systems Design and Applications. ISDA 2022. Lecture Notes in Networks and Systems, vol 716. Springer, Cham. https://doi.org/10.1007/978-3-031-35501-1_42
