A Survey on Phishing Website Detection Using Deep Neural Networks

  • Conference paper
HCI International 2022 – Late Breaking Posters (HCII 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1655)

Abstract

Phishing is a social engineering attack in which an attacker poses as a legitimate individual or institution and, through human interaction, convinces a victim to divulge their details. Phishing cases have risen steeply across the globe: a report by Cisco [1] attributes 90% of data breaches in 2021 to phishing. Various detection models have been proposed to counter such attacks. Some aim to improve the detection rate of phishing URLs, while others focus on reducing detection time. Researchers have used machine learning, deep learning, and novel feature-selection mechanisms to achieve high detection performance. This study is a systematic analysis of recent work that uses deep learning for phishing detection, highlighting the research methods, algorithms, programming tools, and datasets used in such studies. It further proposes guidelines for future research, including standardized documentation and performance reporting. These guidelines may help researchers replicate others’ work and compare newly proposed methods with previously developed systems.

References

  1. Cisco: Cisco threat report 2021. https://umbrella.cisco.com/info/2021-cyber-security-threat-trends-phishing-crypto-top-the-list

  2. Johnson, J.: Phishing - statistics & facts. https://www.statista.com/topics/8385/phishing/

  3. F5 Labs: Phishing attacks soar 220% during COVID-19 peak as cybercriminal opportunism intensifies. https://www.f5.com/company/news/features/phishing-attacks-soar-220-during-covid-19-peak-as-cybercriminal

  4. Phishing.org: What is phishing. https://www.phishing.org/what-is-phishing

  5. Kitchenham, B.: Procedures for performing systematic reviews. Keele, UK, Keele University 33(2004), 1–26 (2004)

  6. PhishTank: PhishTank. https://phishtank.org/

  7. Common Crawl: Common Crawl. https://commoncrawl.org/

  8. Alexa: Alexa top sites. https://www.alexa.com/topsites

  9. DMOZ: Dmoz phishing dataset. https://dmoz-odp.org/docs/en/rdf.html

  10. Maurer, M.: Phishload. https://www.medien.ifi.lmu.de/team/max.maurer/files/phishload/index.html

  11. UCI: UCI phishing dataset. https://archive.ics.uci.edu/ml/datasets/phishing+websites

  12. Kaggle: kaggle. https://www.kaggle.com/ahmednour/website-phishing-data-set

  13. Marchal, S.: PhishStorm. https://research.aalto.fi/en/datasets/phishstorm-phishing-legitimate-url-dataset

  14. Marchal, S., François, J., State, R., Engel, T.: PhishStorm: detecting phishing with streaming analytics. IEEE Trans. Netw. Serv. Manage. 11(4), 458–471 (2014)

  15. OpenPhish: OpenPhish. https://openphish.com/phishing_database.html

  16. OpenPhish: OpenPhish API. https://github.com/openphish/pyopdb

  17. Kaggle: Kaggle survey 2019. https://www.kaggle.com/kaggle-survey-2019

  18. Brownlee, J.: Best programming language. https://machinelearningmastery.com/best-programming-language-for-machine-learning/

  19. Nagaraj, K., Bhattacharjee, B., Sridhar, A., Sharvani, G.: Detection of phishing websites using a novel twofold ensemble model. J. Sys. Inf. Technol. (2018)

  20. Ozcan, A., Catal, C., Donmez, E., Senturk, B.: A hybrid DNN-LSTM model for detecting phishing URLs. Neural Comput. Appl. 1–17 (2021)

  21. Mourtaji, Y., Bouhorma, M., Alghazzawi, D., Aldabbagh, G., Alghamdi, A.: Hybrid rule-based solution for phishing URL detection using convolutional neural network. Wirel. Commun. Mobile Comput. 2021 (2021)

  22. Korkmaz, M., Kocyigit, E., Sahingoz, O.K., Diri, B.: Phishing web page detection using N-gram features extracted from URLs. In: 2021 3rd International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), pp. 1–6. IEEE (2021)

  23. Zhang, Q., Bu, Y., Chen, B., Zhang, S., Lu, X.: Research on phishing webpage detection technology based on CNN-BiLSTM algorithm. In: Journal of Physics: Conference Series, vol. 1738, p. 012131. IOP Publishing (2021)

  24. Yi, P., Guan, Y., Zou, F., Yao, Y., Wang, W., Zhu, T.: Web phishing detection using a deep learning framework. Wirel. Commun. Mobile Comput. 2018 (2018)

  25. Xiao, X., Zhang, D., Hu, G., Jiang, Y., Xia, S.: CNN-MHSA: a convolutional neural network and multi-head self-attention combined approach for detecting phishing websites. Neural Netw. 125, 303–312 (2020)

  26. Liu, D.J., Geng, G.G., Jin, X.B., Wang, W.: An efficient multistage phishing website detection model based on the case feature framework: aiming at the real web environment. Comput. Secur. 110, 102421 (2021)

  27. Wazirali, R., Ahmad, R., Abu-Ein, A.A.K.: Sustaining accurate detection of phishing URLs using SDN and feature selection approaches. Comput. Netw. 201, 108591 (2021)

  28. Saha, I., Sarma, D., Chakma, R.J., Alam, M.N., Sultana, A., Hossain, S.: Phishing attacks detection using deep learning approach. In: 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 1180–1185. IEEE (2020)

  29. Kazienko, P., Lughofer, E., Trawinski, B.: Editorial on the special issue “hybrid and ensemble techniques in soft computing: recent advances and emerging trends”. Soft. Comput. 19(12), 3353–3355 (2015). https://doi.org/10.1007/s00500-015-1916-x

  30. Sameen, M., Han, K., Hwang, S.O.: PhishHaven-an efficient real-time AI phishing URLs detection system. IEEE Access 8, 83425–83443 (2020)

  31. Ogawa, Y., Kimura, T., Cheng, J.: Vulnerability assessment for deep learning based phishing detection system. In: 2021 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), pp. 1–2. IEEE (2021)

  32. Hashim, A., Medani, R., Attia, T.A.: Defences against web application attacks and detecting phishing links using machine learning. In: 2020 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE), pp. 1–6. IEEE (2020)

  33. AlEroud, A., Karabatis, G.: Bypassing detection of URL-based phishing attacks using generative adversarial deep neural networks. In: Proceedings of the Sixth International Workshop on Security and Privacy Analytics, pp. 53–60 (2020)

  34. Xiao, X., et al.: Phishing websites detection via CNN and multi-head self-attention on imbalanced datasets. Comput. Secur. 108, 102372 (2021)

  35. Zhang, J., Li, X.: Phishing detection method based on borderline-smote deep belief network. In: Wang, G., Atiquzzaman, M., Yan, Z., Choo, K.-K.R. (eds.) SpaCCS 2017. LNCS, vol. 10658, pp. 45–53. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-72395-2_5

  36. Pham, T.D., Pham, T.T.T., Hoang, S.T., Ta, V.C.: Exploring efficiency of GAN-based generated URLs for phishing URL detection. In: 2021 International Conference on Multimedia Analysis and Pattern Recognition (MAPR), pp. 1–6. IEEE (2021)

  37. Shirazi, H., Bezawada, B., Ray, I., Anderson, C.: Adversarial sampling attacks against phishing detection. In: Foley, S.N. (ed.) DBSec 2019. LNCS, vol. 11559, pp. 83–101. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-22479-0_5

  38. Somesha, M., Pais, A.R., Rao, R.S., Rathour, V.S.: Efficient deep learning techniques for the detection of phishing websites. Sādhanā 45(1), 1–18 (2020). https://doi.org/10.1007/s12046-020-01392-4

  39. Yang, R., Zheng, K., Wu, B., Wu, C., Wang, X.: Phishing website detection based on deep convolutional neural network and random forest ensemble learning. Sensors 21(24), 8281 (2021)

  40. Jain, A.K., Gupta, B.B.: A machine learning based approach for phishing detection using hyperlinks information. J. Ambient. Intell. Humaniz. Comput. 10(5), 2015–2028 (2019). https://doi.org/10.1007/s12652-018-0798-z

  41. Rao, R.S., Vaishnavi, T., Pais, A.R.: PhishDump: a multi-model ensemble based technique for the detection of phishing sites in mobile devices. Pervasive Mob. Comput. 60, 101084 (2019)

  42. Tajaddodianfar, F., Stokes, J.W., Gururajan, A.: Texception: a character/word-level deep learning model for phishing URL detection. In: ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2857–2861. IEEE (2020)

  43. Zhang, L., Zhang, P.: PhishTrim: fast and adaptive phishing detection based on deep representation learning. In: 2020 IEEE International Conference on Web Services (ICWS), pp. 176–180. IEEE (2020)

  44. Yuan, H., Yang, Z., Chen, X., Li, Y., Liu, W.: URL2vec: URL modeling with character embeddings for fast and accurate phishing website detection. In: 2018 IEEE International Conference on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pp. 265–272. IEEE (2018)

  45. Jawade, J.V., Ghosh, S.N.: Phishing website detection using fast.ai library. In: 2021 International Conference on Communication information and Computing Technology (ICCICT), pp. 1–5. IEEE (2021)

  46. Bu, S.J., Cho, S.B.: Integrating deep learning with first-order logic programmed constraints for zero-day phishing attack detection. In: ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2685–2689. IEEE (2021)

  47. Bozkir, A.S., Aydos, M.: LogoSENSE: a companion HOG based logo detection scheme for phishing web page and E-mail brand recognition. Comput. Secur. 95, 101855 (2020)

  48. Feng, T., Yue, C.: Visualizing and interpreting RNN models in URL-based phishing detection. In: Proceedings of the 25th ACM Symposium on Access Control Models and Technologies, pp. 13–24 (2020)

  49. Wei, B., et al.: A deep-learning-driven light-weight phishing detection sensor. Sensors 19, 4258 (2019). https://doi.org/10.3390/s19194258. https://www.mdpi.com/1424-8220/19/19/4258

  50. Haynes, K., Shirazi, H., Ray, I.: Lightweight URL-based phishing detection using natural language processing transformers for mobile devices. Procedia Comput. Sci. 191, 127–134 (2021)

  51. Yu, X.: Phishing websites detection based on hybrid model of deep belief network and support vector machine. In: IOP Conference Series: Earth and Environmental Science, vol. 602, p. 012001. IOP Publishing (2020)

  52. Zhang, X., Shi, D., Zhang, H., Liu, W., Li, R.: Efficient detection of phishing attacks with hybrid neural networks. In: 2018 IEEE 18th International Conference on Communication Technology (ICCT), pp. 844–848. IEEE (2018)

  53. Adebowale, M.A., Lwin, K.T., Hossain, M.A.: Intelligent phishing detection scheme using deep learning algorithms. J. Enterp. Inf. Manage. (2020)

  54. Yang, P., Zhao, G., Zeng, P.: Phishing website detection based on multidimensional features driven by deep learning. IEEE Access 7, 15196–15209 (2019)

  55. Feng, J., Zou, L., Ye, O., Han, J.: Web2vec: phishing webpage detection method based on multidimensional features driven by deep learning. IEEE Access 8, 221214–221224 (2020)

  56. Sumathi, K., Sujatha, V.: Deep learning based-phishing attack detection. Int. J. Recent Technol. Eng. (IJRTE) 8(3) (2019)

  57. Lakshmi, L., Reddy, M.P., Santhaiah, C., Reddy, U.J.: Smart phishing detection in web pages using supervised deep learning classification and optimization technique ADAM. Wireless Pers. Commun. 118(4), 3549–3564 (2021). https://doi.org/10.1007/s11277-021-08196-7

  58. Soon, G.K., Chiang, L.C., On, C.K., Rusli, N.M., Fun, T.S.: Comparison of ensemble simple feedforward neural network and deep learning neural network on phishing detection. In: Alfred, R., Lim, Y., Haviluddin, H., On, C.K. (eds.) Computational Science and Technology. LNEE, vol. 603, pp. 595–604. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-0058-9_57

  59. Su, Y.: Research on website phishing detection based on LSTM RNN. In: 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), vol. 1, pp. 284–288. IEEE (2020)

  60. de Souza, C.H.M., Lemos, M.O.O., da Silva, F.S.D., Alves, R.L.S.: On detecting and mitigating phishing attacks through featureless machine learning techniques. Internet Technol. Lett. 3(1), e135 (2020)

  61. Bartoli, A., De Lorenzo, A., Medvet, E., Tarlao, F.: Personalized, browser-based visual phishing detection based on deep learning. In: Zemmari, A., Mosbah, M., Cuppens-Boulahia, N., Cuppens, F. (eds.) CRiSIS 2018. LNCS, vol. 11391, pp. 80–85. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12143-3_7

  62. Al-Alyan, A., Al-Ahmadi, S.: Robust URL phishing detection based on deep learning. KSII Trans. Internet Inf. Syst. (TIIS) 14(7), 2752–2768 (2020)

  63. Singh, S., Singh, M., Pandey, R.: Phishing detection from URLs using deep learning approach. In: 2020 5th International Conference on Computing, Communication and Security (ICCCS), pp. 1–4. IEEE (2020)

  64. Aljofey, A., Jiang, Q., Qu, Q., Huang, M., Niyigena, J.P.: An effective phishing detection model based on character level convolutional neural network from URL. Electronics 9(9), 1514 (2020)

  65. Yerima, S.Y., Alzaylaee, M.K.: High accuracy phishing detection based on convolutional neural networks. In: 2020 3rd International Conference on Computer Applications & Information Security (ICCAIS), pp. 1–6. IEEE (2020)

  66. Dutta, A.K.: Detecting phishing websites using machine learning technique. PLoS ONE 16(10), e0258361 (2021)

  67. Bahnsen, A.C., Bohorquez, E.C., Villegas, S., Vargas, J., González, F.A.: Classifying phishing URLs using recurrent neural networks. In: 2017 APWG Symposium on Electronic Crime Research (eCrime), pp. 1–8. IEEE (2017)

Corresponding authors

Correspondence to Vivek Sharma or Tzipora Halevi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sharma, V., Halevi, T. (2022). A Survey on Phishing Website Detection Using Deep Neural Networks. In: Stephanidis, C., Antona, M., Ntoa, S., Salvendy, G. (eds) HCI International 2022 – Late Breaking Posters. HCII 2022. Communications in Computer and Information Science, vol 1655. Springer, Cham. https://doi.org/10.1007/978-3-031-19682-9_87

  • DOI: https://doi.org/10.1007/978-3-031-19682-9_87

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19681-2

  • Online ISBN: 978-3-031-19682-9

  • eBook Packages: Computer Science, Computer Science (R0)
