Abstract
A website fingerprinting (WF) attack is a traffic analysis technique that extracts a unique fingerprint from the traffic generated by visiting a website, showing that the privacy protection provided by HTTPS remains fragile. Prior WF attacks extract fingerprints from the Web traffic of the first TCP flow, so frequent website updates easily degrade them. We observe that a website can still be identified accurately by fingerprinting the resource loading sequence generated by multiple TCP flows. We record the multiple TCP flows produced during a website visit and analyse their traffic structure. We find that, despite updates to the website, the way its TCP flows are established usually stays unchanged, so the sequence of TCP flows can serve as a fingerprint. Hence, we use multiple TCP flows for website fingerprinting attacks and demonstrate their high accuracy in recognizing a website even under HTTPS protection. We collect data from 20 websites over a span of six months and show that the accuracy and robustness are significantly higher than those of state-of-the-art WF solutions.
Notes
The browser is Chrome (Version 101.0.1210.47), and the visit took place on 25 July 2022.
We construct RLSeq with a sliding window; that is, a fresh RLSeq structure is constructed each time a new flow arrives.
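The sliding-window construction in the note above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the `Flow` fields, the function name `sliding_rlseq`, and the fixed `window` size are all assumptions, since the note does not say what an RLSeq element contains or how large the window is.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Flow:
    """Hypothetical summary of one TCP flow (all fields are assumptions)."""
    start_time: float  # seconds since the visit started
    server: str        # remote endpoint, e.g. "ip:port"
    total_bytes: int   # bytes carried by the flow


def sliding_rlseq(flows: Iterable[Flow], window: int) -> Iterator[Tuple[Flow, ...]]:
    """Yield a fresh sequence each time a new flow arrives.

    Each yielded tuple holds at most `window` of the most recent flows,
    ordered by start time. The bounded window is an assumption.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent: Deque[Flow] = deque(maxlen=window)
    for flow in sorted(flows, key=lambda f: f.start_time):
        recent.append(flow)   # the oldest flow drops out once the window is full
        yield tuple(recent)   # a new sequence snapshot for every arriving flow


if __name__ == "__main__":
    demo = [
        Flow(0.30, "203.0.113.7:443", 5200),
        Flow(0.10, "198.51.100.2:443", 1800),
        Flow(0.20, "198.51.100.9:443", 900),
    ]
    for seq in sliding_rlseq(demo, window=2):
        print([f.server for f in seq])
```

Each yielded tuple could then be turned into whatever feature representation the classifier expects; that step is left out here because the note does not describe it.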
Funding
This work is supported in part by the National Key Research and Development Program of China No. 2019QY1301; the NSFC-General Technology Basic Research Joint Funds under Grant U1836214; NSFC-61872265; the New Generation of Artificial Intelligence Science and Technology Major Project of Tianjin under 19ZXZNGX00010.
Author information
Contributions
Changzhi Li prepared the experiments and figures; Lihai Nie prepared the main manuscript text; all authors reviewed the manuscript.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Changzhi Li and Lihai Nie contributed equally to this work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, C., Nie, L., Zhao, L. et al. Robust website fingerprinting through resource loading sequence. World Wide Web 26, 2329–2349 (2023). https://doi.org/10.1007/s11280-023-01138-2
DOI: https://doi.org/10.1007/s11280-023-01138-2