Abstract
In recent years, PowerShell has been reported in a growing variety of cyber attacks. However, because the PowerShell language is dynamic by design and can construct script fragments at different levels, state-of-the-art PowerShell attack detection approaches based on static analysis are inherently vulnerable to obfuscation. In this paper, we design the first generic, effective, and lightweight deobfuscation approach for PowerShell scripts. To precisely identify obfuscated script fragments, we define obfuscation in terms of its impact on the abstract syntax trees of PowerShell scripts and propose a novel emulation-based recovery technique. Furthermore, we design the first semantic-aware PowerShell attack detection system, which leverages the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures. Experimental results on 2342 benign samples and 4141 malicious samples show that our deobfuscation method takes less than 0.5 s on average and increases the similarity between the obfuscated and original scripts from 0.5% to 93.2%. With our deobfuscation method deployed, the attack detection rates of Windows Defender and VirusTotal increase substantially, from 0.33% and 2.65% to 78.9% and 94.0%, respectively. Moreover, our detection system outperforms both existing tools, with a 96.7% true positive rate and a 0% false positive rate on average.
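To make the idea of emulation-based recovery concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes only one obfuscation pattern (concatenation of single-quoted string literals) and uses a regular expression as a stand-in for the AST-based identification of obfuscated fragments described above. The function names `fold_concatenations` and `_evaluate` are hypothetical.

```python
import re

# A single-quoted PowerShell string literal; a quote inside such a string
# is escaped by doubling it ('').
_LITERAL = r"'(?:[^']|'')*'"
# Two or more literals joined by '+', e.g. 'Down'+'load'+'String'.
# Assumption: this regex stands in for locating a recoverable AST subtree.
_CONCAT = re.compile(rf"{_LITERAL}(?:\s*\+\s*{_LITERAL})+")


def _evaluate(fragment: str) -> str:
    """Emulate the fragment: compute the string it would yield at run time."""
    parts = re.findall(_LITERAL, fragment)
    value = "".join(p[1:-1].replace("''", "'") for p in parts)
    # Re-emit the result as one PowerShell single-quoted literal.
    return "'" + value.replace("'", "''") + "'"


def fold_concatenations(script: str) -> str:
    """Replace each recoverable concatenation with its evaluated result."""
    return _CONCAT.sub(lambda m: _evaluate(m.group(0)), script)


obfuscated = "(New-Object Net.WebClient).('Down'+'load'+'String')('http://example.test/a')"
print(fold_concatenations(obfuscated))
```

The recovered script exposes the method name `DownloadString` as a single token, which is the kind of change that lets signature-based detectors match again. A real deobfuscator would have to handle many more fragment types and decide which subtrees are safe to evaluate.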
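Abstract (translated from the Chinese)
In recent years, PowerShell attacks have been reported with increasing frequency. However, because the PowerShell language is dynamic and can construct script fragments at different levels, even PowerShell attack detection methods based on state-of-the-art static script analysis are inherently susceptible to obfuscation. This paper designs a generic, effective, and lightweight deobfuscation method for PowerShell scripts. First, to precisely identify obfuscated script fragments, we propose a novel method for detecting obfuscated fragments based on the impact of obfuscation techniques on the PowerShell abstract syntax tree, and on this basis propose an emulation-based recovery technique. In addition, we design a semantic-aware PowerShell attack detection system that uses the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures for malicious script detection. Experimental results on 2342 benign samples and 4141 malicious samples show that the proposed deobfuscation method takes less than 0.5 s on average and raises the similarity between obfuscated and original scripts from 0.5% to 93.2%. With this deobfuscation method, the attack detection rates of Windows Defender and VirusTotal rise from 0.33% and 2.65% to 78.9% and 94.0%, respectively. The experiments also show that our detection system outperforms both existing tools (average true positive rate of 96.7% and false positive rate of 0%).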
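The signature-mining step can be illustrated with a simplified sketch. This is not the objective-oriented association mining algorithm used in the paper; it is a brute-force stand-in that assumes each script has already been reduced to a set of behaviour tokens, and it keeps item sets that are frequent among malicious samples and absent from benign ones. The function name `mine_signatures`, its thresholds, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_signatures(malicious, benign, min_support=0.5,
                    min_confidence=1.0, max_size=2):
    """Return item sets frequent in malicious samples and rare in benign ones.

    support    = fraction of malicious samples containing the item set
    confidence = malicious hits / (malicious hits + benign hits)
    """
    def count(samples):
        counts = Counter()
        for items in samples:
            ordered = sorted(set(items))
            for size in range(1, max_size + 1):
                counts.update(combinations(ordered, size))
        return counts

    mal_counts, ben_counts = count(malicious), count(benign)
    signatures = []
    for itemset, hits in mal_counts.items():
        support = hits / len(malicious)
        confidence = hits / (hits + ben_counts[itemset])
        if support >= min_support and confidence >= min_confidence:
            signatures.append((itemset, support, confidence))
    # Highest support first, ties broken alphabetically for stable output.
    return sorted(signatures, key=lambda s: (-s[1], s[0]))


# Hypothetical token sets, for illustration only.
malicious = [
    {"New-Object", "DownloadString", "Invoke-Expression"},
    {"New-Object", "DownloadString", "Invoke-Expression", "Start-Sleep"},
    {"New-Object", "DownloadFile", "Start-Process"},
]
benign = [
    {"New-Object", "Get-ChildItem"},
    {"Invoke-Expression", "Get-Date"},
    {"Start-Sleep", "Write-Output"},
]
for itemset, support, confidence in mine_signatures(malicious, benign):
    print(itemset, round(support, 2), confidence)
```

In this toy data, `New-Object` and `Invoke-Expression` each occur in benign scripts on their own, yet the pair occurs only in malicious ones. Combinations of this kind are what a semantic signature is meant to capture.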
Author information
Contributions
Chunlin XIONG and Zhenyuan LI designed the research. Tiantian ZHU and Hai YANG investigated the background. Jian WANG and Hai YANG processed the data. Zhenyuan LI and Chunlin XIONG drafted the paper. Wei RUAN and Tiantian ZHU helped organize the paper. Yan CHEN, Tiantian ZHU, and Wei RUAN revised and finalized the paper.
Ethics declarations
Chunlin XIONG, Zhenyuan LI, Yan CHEN, Tiantian ZHU, Jian WANG, Hai YANG, and Wei RUAN declare that they have no conflict of interest.
Additional information
Project supported by the National Natural Science Foundation of China (No. U1936215).
About this article
Cite this article
Xiong, C., Li, Z., Chen, Y. et al. Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts. Front Inform Technol Electron Eng 23, 361–381 (2022). https://doi.org/10.1631/FITEE.2000436
DOI: https://doi.org/10.1631/FITEE.2000436