Abstract
Malicious web pages are among the major security threats on the Web. Most existing techniques for detecting them focus on specific attacks. Unfortunately, attacks are becoming more complex, with attackers blending techniques to evade existing countermeasures. In this paper, we present BINSPECT, a holistic yet lightweight approach that combines static analysis with minimalistic emulation and applies supervised learning to detect malicious web pages involved in drive-by download, phishing, injection, and malware distribution. BINSPECT introduces new features that effectively discriminate between malicious and benign web pages. In a large-scale experimental evaluation, BINSPECT achieved above 97% accuracy with low false signals. Moreover, BINSPECT takes 3-5 seconds to analyze a single web page, suggesting that the approach is practical for real-life deployment.
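To make the general idea of learning from statically extracted features concrete, the following is a minimal sketch and not BINSPECT's actual pipeline. The four URL features, the nearest-centroid classifier, and all function names and example URLs are assumptions chosen for illustration; the abstract does not list the features or classifiers used, and the emulation step is not modeled here.

```python
import re
from urllib.parse import urlparse

# Hypothetical lexical URL features, chosen only for illustration.
def url_features(url):
    host = urlparse(url).hostname or ""
    is_ipv4 = re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None
    return [
        float(len(url)),                         # total URL length
        float(host.count(".")),                  # number of dots in the host name
        1.0 if is_ipv4 else 0.0,                 # host is a raw IPv4 address
        float(sum(ch.isdigit() for ch in url)),  # number of digits in the URL
    ]

def centroid(vectors):
    # Component-wise mean of a list of equal-length feature vectors.
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train(samples):
    # samples: iterable of (url, label) pairs; returns {label: centroid}.
    by_label = {}
    for url, label in samples:
        by_label.setdefault(label, []).append(url_features(url))
    return {label: centroid(vectors) for label, vectors in by_label.items()}

def predict(model, url):
    # Assign the label whose centroid is closest in squared Euclidean distance.
    x = url_features(url)
    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: squared_distance(model[label]))

# Toy, made-up training data (reserved example domains and addresses).
TRAINING = [
    ("http://example.com/", "benign"),
    ("http://example.org/about", "benign"),
    ("http://192.0.2.10/login/update/account/verify.php?id=1234567890", "malicious"),
    ("http://secure.login.example.test.invalid/confirm/0987654321/index.html", "malicious"),
]

model = train(TRAINING)
```

Calling `predict(model, url)` on an unseen URL returns either `"benign"` or `"malicious"`. The features are left unscaled for brevity, so URL length dominates the distance; a realistic system would normalize features, use far more training data, and evaluate on held-out pages.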
Copyright information
© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Eshete, B., Villafiorita, A., Weldemariam, K. (2013). BINSPECT: Holistic Analysis and Detection of Malicious Web Pages. In: Keromytis, A.D., Di Pietro, R. (eds) Security and Privacy in Communication Networks. SecureComm 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36883-7_10
DOI: https://doi.org/10.1007/978-3-642-36883-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36882-0
Online ISBN: 978-3-642-36883-7
eBook Packages: Computer Science (R0)