Abstract
Revelations of large-scale electronic surveillance and data mining by governments and corporations have fueled increased adoption of HTTPS. We present a traffic analysis attack against over 6000 webpages spanning the HTTPS deployments of 10 widely used, industry-leading websites in areas such as healthcare, finance, legal services and streaming video. Our attack identifies individual pages within the same website with 90% accuracy, exposing personal details including medical conditions, financial and legal affairs and sexual orientation. We examine evaluation methodology and reveal accuracy variations as large as 17% caused by assumptions affecting caching and cookies. We present a novel defense that reduces attack accuracy to 25% with a 9% traffic increase, and demonstrate significantly increased effectiveness of prior defenses in our evaluation context, which includes enabled caching, user-specific cookies and pages within the same website.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Miller, B., Huang, L., Joseph, A.D., Tygar, J.D. (2014). I Know Why You Went to the Clinic: Risks and Realization of HTTPS Traffic Analysis. In: De Cristofaro, E., Murdoch, S.J. (eds) Privacy Enhancing Technologies. PETS 2014. Lecture Notes in Computer Science, vol 8555. Springer, Cham. https://doi.org/10.1007/978-3-319-08506-7_8
DOI: https://doi.org/10.1007/978-3-319-08506-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08505-0
Online ISBN: 978-3-319-08506-7
eBook Packages: Computer Science (R0)