Abstract
Today's privacy policies have various deficiencies, including a lack of transparency and a failure to convey information comprehensibly to most Internet users. Meanwhile, existing studies on privacy policies focus only on specific areas of interest and lack an inclusive outlook on the state of privacy policies, owing to differences in privacy policy samples, text properties, measures, methodologies, and backgrounds. This research therefore develops an assessment metric to bridge this gap by integrating the fragmented understanding of privacy policies and exploring potential aspects for evaluating privacy policies that are absent from existing studies. The multifaceted assessment metric developed in this study covers three main aspects: content, text property, and user interface. Through investigation and analysis of Malaysian organizations’ online privacy policies using text processing and clustering analysis methods, this study reveals several trends: (1) the use of jargon in privacy policies is relatively low; (2) privacy policies with higher compliance levels tend to be lengthier and more repetitive, and vice versa; and (3) regardless of compliance level, some privacy policies are not presented in a user-friendly font size. Finally, as an experiment in applying the developed metrics, the results confirm the relevance of the assessment metrics for assessing online privacy policies via text processing and clustering analysis.
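To make the text-property aspect more concrete, the following is a minimal Python sketch of how length, repetitiveness, and jargon use might be measured for a single policy. It is an illustration under stated assumptions, not the study's implementation: the function name `text_properties`, the `JARGON` word list, and the choice of one minus the type-token ratio as a repetitiveness measure are all hypothetical and do not come from the paper.

```python
import re
from collections import Counter

# Hypothetical jargon list, for illustration only; the study's own list is not given here.
JARGON = {"processor", "controller", "pseudonymization", "cookies"}


def text_properties(policy_text, jargon=JARGON):
    """Return simple length, repetitiveness, and jargon measures for one policy."""
    # Lowercase the text and split it into alphabetic word tokens.
    words = re.findall(r"[a-z']+", policy_text.lower())
    total = len(words)
    if total == 0:
        return {"length": 0, "repetitiveness": 0.0, "jargon_ratio": 0.0}
    counts = Counter(words)
    return {
        # Length as a raw word count.
        "length": total,
        # Assumed measure: 1 minus the type-token ratio (higher = more repeated words).
        "repetitiveness": 1 - len(counts) / total,
        # Share of all tokens that appear in the jargon list.
        "jargon_ratio": sum(counts[w] for w in jargon) / total,
    }


if __name__ == "__main__":
    print(text_properties("We share data. We share cookies."))
```

Feature dictionaries of this kind, computed per policy, could then be passed to a clustering step; which features and which clustering algorithm the study actually uses is not stated in the abstract.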
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Low, S.A., Chua, H.N. (2023). Multifaceted Metrics for Assessing Privacy Policies Using Text Processing and Clustering Analysis. In: Kumar, S., Hiranwal, S., Purohit, S.D., Prasad, M. (eds) Proceedings of International Conference on Communication and Computational Technologies. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-3951-8_19
DOI: https://doi.org/10.1007/978-981-19-3951-8_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-3950-1
Online ISBN: 978-981-19-3951-8
eBook Packages: Intelligent Technologies and Robotics (R0)