Abstract
Data mining now plays an important role in decision making. Because many organizations lack in-house data mining expertise, it is beneficial for them to outsource data mining tasks to external service providers. However, most organizations hesitate to do so out of concern over the loss of business intelligence and customer privacy. In this paper, we present a Bloom filter based solution that enables organizations to outsource the mining of association rules while protecting their business intelligence and customer privacy. Our approach can achieve high precision in data mining at the cost of additional storage.
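To make the precision-versus-storage trade-off concrete, the following is a minimal sketch of a generic Bloom filter, not the scheme proposed in the paper. The class name `BloomFilter`, the helper `expected_false_positive_rate`, the use of salted SHA-256 as the hash family, and the parameter choices are all assumptions made for illustration. It relies only on the standard approximation that a filter with m bits, k hash functions, and n inserted items has a false positive rate of about (1 - e^(-kn/m))^k.

```python
import hashlib
import math


class BloomFilter:
    """Generic Bloom filter with m bits and k hash functions (illustrative only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def expected_false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    # Hypothetical item names; more bits per item lowers the expected error.
    items = [f"item{i}" for i in range(1000)]
    for m in (4_000, 16_000, 64_000):
        bf = BloomFilter(m, k=4)
        for it in items:
            bf.add(it)
        print(m, expected_false_positive_rate(m, 4, len(items)))
```

A membership test never misses an inserted item, so errors are one-sided, and enlarging m for a fixed number of items reduces the expected false positive rate. This is the general sense in which a Bloom filter buys precision with storage.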
Additional information
This research was supported by the USA National Science Foundation Grants CCR-0310974 and IIS-0546027.
Cite this article
Qiu, L., Li, Y. & Wu, X. Protecting business intelligence and customer privacy while outsourcing data mining tasks. Knowl Inf Syst 17, 99–120 (2008). https://doi.org/10.1007/s10115-007-0113-3
DOI: https://doi.org/10.1007/s10115-007-0113-3