Abstract
This paper studies the problem of integrating probabilistic uncertain information. Certain constraints are imposed by the semantics of integration, but there is no guarantee that they are satisfied in practical situations. We present a Bayesian-based approach to revise the probability distribution of the information in the sources in a systematic way to remedy this difficulty. The revision step is similar in spirit to tasks like data cleaning and record linkage and should be carried out before integration can be achieved for probabilistic uncertain data.
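As a rough illustration of what revising a probability distribution over possible worlds can look like, the sketch below applies Jeffrey's rule of conditioning, a standard Bayesian-style update. It is an assumption chosen for illustration and is not claimed to be the procedure developed in the paper. The function name `jeffrey_revise`, the world labels, and all numbers are hypothetical.

```python
from typing import Dict, List, Set


def jeffrey_revise(
    prior: Dict[str, float],
    partition: List[Set[str]],
    target: List[float],
) -> Dict[str, float]:
    """Revise `prior` so that each block of `partition` gets the mass in `target`.

    Jeffrey's rule: for a world w in block E_i,
        P'(w) = P(w) * q_i / P(E_i),
    which keeps the relative weights of worlds inside each block unchanged.
    """
    revised: Dict[str, float] = {}
    for block, q in zip(partition, target):
        mass = sum(prior[w] for w in block)  # prior probability of the block
        if mass == 0.0:
            if q > 0.0:
                raise ValueError("cannot move mass onto a block with zero prior")
            for w in block:
                revised[w] = 0.0
            continue
        for w in block:
            revised[w] = prior[w] * q / mass
    return revised


# Hypothetical source with three possible worlds (illustrative numbers only).
prior = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
# Assumed constraint: the block {w1, w2} must carry probability 0.6, {w3} 0.4.
partition = [{"w1", "w2"}, {"w3"}]
target = [0.6, 0.4]

revised = jeffrey_revise(prior, partition, target)
```

The revised distribution still sums to one, and worlds within the same block keep their original ratio, which is the property that makes this kind of update a minimal change to the source's beliefs.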
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sadri, F. (2015). Belief Revision in Uncertain Data Integration. In: Sharaf, M., Cheema, M., Qi, J. (eds) Databases Theory and Applications. ADC 2015. Lecture Notes in Computer Science, vol 9093. Springer, Cham. https://doi.org/10.1007/978-3-319-19548-3_7
DOI: https://doi.org/10.1007/978-3-319-19548-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19547-6
Online ISBN: 978-3-319-19548-3
eBook Packages: Computer Science, Computer Science (R0)