Abstract
In this paper we examine crowdsourcing as a means of handling Linked Data quality problems that are challenging to solve automatically. We analyzed the most common errors encountered in Linked Data sources and classified them according to the extent to which they are likely to be amenable to a specific form of crowdsourcing. Based on this analysis, we implemented a quality assessment methodology for Linked Data that leverages the wisdom of the crowds in two ways: (i) a contest targeting an expert crowd of researchers and Linked Data enthusiasts, complemented by (ii) paid microtasks published on Amazon Mechanical Turk. We empirically evaluated how efficiently this methodology could spot quality issues in DBpedia. We also investigated how the contributions of the two types of crowds could be optimally integrated into Linked Data curation processes. The results show that the two styles of crowdsourcing are complementary and that crowdsourcing-enabled quality assessment is a promising and affordable way to enhance the quality of Linked Data.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Acosta, M., Zaveri, A., Simperl, E., Kontokostas, D., Auer, S., Lehmann, J. (2013). Crowdsourcing Linked Data Quality Assessment. In: Alani, H., et al. The Semantic Web – ISWC 2013. ISWC 2013. Lecture Notes in Computer Science, vol 8219. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41338-4_17
DOI: https://doi.org/10.1007/978-3-642-41338-4_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41337-7
Online ISBN: 978-3-642-41338-4
eBook Packages: Computer Science (R0)