Blocking

Christophides, Vassilis; Efthymiou, Vasilis; Stefanidis, Kostas

doi:10.1007/978-3-031-79468-1_3


Part of the book series: Synthesis Lectures on Data, Semantics, and Knowledge (SLDSK)


Abstract

As we have seen in Chapter 2.1, grouping entity descriptions into blocks before comparing them for matching is an important pre-processing step for pruning the quadratic number of comparisons required to resolve a collection of entity descriptions. The main objective of algorithms for entity blocking, formally defined in Section 3.1, is to achieve a reasonable compromise between the number of comparisons suggested and the number of missed entity matches. In Section 3.2, we briefly present traditional blocking algorithms proposed for relational records and explain why they cannot be used in the Web of Data. Then, in Section 3.3, we detail a family of algorithms that rely on a simple inverted index of entity descriptions, built from the tokens of their attribute values: two descriptions are placed into the same block if they share at least one common token. As we will see in Section 3.4, post-processing the blocks of the inverted index enables a more precise similarity comparison (e.g., Jaccard) of two entity descriptions and thus further reduces the number of entity pairs that need to be compared.
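To make the inverted-index idea concrete, here is a minimal Python sketch of token-based blocking followed by a Jaccard-based pruning step. It is an illustration under stated assumptions, not the chapter's own algorithms: the function names (`tokenize`, `token_blocking`, `candidate_pairs`, `prune_by_jaccard`), the toy descriptions, and the 0.3 threshold are all hypothetical, and the pruning step is only a simple stand-in for the kind of block post-processing discussed in Section 3.4.

```python
from collections import defaultdict
from itertools import combinations
import re


def tokenize(description):
    """Return the set of lowercase alphanumeric tokens over all attribute values."""
    tokens = set()
    for value in description.values():
        tokens.update(re.findall(r"[a-z0-9]+", str(value).lower()))
    return tokens


def token_blocking(descriptions):
    """Build an inverted index: token -> ids of the descriptions containing it."""
    index = defaultdict(set)
    for entity_id, description in descriptions.items():
        for token in tokenize(description):
            index[token].add(entity_id)
    # A block holding a single description suggests no comparison, so drop it.
    return {token: ids for token, ids in index.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prune_by_jaccard(descriptions, pairs, threshold):
    """Keep only candidate pairs whose token sets are similar enough (assumed rule)."""
    token_sets = {eid: tokenize(d) for eid, d in descriptions.items()}
    return {(a, b) for a, b in pairs
            if jaccard(token_sets[a], token_sets[b]) >= threshold}


# Toy, schema-free descriptions (hypothetical data): attribute names differ,
# so blocking relies only on the tokens of the attribute values.
descriptions = {
    "e1": {"name": "Stanley Kubrick", "born": "New York"},
    "e2": {"label": "S. Kubrick", "birthPlace": "New York City"},
    "e3": {"title": "A Clockwork Orange", "director": "Kubrick"},
    "e4": {"name": "Orange County"},
}

blocks = token_blocking(descriptions)
pairs = candidate_pairs(blocks)
kept = prune_by_jaccard(descriptions, pairs, threshold=0.3)
```

On this toy input, blocking suggests four of the six possible pairs, and the pruning step keeps only `("e1", "e2")`. The sketch ignores attribute names entirely, which is what lets it cope with descriptions that do not share a schema. The price is that frequent tokens produce large blocks, and that is the motivation for a post-processing step.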


Author information

Authors and Affiliations

University of Crete, Greece
Vassilis Christophides & Vasilis Efthymiou
INRIA, France
Vassilis Christophides
ICS-FORTH, Greece
Vasilis Efthymiou & Kostas Stefanidis



About this chapter

Cite this chapter

Christophides, V., Efthymiou, V., Stefanidis, K. (2015). Blocking. In: Entity Resolution in the Web of Data. Synthesis Lectures on Data, Semantics, and Knowledge. Springer, Cham. https://doi.org/10.1007/978-3-031-79468-1_3


DOI: https://doi.org/10.1007/978-3-031-79468-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-79467-4
Online ISBN: 978-3-031-79468-1
eBook Packages: Synthesis Collection of Technology (R0), eBColl Synthesis Collection 6
