Thesis / Dissertation

Statistical Approaches for Entity Resolution under Uncertainty

Neil Grant Marchant, Benjamin Rubinstein (ed.)

Published: 2020


When real-world entities are referenced in data, their identities are often obscured. This presents a problem for data cleaning and integration, as references to an entity may be scattered across multiple records or sources, without a means to identify and consolidate them. Entity resolution (ER; also known as record linkage and deduplication) seeks to address this problem by linking references to the same entity, based on imprecise information. It has diverse applications: from resolving references to individuals in administrative data for public health research, to resolving product listings on the web for a shopping aggregation service. While many methods have been developed to automate t..
