Master Thesis Defense - Xueying Wang

Start: 3/18/2019 at 3:00PM
End: 3/18/2019 at 6:00PM
Location: 165 Fitzpatrick Hall

Xueying Wang

Master Thesis Defense

March 18, 2019, 3:00 pm, 165 Fitzpatrick

Adviser:  Dr. Meng Jiang

Committee Members:

Dr. David Chiang, Dr. Taeho Jung, Dr. Tim Weninger

Title:

Improving Information Extraction via Truth Finding with Data-Driven Commonsense 

Abstract

The task of temporal slot filling (TSF) is to extract, from text data, the values of specific attributes of a given entity, called “facts”, together with the temporal tags of those facts. While existing work denotes a temporal tag as a single time slot, in this work we introduce and study the task of Precise TSF (PTSF), which fills two precise temporal slots: the beginning and ending time points of a fact. In our observation of a news corpus, fewer than 0.1% of facts have time expressions that contain both points, and an article's post time, though often available, indicates when a fact was valid less precisely than a time expression does. Therefore, neither directly decomposing the time expressions nor using an arbitrary post-time period can provide accurate results for PTSF. The challenge of PTSF lies in finding precise time tags in the noisy and incomplete temporal contexts of the text. To address this challenge, we propose an unsupervised approach based on the philosophy of truth finding. The approach has two modules that mutually enhance each other: one estimates the reliability of fact extractors conditioned on the temporal contexts; the other estimates the trustworthiness of facts based on the extractors' reliability. Commonsense knowledge (e.g., a country has only one president at any given time) is automatically generated from the data and used to infer false claims from trustworthy facts. For evaluation, we manually collect hundreds of temporal facts from Wikipedia as ground truth, including countries' presidential terms and sports teams' player career histories. Experiments on a large news dataset demonstrate the accuracy and efficiency of the proposed algorithm. Furthermore, we explore another truth-finding approach, a method based on probabilistic graphical models, for solving the same problem.
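The mutual reinforcement between the two modules can be illustrated with a generic truth-finding iteration. The sketch below is not the algorithm from the thesis; it is a minimal illustration under stated assumptions: each "source" stands for an extractor paired with a temporal-context type, each "slot" stands for an attribute at a point in time (for example, a country's president in a given year), and the single-value constraint is hard-coded rather than learned from data. The function name `truth_finding` and all parameter names are hypothetical.

```python
from collections import defaultdict


def truth_finding(claims, iterations=20, init_reliability=0.5):
    """Toy truth-finding loop (illustrative only, not the thesis method).

    claims: iterable of (source, slot, value) triples, where
      source is a hypothetical key such as (extractor, context type),
      slot is a hypothetical key such as (country, year).
    Returns (reliability per source, trust per (slot, value) fact).
    """
    sources_of = defaultdict(set)  # fact -> sources asserting it
    facts_of = defaultdict(set)    # source -> facts it asserts
    values_of = defaultdict(set)   # slot -> competing values
    for source, slot, value in set(claims):
        fact = (slot, value)
        sources_of[fact].add(source)
        facts_of[source].add(fact)
        values_of[slot].add(value)

    reliability = {s: init_reliability for s in facts_of}
    trust = {}
    for _ in range(iterations):
        # Fact step: score each fact by the reliability of its sources.
        score = {f: sum(reliability[s] for s in srcs)
                 for f, srcs in sources_of.items()}
        # Assumed constraint: a slot holds one value, so the values
        # claimed for the same slot compete and their trust sums to 1.
        for slot, values in values_of.items():
            total = sum(score[(slot, v)] for v in values)
            for v in values:
                trust[(slot, v)] = score[(slot, v)] / total if total > 0 else 0.0
        # Source step: a source is as reliable as its facts are trusted.
        reliability = {s: sum(trust[f] for f in fs) / len(fs)
                       for s, fs in facts_of.items()}
    return reliability, trust
```

With made-up claims in which sources A and B agree on both slots and source C disagrees, the loop raises the reliability of A and B and lowers the trust in the values that only C asserts. The real task also has to estimate begin and end time points, which this sketch leaves out.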