Understanding and Mitigation of Noise in Crowd-Sourced Relation Classification Dataset

Relation classification (RC), the task of assigning a relation label to a given pair of entities in a sentence, is fundamental to information extraction (IE) systems. The structured triple (subject_entity, relation, object_entity) identified from unstructured text can greatly help in knowledge base completion. This organized relational knowledge can then be used for downstream tasks such as question answering and common-sense reasoning. TACRED, a large RC dataset, has been widely used for benchmarking modern deep neural models. However, RC at a large scale is restricted mainly by the presence of noise in the training dataset. As a result, advanced deep neural models that have shown excellent improvement on other NLP tasks have been held back on RC.
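As an illustration of the triple structure described above, the following is a minimal sketch of how one RC instance could be represented and turned into a (subject_entity, relation, object_entity) triple. The class names, the span convention, the example sentence, and the "worked_with" label are all hypothetical and chosen only for illustration; they are not taken from TACRED or from the thesis.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import NamedTuple


class Triple(NamedTuple):
    """Structured output of relation classification."""
    subject_entity: str
    relation: str
    object_entity: str


@dataclass(frozen=True)
class RCInstance:
    """One sentence with a marked subject and object entity (hypothetical layout)."""
    tokens: tuple[str, ...]
    subj_span: tuple[int, int]  # token indices: start inclusive, end exclusive
    obj_span: tuple[int, int]   # same convention as subj_span

    def entity_text(self, span: tuple[int, int]) -> str:
        """Return the surface text of the tokens covered by a span."""
        start, end = span
        return " ".join(self.tokens[start:end])


def to_triple(instance: RCInstance, relation: str) -> Triple:
    """Combine the entity pair with a relation label into a triple.

    The label would normally come from a classifier or an annotator;
    here it is simply passed in.
    """
    return Triple(
        subject_entity=instance.entity_text(instance.subj_span),
        relation=relation,
        object_entity=instance.entity_text(instance.obj_span),
    )


# Illustrative sentence and label, not drawn from any dataset.
example = RCInstance(
    tokens=("Ada", "Lovelace", "worked", "with", "Charles", "Babbage", "."),
    subj_span=(0, 2),
    obj_span=(4, 6),
)
triple = to_triple(example, "worked_with")
```

In this framing, a noisy training example is one where the label passed to `to_triple` does not match the relation the sentence actually expresses.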
Supervisors: Awekar, Amit and Anand, Ashish
Keywords: Information Extraction, Relation Classification, Learning from Noisy Dataset