Coreference Reasoning in Machine Reading Comprehension

This post, leveraging AI, summarizes and analyzes the key aspects of the research paper “Coreference Reasoning in Machine Reading Comprehension”. For in-depth information, please refer to the original PDF.

📄 Original PDF: Download / View Fullscreen

English Summary

Machine reading comprehension (MRC), also known as machine reading ability, is a crucial skill that enables machines to answer questions based on passages. However, Quoref datasets do not necessarily reflect the challenges of coreference reasoning for MRC models. In this paper, we propose two solutions to address this issue: creating MRC datasets that better reﬂect the challenges of coreference reasoning and using existing coreference resolution datasets for training machines in reading comprehension tasks. Our approach helps overcome artifacts present in Quoref datasets which can affect machine performance on QA pairs.

Key Technical Terms

Below are key technical terms and their explanations to help understand the core concepts of this paper. You can explore related external resources via the links next to each term.

Machine Reading Comprehension (MRC) [Wikipedia (Ko)] [Wikipedia (En)] [나무위키] [Google Scholar] [Nature] [ScienceDirect] [PubMed]
Explanation: The ability of machines to read and understand passages, answering questions based on the text content.
Coreference Resolution Datasets [Wikipedia (Ko)] [Wikipedia (En)] [나무위키] [Google Scholar] [Nature] [ScienceDirect] [PubMed]
Explanation: Collections that enable machines to identify different expressions referring to the same entity in a passage. These datasets are crucial for creating MRC datasets that better reﬂect coreference reasoning challenges.
Quoref Dataset [Wikipedia (Ko)] [Wikipedia (En)] [나무위키] [Google Scholar] [Nature] [ScienceDirect] [PubMed]
Explanation: A dataset created by annotators using existing coreference resolution techniques, focusing on answering questions without performing coreference reasoning.

View Original Excerpt (English)

Coreference Reasoning in Machine Reading Comprehension Mingzhu Wu1, Naﬁse Sadat Moosavi1, Dan Roth2, Iryna Gurevych1 1UKP Lab, Technische Universit¨at Darmstadt 2Department of Computer and Information Science, UPenn 1https://www.ukp.tu-darmstadt.de 2 https://www.seas.upenn.edu/˜danroth/ Abstract improves the performance on some coreference- related datasets (Wu et al., 2020b; Aralikatte et al., Coreference resolution is essential for natu- 2019). There are also various datasets for the task ral language understanding and has been long of reading comprehension on which the model studied in NLP. In recent years, as the for- requires to perform coreference reasoning to an- mat of Question Answering (QA) became a swer some of the questions, e.g., DROP (Dua2021 standard for machine reading comprehension (MRC), there have been data collection efforts, et al., 2019), DuoRC (Saha et al., 2018), MultiRC e.g., Dasigi et al. (2019), that attempt to evalu- (Khashabi et al., 2018), etc.Jun ate the ability of MRC models to reason about Quoref (Dasigi et al., 2019) is a dataset that is 9 coreference. However, as we show, coref- particularly designed for evaluating coreference erence reasoning in MRC is a greater chal- understanding of MRC models. Figure 1 shows a lenge than earlier thought; MRC datasets do QA sample from Quoref in which the model needs not reﬂect the natural distribution and, conse- to resolve the coreference relation between “his” quently, the challenges of coreference reason- ing. Speciﬁcally, success on these datasets and “John Motteux” to answer the question.[cs.CL] does not reﬂect a model’s proﬁciency in coref- erence reasoning. We propose a methodol- ogy for creating MRC datasets that better re- ﬂect the challenges of coreference reasoning and use it to create a sample evaluation set. The results on our dataset show that state-of- the-art models still struggle with these phe- nomena. Furthermore, we develop an effec- Figure 1: A sample from…

🇰🇷 한국어 보기 (View in Korean)

한글 요약 (Korean Summary)

기계 판독 능력으로도 알려진 기계 판독 이해 (MRC)는 기계가 구절에 따라 질문에 답변 할 수있는 중요한 기술입니다. 그러나 Quoref 데이터 세트가 반드시 MRC 모델에 대한 코퍼레이션 추론의 문제를 반영하지는 않습니다. 이 논문에서는이 문제를 해결하기위한 두 가지 솔루션을 제안합니다. Correference 추론의 문제를 더 잘 반영하는 MRC 데이터 세트를 작성하고 이해 작업을 읽는 데있어 훈련 기계에 기존 Correference 해상도 데이터 세트를 사용합니다. 우리의 접근 방식은 QA 쌍의 기계 성능에 영향을 줄 수있는 Quoref 데이터 세트에 존재하는 아티팩트를 극복하는 데 도움이됩니다.

주요 기술 용어 (한글 설명)

Machine Reading Comprehension (MRC)
설명 (Korean): 기계가 구절을 읽고 이해하는 능력, 텍스트 내용에 따라 질문에 대답합니다.
(Original English: The ability of machines to read and understand passages, answering questions based on the text content.)
Coreference Resolution Datasets
설명 (Korean): 기계가 구절에서 동일한 엔티티를 언급하는 다른 표현식을 식별 할 수있는 컬렉션. 이러한 데이터 세트는 Correference 추론 문제를 더 잘 반영하는 MRC 데이터 세트를 작성하는 데 중요합니다.
(Original English: Collections that enable machines to identify different expressions referring to the same entity in a passage. These datasets are crucial for creating MRC datasets that better reﬂect coreference reasoning challenges.)
Quoref Dataset
설명 (Korean): 기존 Coreference 해상도 기술을 사용하여 주석기가 만든 데이터 세트는 Coreference 추론을 수행하지 않고 질문에 대답하는 데 중점을 둡니다.
(Original English: A dataset created by annotators using existing coreference resolution techniques, focusing on answering questions without performing coreference reasoning.)

발췌문 한글 번역 (Korean Translation of Excerpt)

기계 읽기 이해 mingzhu wu1, nafe sadat moosavi1, dan roth2, iryna gurevych1 1ukp lab, technische Universit at darmstadt 2department, 컴퓨터 및 정보 과학, Upenn 1https : //www.ukp.tu-darmstadt.de 2 https://www.seas.upenn.edu/ ~ danroth/ Abstract는 일부 심판 관련 데이터 세트의 성능을 향상시킵니다 (Wu et al., 2020b; Aralikatte et al., Correference 해상도는 Natu- 2019에 필수적입니다). 작업 RAL 언어 이해를위한 다양한 데이터 세트도 있으며 NLP에서 연구 된 모델이 이해 된 독해력이 오래되었습니다. In recent years, as the for- requires to perform coreference reasoning to an- mat of Question Answering (QA) became a swer some of the questions, e.g., DROP (Dua2021 standard for machine reading comprehension (MRC), there have been data collection efforts, et al., 2019), DuoRC (Saha et al., 2018), MultiRC e.g., Dasigi et al. (2019), 그 시도 (Khashabi et al., 2018) 등. 그러나 우리가 보여 주듯이, Coref- 특히 MRC에서 Coreference Erence 추론을 평가하도록 설계된 Coref-는 MRC 모델에 대한 더 큰 이해입니다. 그림 1은 이전의 생각보다 lenge를 보여줍니다. MRC 데이터 세트는 Quoref의 QA 샘플을 수행하여 모델이 자연 분포를 반영 할 필요가 없으며, “그의”Quely, Correference Resource의 도전 과제 사이의 핵심 관계를 해결합니다. 구체적으로,이 데이터 세트의 성공과 “John Motteux”의 성공은 질문에 답변합니다. [cs.cl]은 핵심 추론에서 모델의 전문성을 반영하지 않습니다. 우리는 Correference 추론의 문제를 더 잘 색치고이를 사용하여 샘플 평가 세트를 생성하는 MRC 데이터 세트를 작성하는 방법을 제안합니다. 우리의 데이터 세트의 결과는 최첨단 모델이 여전히 이러한 현상과 어려움을 겪고 있음을 보여줍니다. 또한, 우리는 effec- 그림 1 : 샘플을 개발합니다.

Source: arXiv.org (or the original source of the paper)

ilikeafrica.com

Coreference Reasoning in Machine Reading Comprehension

English Summary

Key Technical Terms

한글 요약 (Korean Summary)

주요 기술 용어 (한글 설명)

발췌문 한글 번역 (Korean Translation of Excerpt)

이것이 좋아요:

Comments

답글 남기기 응답 취소

Coreference Reasoning in Machine Reading Comprehension

English Summary

Key Technical Terms

한글 요약 (Korean Summary)

주요 기술 용어 (한글 설명)

발췌문 한글 번역 (Korean Translation of Excerpt)

이 글 공유하기:

이것이 좋아요:

Comments

답글 남기기 응답 취소