Validation on Machine Reading Comprehension Software without Annotated Labels: A Property-Based Method (ESEC/FSE 2021 - Research Papers)

Who

Songqiang Chen, Shuo Jin, Xiaoyuan Xie

Track

ESEC/FSE 2021 Research Papers

Time Zone

The program is currently displayed in (GMT+03:00) Athens.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Fri 27 Aug 2021 11:00 - 11:10 - Testing—Testing of Machine Learning Models Chair(s): Chang Xu
Fri 27 Aug 2021 23:00 - 23:10 - Testing—Testing of Machine Learning Models Chair(s): Dan Hao

Abstract

Machine Reading Comprehension (MRC) in Natural Language Processing has seen great progress recently. However, almost all current MRC software is validated with a reference-based method, which requires well-annotated labels for test cases and tests the software by checking the consistency between the labels and the outputs. Labeling MRC test cases can be very costly due to their complexity, which makes reference-based validation hard to extend and insufficient on its own. Furthermore, solely checking this consistency and measuring an overall score may be neither sensible nor flexible enough for assessing language understanding capability. In this paper, we propose a property-based validation method for MRC software, built on Metamorphic Testing, to supplement reference-based validation. It does not refer to labels and hence makes a large amount of data available for testing. In addition, it validates MRC software against various linguistic properties to give a specific and in-depth picture of the linguistic capabilities of MRC software. Comprehensive experimental results show that our method can successfully reveal violations of the target linguistic properties without labels. Moreover, it can reveal problems that have been concealed by traditional validation. Comparison according to the properties provides deeper and more concrete insight into the different language understanding capabilities of MRC software.
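The label-free idea in the abstract can be illustrated with a minimal Python sketch of a metamorphic check. Everything here is an assumption made for illustration and is not taken from the paper: the `MRCModel` interface, the `append_irrelevant_sentence` transformation, the expected relation (the answer should stay the same), and the two toy models. The paper's actual linguistic properties and relations may differ.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: an MRC system maps (passage, question) to an answer string.
MRCModel = Callable[[str, str], str]
Case = Tuple[str, str]


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(answer.lower().split())


def append_irrelevant_sentence(passage: str, question: str) -> Case:
    """Illustrative transformation: add a sentence unrelated to the question.

    Assumed property: the answer should not change after this transformation.
    """
    return passage + " The weather was mild that day.", question


def find_violations(
    model: MRCModel,
    cases: List[Case],
    transform: Callable[[str, str], Case],
) -> List[Tuple[str, str, str, str]]:
    """Compare the answers on source and follow-up inputs; no labels are needed."""
    violations = []
    for passage, question in cases:
        source_answer = model(passage, question)
        new_passage, new_question = transform(passage, question)
        follow_up_answer = model(new_passage, new_question)
        if normalize(source_answer) != normalize(follow_up_answer):
            violations.append((passage, question, source_answer, follow_up_answer))
    return violations


def _sentences(passage: str) -> List[str]:
    return [s.strip() for s in passage.split(".") if s.strip()]


def first_sentence_model(passage: str, question: str) -> str:
    """Toy stand-in for MRC software: always answers with the first sentence."""
    return _sentences(passage)[0]


def last_sentence_model(passage: str, question: str) -> str:
    """Toy stand-in that is sensitive to text appended at the end of the passage."""
    return _sentences(passage)[-1]


cases = [
    ("Wuhan is a city in China. It lies on the Yangtze River.", "Where is Wuhan?"),
]

stable = find_violations(first_sentence_model, cases, append_irrelevant_sentence)
fragile = find_violations(last_sentence_model, cases, append_irrelevant_sentence)
print(len(stable), len(fragile))
```

Neither toy model is checked against a ground-truth answer. The check only asks whether each model behaves consistently under the transformation, which is why unlabeled passages and questions can serve as test inputs.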

DOI

https://doi.org/10.1145/3468264.3468569

Songqiang Chen, Wuhan University, China

Shuo Jin, Wuhan University, China

Xiaoyuan Xie, Wuhan University, China

Session Program

Fri 27 Aug
Displayed time zone: Athens

11:00 - 12:00	Testing—Testing of Machine Learning Models Research Papers / Journal First +12h Chair(s): Chang Xu, Nanjing University

11:00 10m Paper		Validation on Machine Reading Comprehension Software without Annotated Labels: A Property-Based Method Research Papers Songqiang Chen Wuhan University, Shuo Jin Wuhan University, Xiaoyuan Xie Wuhan University
11:10 10m Paper		FLEX: Fixing Flaky Tests in Machine Learning Projects by Updating Assertion Bounds Research Papers Saikat Dutta University of Illinois at Urbana-Champaign, August Shi University of Texas at Austin, Sasa Misailovic University of Illinois at Urbana-Champaign
11:20 10m Paper		Practical Accuracy Estimation for Efficient Deep Neural Network Testing Journal First Junjie Chen Tianjin University, Zhuo Wu Tianjin International Engineering Institute, Tianjin University, Zan Wang Tianjin University, China, Hanmo You College of Intelligence and Computing, Tianjin University, Lingming Zhang University of Illinois at Urbana-Champaign, Ming Yan Tianjin University
11:30 30m Live Q&A		Q&A (Testing—Testing of Machine Learning Models) Research Papers

23:00 - 00:00	Testing—Testing of Machine Learning Models Journal First / Research Papers Chair(s): Dan Hao, Peking University

23:00 10m Paper		Validation on Machine Reading Comprehension Software without Annotated Labels: A Property-Based Method Research Papers Songqiang Chen Wuhan University, Shuo Jin Wuhan University, Xiaoyuan Xie Wuhan University
23:10 10m Paper		FLEX: Fixing Flaky Tests in Machine Learning Projects by Updating Assertion Bounds Research Papers Saikat Dutta University of Illinois at Urbana-Champaign, August Shi University of Texas at Austin, Sasa Misailovic University of Illinois at Urbana-Champaign
23:20 10m Paper		Practical Accuracy Estimation for Efficient Deep Neural Network Testing Journal First Junjie Chen Tianjin University, Zhuo Wu Tianjin International Engineering Institute, Tianjin University, Zan Wang Tianjin University, China, Hanmo You College of Intelligence and Computing, Tianjin University, Lingming Zhang University of Illinois at Urbana-Champaign, Ming Yan Tianjin University
23:30 30m Live Q&A		Q&A (Testing—Testing of Machine Learning Models) Research Papers