SynGuar: Guaranteeing Generalization in Programming by Example
Wed 25 Aug 2021 20:10 - 20:20 - SE & AI—Machine Learning for Software Engineering 1 Chair(s): Kelly Lyons, Phuong T. Nguyen
Programming by Example (PBE) is a program synthesis paradigm in which the synthesizer creates a program that matches a set of given examples. In many applications of such synthesis (e.g., program repair or reverse engineering), we are to reconstruct a program that is close to a specific target program, not merely to produce some program that satisfies the seen examples. In such settings, we wish that the synthesized program {\em generalizes} well, i.e., has as few errors as possible on the unobserved examples capturing the target function behavior. In this paper, we propose the first framework (called {\sc SynGuar}) for PBE synthesizers that guarantees to achieve low generalization error with high probability. Our main contribution is a procedure to dynamically calculate how many additional examples suffice to theoretically guarantee generalization. We show how our techniques can be used in 2 well-known synthesis approaches: PROSE and STUN (synthesis through unification), for common string-manipulation program benchmarks. We find that often a few hundred examples suffice to provably bound generalization error below $5%$ with high ($\geq 98%$) probability on these benchmarks. Further, we confirm this empirically: {\sc SynGuar} significantly improves the accuracy of existing synthesizers in generating the right target programs. But with fewer examples chosen arbitrarily, the same baseline synthesizers (without {\sc SynGuar}) {\em overfit} and lose accuracy.
Wed 25 AugDisplayed time zone: Athens change
08:00 - 09:00 | SE & AI—Machine Learning for Software Engineering 1Research Papers +12h Chair(s): Michael Pradel University of Stuttgart, Ivica Crnkovic Chalmers University of Technology | ||
08:00 10mPaper | Boosting Coverage-Based Fault Localization via Graph-Based Representation Learning Research Papers Yiling Lou Purdue University, Qihao Zhu Peking University, Jinhao Dong Peking University, Xia Li Kennesaw State University, Zeyu Sun Peking University, Dan Hao Peking University, Lu Zhang Peking University, Lingming Zhang University of Illinois at Urbana-Champaign DOI | ||
08:10 10mPaper | SynGuar: Guaranteeing Generalization in Programming by Example Research Papers Bo Wang National University of Singapore, Teodora Baluta National University of Singapore, Aashish Kolluri National University of Singapore, Prateek Saxena National University of Singapore DOI | ||
08:20 10mPaper | StateFormer: Fine-Grained Type Recovery from Binaries using Generative State Modeling Research Papers Kexin Pei Columbia University, Jonas Guan University of Toronto, Matthew Broughton Columbia University, Zhongtian Chen Columbia University, Songchen Yao Columbia University, David Williams-King Columbia University, Vikas Ummadisetty Dublin High School, Junfeng Yang Columbia University, Baishakhi Ray Columbia University, Suman Jana Columbia University DOI | ||
08:30 30mLive Q&A | Q&A (SE & AI—Machine Learning for Software Engineering 1) Research Papers |
20:00 - 21:00 | SE & AI—Machine Learning for Software Engineering 1Research Papers Chair(s): Kelly Lyons University of Toronto, Phuong T. Nguyen University of L’Aquila | ||
20:00 10mPaper | Boosting Coverage-Based Fault Localization via Graph-Based Representation Learning Research Papers Yiling Lou Purdue University, Qihao Zhu Peking University, Jinhao Dong Peking University, Xia Li Kennesaw State University, Zeyu Sun Peking University, Dan Hao Peking University, Lu Zhang Peking University, Lingming Zhang University of Illinois at Urbana-Champaign DOI | ||
20:10 10mPaper | SynGuar: Guaranteeing Generalization in Programming by Example Research Papers Bo Wang National University of Singapore, Teodora Baluta National University of Singapore, Aashish Kolluri National University of Singapore, Prateek Saxena National University of Singapore DOI | ||
20:20 10mPaper | StateFormer: Fine-Grained Type Recovery from Binaries using Generative State Modeling Research Papers Kexin Pei Columbia University, Jonas Guan University of Toronto, Matthew Broughton Columbia University, Zhongtian Chen Columbia University, Songchen Yao Columbia University, David Williams-King Columbia University, Vikas Ummadisetty Dublin High School, Junfeng Yang Columbia University, Baishakhi Ray Columbia University, Suman Jana Columbia University DOI | ||
20:30 30mLive Q&A | Q&A (SE & AI—Machine Learning for Software Engineering 1) Research Papers |