Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering
Thu 26 Aug 2021 21:10 - 21:20 - Analytics & Software Evolution—Mining Software Repositories Chair(s): Phuong T. Nguyen, Venera Arnaoudova
Authorship attribution (i.e., determining who is the author of a piece of source code) is an established research topic. State-of-the-art results for the authorship attribution problem look promising for the software engineering field, where they could be applied to detect plagiarized code and prevent legal issues.
In this article, we first introduce a new language-agnostic approach to authorship attribution of source code.
Then, we discuss limitations of existing synthetic datasets for authorship attribution, and propose a data collection approach that delivers datasets that better reflect aspects important for potential practical use in software engineering.
Finally, we demonstrate that the high accuracy of authorship attribution models on existing datasets drops drastically when they are evaluated on more realistic data. We outline next steps for the design and evaluation of authorship attribution models that could bring research efforts closer to practical use in software engineering.
Thu 26 Aug | Displayed time zone: Athens
09:00 - 10:00 | Analytics & Software Evolution—Mining Software Repositories | Journal First / Research Papers | +12h | Chair(s): Juri Di Rocco (University of L'Aquila)
09:00 | 10m | Paper | Characterizing Search Activities on Stack Overflow | Research Papers | Jiakun Liu (Zhejiang University), Sebastian Baltes (University of Adelaide), Christoph Treude (University of Adelaide), David Lo (Singapore Management University), Yun Zhang (Zhejiang University City College), Xin Xia (Huawei Technologies)
09:10 | 10m | Paper | Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering | Research Papers | Egor Bogomolov (JetBrains Research; HSE University), Vladimir Kovalenko (JetBrains Research), Yurii Rebryk (HSE University), Alberto Bacchelli (University of Zurich), Timofey Bryksin (JetBrains Research; HSE University)
09:20 | 5m | Paper | Insights into Non-Merged Pull Requests in GitHub: Is there Evidence of Bias Based on Perceptible Race | Journal First | Reza Nadri (University of Waterloo), Gema Rodríguez-Pérez (University of Waterloo), Mei Nagappan (University of Waterloo)
09:25 | 5m | Paper | Automatic Recovery of Issue Type Labels | Journal First | Farida El Zanaty (McGill University), Christophe Rezk (McGill University), Sander Lijbrink (Shopify, Inc.), Willem Van Bergen (Shopify, Inc.), Mark Côté (Shopify, Inc.), Shane McIntosh (McGill University)
09:30 | 30m | Live Q&A | Q&A (Analytics & Software Evolution—Mining Software Repositories) | Research Papers
21:00 - 22:00 | Analytics & Software Evolution—Mining Software Repositories | Research Papers / Journal First | Chair(s): Phuong T. Nguyen (University of L’Aquila), Venera Arnaoudova (Washington State University)
21:00 | 10m | Paper | Characterizing Search Activities on Stack Overflow | Research Papers | Jiakun Liu (Zhejiang University), Sebastian Baltes (University of Adelaide), Christoph Treude (University of Adelaide), David Lo (Singapore Management University), Yun Zhang (Zhejiang University City College), Xin Xia (Huawei Technologies)
21:10 | 10m | Paper | Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering | Research Papers | Egor Bogomolov (JetBrains Research; HSE University), Vladimir Kovalenko (JetBrains Research), Yurii Rebryk (HSE University), Alberto Bacchelli (University of Zurich), Timofey Bryksin (JetBrains Research; HSE University)
21:20 | 5m | Paper | Insights into Non-Merged Pull Requests in GitHub: Is there Evidence of Bias Based on Perceptible Race | Journal First | Reza Nadri (University of Waterloo), Gema Rodríguez-Pérez (University of Waterloo), Mei Nagappan (University of Waterloo)
21:25 | 5m | Paper | Automatic Recovery of Issue Type Labels | Journal First | Farida El Zanaty (McGill University), Christophe Rezk (McGill University), Sander Lijbrink (Shopify, Inc.), Willem Van Bergen (Shopify, Inc.), Mark Côté (Shopify, Inc.), Shane McIntosh (McGill University)
21:30 | 30m | Live Q&A | Q&A (Analytics & Software Evolution—Mining Software Repositories) | Research Papers