ESEC/FSE 2021
Thu 19 - Sat 28 August 2021 Clowdr Platform

To solve programming issues, developers commonly search on Stack Overflow to seek potential solutions. However, there is a gap between the knowledge developers are interested in and the knowledge they are able to retrieve using search engines. To help developers efficiently retrieve relevant knowledge on Stack Overflow, prior studies proposed several techniques to reformulate queries and generate summarized answers. However, few studies performed a large-scale analysis using real-world search logs. In this paper, we characterize how developers search on Stack Overflow using such logs. By doing so, we identify the challenges developers face when searching on Stack Overflow and seek opportunities for the platform and researchers to help developers efficiently retrieve knowledge.
To characterize search activities on Stack Overflow, we use search log data based on requests to Stack Overflow's web servers. We find that the most common search activity is reformulating the immediately preceding query. Related work looked into query reformulations on generic search engines and found 13 types of query reformulation strategies. Compared to their results, we observe that 71.78% of the reformulations fit one of those strategies. In terms of how queries are structured, 17.41% of the search sessions only search for fragments of source code artifacts (e.g., class and method names) without specifying the names of programming languages, libraries, or frameworks. By analyzing the search log of Stack Overflow, we observe that developers use queries with different content for different types of search intentions. Based on our findings, we provide actionable suggestions for Stack Overflow moderators and outline directions for future research. For example, we encourage Stack Overflow to set up a database of the relations between the programming terms shared on Stack Overflow, e.g., method names, data structure names, design patterns, and IDE names. With such a database, Stack Overflow could improve the performance of search engines by considering related programming terms at different levels of granularity.
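As a rough illustration of the terminology database suggested above, the following Python sketch expands a query that names only a code artifact with related terms at a coarser level of granularity. It is a minimal sketch under stated assumptions, not the method from the paper: the table `TERM_RELATIONS`, its entries, and the function `expand_query` are all hypothetical names introduced here for illustration.

```python
from typing import Dict, List, Set

# Hypothetical relation table (an assumption, not data from the paper):
# each fine-grained term maps to broader related terms.
TERM_RELATIONS: Dict[str, Set[str]] = {
    "hashmap": {"java", "data structure"},
    "dataframe.merge": {"python", "pandas"},
    "singleton": {"design pattern"},
}


def expand_query(query: str,
                 relations: Dict[str, Set[str]] = TERM_RELATIONS) -> List[str]:
    """Return the query tokens followed by related terms not already present."""
    tokens = query.lower().split()
    seen = set(tokens)
    expanded = list(tokens)
    for token in tokens:
        # Sort the related terms so the output order is deterministic.
        for related in sorted(relations.get(token, ())):
            if related not in seen:
                seen.add(related)
                expanded.append(related)
    return expanded
```

For a query such as "HashMap put", which names a class and a method but no language, the sketch would append the broader terms stored for "hashmap". A real system would need a much larger, curated relation table and a ranking step, both of which are outside this sketch.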

Thu 26 Aug

Displayed time zone: Athens

09:00 - 10:00
Analytics & Software Evolution—Mining Software Repositories
Journal First / Research Papers
Chair(s): Juri Di Rocco University of L'Aquila
09:00
10m
Paper
Characterizing Search Activities on Stack Overflow
Research Papers
Jiakun Liu Zhejiang University, Sebastian Baltes University of Adelaide, Christoph Treude University of Adelaide, David Lo Singapore Management University, Yun Zhang Zhejiang University City College, Xin Xia Huawei Technologies
09:10
10m
Paper
Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering
Research Papers
Egor Bogomolov JetBrains Research; HSE University, Vladimir Kovalenko JetBrains Research, Yurii Rebryk HSE University, Alberto Bacchelli University of Zurich, Timofey Bryksin JetBrains Research; HSE University
09:20
5m
Paper
Insights into Non-Merged Pull Requests in GitHub: Is there Evidence of Bias Based on Perceptible Race
Journal First
Reza Nadri University of Waterloo, Gema Rodríguez-Pérez University of Waterloo, Mei Nagappan University of Waterloo
09:25
5m
Paper
Automatic Recovery of Issue Type Labels
Journal First
Farida El Zanaty McGill University, Christophe Rezk McGill University, Sander Lijbrink Shopify, Inc., Willem Van Bergen Shopify, Inc., Mark Côté Shopify, Inc., Shane McIntosh McGill University
09:30
30m
Live Q&A
Q&A (Analytics & Software Evolution—Mining Software Repositories)
Research Papers

21:00 - 22:00
Analytics & Software Evolution—Mining Software Repositories
Research Papers / Journal First
Chair(s): Phuong T. Nguyen University of L’Aquila, Venera Arnaoudova Washington State University
21:00
10m
Paper
Characterizing Search Activities on Stack Overflow
Research Papers
Jiakun Liu Zhejiang University, Sebastian Baltes University of Adelaide, Christoph Treude University of Adelaide, David Lo Singapore Management University, Yun Zhang Zhejiang University City College, Xin Xia Huawei Technologies
21:10
10m
Paper
Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering
Research Papers
Egor Bogomolov JetBrains Research; HSE University, Vladimir Kovalenko JetBrains Research, Yurii Rebryk HSE University, Alberto Bacchelli University of Zurich, Timofey Bryksin JetBrains Research; HSE University
21:20
5m
Paper
Insights into Non-Merged Pull Requests in GitHub: Is there Evidence of Bias Based on Perceptible Race
Journal First
Reza Nadri University of Waterloo, Gema Rodríguez-Pérez University of Waterloo, Mei Nagappan University of Waterloo
21:25
5m
Paper
Automatic Recovery of Issue Type Labels
Journal First
Farida El Zanaty McGill University, Christophe Rezk McGill University, Sander Lijbrink Shopify, Inc., Willem Van Bergen Shopify, Inc., Mark Côté Shopify, Inc., Shane McIntosh McGill University
21:30
30m
Live Q&A
Q&A (Analytics & Software Evolution—Mining Software Repositories)
Research Papers