ESEC/FSE 2021
Thu 19 - Sat 28 August 2021 Clowdr Platform

As code search permeates most activities in software development, code-to-code search has emerged to support using code as a query and retrieving similar code in the search results. Applications include duplicate code detection for refactoring, patch identification for program repair, and language translation. Existing code-to-code search tools rely on static similarity approaches such as the comparison of tokens and abstract syntax trees (AST) to approximate dynamic behavior, leading to low precision. Most tools do not support cross-language code-to-code search, and those that do rely on machine learning models that require labeled training data.
We present Code-to-Code Search Across Languages (COSAL), a cross-language technique that uses both static and dynamic analyses to identify similar code and does not require a machine learning model. Code snippets are ranked using non-dominated sorting based on code token similarity, structural similarity, and behavioral similarity. We empirically evaluate COSAL on two datasets of 43,146 Java and Python files and 55,499 Java files and find that 1) code search based on non-dominated ranking of static and dynamic similarity measures is more effective than single or weighted measures; and 2) COSAL has better precision and recall than state-of-the-art within-language and cross-language code-to-code search tools. We explore the potential for using COSAL on large open-source repositories and discuss scalability to more languages and similarity metrics, providing a gateway for practical, multi-language code-to-code search.
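To make the ranking idea in the abstract concrete, here is a minimal sketch of non-dominated sorting over three similarity scores. It is not the authors' implementation: the function names, the file names, and the score values are hypothetical, and it assumes each candidate already has token, structural, and behavioral similarity scores where higher means more similar to the query.

```python
from typing import Dict, List, Tuple

# Hypothetical score triple: (token, structural, behavioral) similarity.
# Higher values are assumed to mean "more similar to the query".
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is at least as good as b on every measure
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_fronts(candidates: Dict[str, Scores]) -> List[List[str]]:
    """Group candidates into fronts: front 0 holds candidates that no other
    candidate dominates, front 1 holds those dominated only by front 0, etc."""
    remaining = dict(candidates)
    fronts: List[List[str]] = []
    while remaining:
        # A candidate belongs to the current front if nothing left dominates it.
        front = sorted(
            name
            for name, score in remaining.items()
            if not any(
                dominates(other_score, score)
                for other, other_score in remaining.items()
                if other != name
            )
        )
        fronts.append(front)
        for name in front:
            del remaining[name]
    return fronts


# Made-up scores for four candidate snippets (illustration only).
candidates = {
    "a.py": (0.9, 0.8, 0.7),
    "b.java": (0.6, 0.9, 0.5),
    "c.py": (0.5, 0.5, 0.5),
    "d.java": (0.4, 0.3, 0.2),
}
fronts = non_dominated_fronts(candidates)
print(fronts)
```

In this toy input, `a.py` and `b.java` share the first front because neither is better on all three measures, which is the property that lets this kind of ranking avoid choosing weights for the individual measures.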

Wed 25 Aug

Displayed time zone: Athens

08:00 - 09:00
Analytics & Software Evolution—Code Recommendation (Journal First / Research Papers) +12h
Chair(s): Davide Di Ruscio University of L'Aquila, Saikat Chakraborty Columbia University
08:00
10m
Paper
Cross-Language Code Search using Static and Dynamic Analyses (Artifacts Available)
Research Papers
George Mathew North Carolina State University, Kathryn Stolee North Carolina State University
08:10
10m
Paper
Automating the Removal of Obsolete TODO Comments
Research Papers
Zhipeng Gao Monash University, Xin Xia Huawei Technologies, David Lo Singapore Management University, John Grundy Monash University, Thomas Zimmermann Microsoft Research
08:20
10m
Paper
Generating Question Titles for Stack Overflow from Mined Code Snippets
Journal First
Zhipeng Gao Monash University, Xin Xia Huawei Technologies, John Grundy Monash University, David Lo Singapore Management University, Yuan-Fang Li Monash University
08:30
30m
Live Q&A
Q&A (Analytics & Software Evolution—Code Recommendation)
Research Papers

20:00 - 21:00
Analytics & Software Evolution—Code Recommendation (Research Papers / Journal First)
Chair(s): Davide Di Ruscio University of L'Aquila, Saikat Chakraborty Columbia University
20:00
10m
Paper
Cross-Language Code Search using Static and Dynamic Analyses (Artifacts Available)
Research Papers
George Mathew North Carolina State University, Kathryn Stolee North Carolina State University
20:10
10m
Paper
Automating the Removal of Obsolete TODO Comments
Research Papers
Zhipeng Gao Monash University, Xin Xia Huawei Technologies, David Lo Singapore Management University, John Grundy Monash University, Thomas Zimmermann Microsoft Research
20:20
10m
Paper
Generating Question Titles for Stack Overflow from Mined Code Snippets
Journal First
Zhipeng Gao Monash University, Xin Xia Huawei Technologies, John Grundy Monash University, David Lo Singapore Management University, Yuan-Fang Li Monash University
20:30
30m
Live Q&A
Q&A (Analytics & Software Evolution—Code Recommendation)
Research Papers