SPLASH 2014
Mon 20 - Fri 24 October 2014 Portland, Oregon, United States
Wed 22 Oct 2014 11:37 - 12:00 at Salon E - Program Analysis and the Web Chair(s): Stephen Chong

Several program analysis tools, such as plagiarism detectors and bug finders, rely on knowing a piece of code’s relative semantic importance. For example, a plagiarism detector should not bother reporting two programs that share an identical simple loop counter test, but should report programs that share more distinctive code. Traditional program analysis techniques (e.g., finding data and control dependencies) are useful, but do not say how surprising or common a line of code is. Natural language processing researchers have encountered a similar problem and addressed it using an n-gram model of text frequency, derived from statistics computed over text corpora.

We propose an n-gram model for programming languages and compute it over a corpus of 2.8 million JavaScript programs that we downloaded from the Web. In contrast to previous techniques, we define a code n-gram as the subgraph of the program dependence graph that contains all nodes and edges reachable in n steps from a given statement. We can count the n-grams in a program and the frequency of each n-gram in the corpus, which lets us compute tf-idf-style measures that capture the differing importance of different lines of code. We demonstrate the power of this approach by implementing a plagiarism detector with accuracy that beats previous techniques, and a bug-finding tool that discovered over a dozen previously unknown bugs in a collection of real deployed programs.
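The idea of scoring graph-shaped n-grams by how rare they are can be illustrated with a small sketch. It is not the authors' implementation: the dependence graphs are hand-written adjacency maps, the helper names (`ngram_key`, `count_ngrams`, `tf_idf`) are hypothetical, the subgraph is summarised by its labelled edges rather than by a true canonical form, and the smoothed idf formula is an assumed choice, since the abstract does not state the exact measure.

```python
import math
from collections import Counter, deque


def ngram_key(graph, labels, start, n):
    """Hashable key for the subgraph reachable from `start` in at most n steps."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == n:
            continue  # do not expand past n steps
        for succ in graph.get(node, ()):
            if succ not in depth:
                depth[succ] = depth[node] + 1
                queue.append(succ)
    # Assumption: statements are compared by label, and the subgraph is
    # summarised as a sorted tuple of labelled edges (not a canonical form).
    edges = sorted(
        (labels[u], labels[v])
        for u, d in depth.items() if d < n
        for v in graph.get(u, ())
    )
    return (labels[start], tuple(edges))


def count_ngrams(graph, labels, n):
    """Count one n-gram per statement in a single program."""
    return Counter(ngram_key(graph, labels, s, n) for s in labels)


def tf_idf(counts, doc_freq, num_programs):
    """tf-idf-style score per n-gram; the smoothed idf is an assumed choice."""
    total = sum(counts.values())
    return {
        key: (c / total) * (math.log((1 + num_programs) / (1 + doc_freq[key])) + 1)
        for key, c in counts.items()
    }


# Toy corpus: two programs with only a loop, one with extra distinctive code.
loop = ({0: [1], 1: [2], 2: [1]}, {0: "i = 0", 1: "i < n", 2: "i++"})
distinctive = (
    {0: [1], 1: [2], 2: [1], 3: [4]},
    {0: "i = 0", 1: "i < n", 2: "i++", 3: "x = secret()", 4: "send(x)"},
)
corpus = [loop, loop, distinctive]

per_program = [count_ngrams(g, lab, n=1) for g, lab in corpus]
# Document frequency: number of programs that contain each n-gram.
doc_freq = Counter(key for counts in per_program for key in counts)
scores = tf_idf(per_program[2], doc_freq, len(corpus))
```

In this toy corpus the loop counter test appears in every program, so its n-gram scores lower in the third program than the n-gram rooted at `x = secret()`, which appears only there. That is the kind of signal a plagiarism detector could use to ignore boilerplate and report distinctive code.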

Wed 22 Oct

Displayed time zone: Tijuana, Baja California

10:30 - 12:00
Program Analysis and the Web (OOPSLA) at Salon E
Chair(s): Stephen Chong Harvard University
10:30
22m
Talk
Checking Correctness of TypeScript Interfaces for JavaScript Libraries
OOPSLA
Asger Feldthaus Aarhus University, Anders Møller Aarhus University
10:52
22m
Talk
Determinacy in Static Analysis for jQuery
OOPSLA
Esben Andreasen Aarhus University, Anders Møller Aarhus University
11:15
22m
Talk
EventBreak: Analyzing the Responsiveness of User Interfaces through Performance-Guided Test Generation
OOPSLA
Michael Pradel University of California, Berkeley, USA, Parker Schuh University of California, Berkeley, George Necula University of California, Berkeley, Koushik Sen University of California, Berkeley
11:37
22m
Talk
Using Web Corpus Statistics for Program Analysis
OOPSLA
Chun-Hung Hsiao University of Michigan, Michael Cafarella University of Michigan, Satish Narayanasamy University of Michigan