XCorpus – An executable Corpus of Java Programs
By: Jens Dietrich, Henrik Schole, Li Sui, Ewan Tempero
Abstract
Empirical studies on code require standardized data sets of significant size, extracted from real-world programs, in order to be reproducible and generalisable. We argue that there is a need for such data sets that are executable and can therefore be used in experiments that use static and dynamic analysis. A harness for such a data set should have high coverage in order to facilitate the construction of comprehensive models of program execution. We present XCorpus, a set of 76 executable, real-world Java programs, including a subset of 70 programs from the Qualitas Corpus. XCorpus uses a harness that combines built-in and generated test cases, resulting in branch coverage that is significantly better than what is available from DaCapo.
Keywords
data set, benchmark, Java, empirical study, program analysis, test case generation, test coverage, dynamic program analysis
Cite as:
Jens Dietrich, Henrik Schole, Li Sui, Ewan Tempero, “XCorpus – An executable Corpus of Java Programs”, Journal of Object Technology, Volume 16, no. 4 (August 2017), pp. 1:1-24, doi:10.5381/jot.2017.16.4.a1.