Clustering Natural Language Test Case Instructions as Input for Deriving Automotive Testing DSLs

By: Katharina Juhnke, Alexander Nikic, Matthias Tichy


System testing is an important quality assurance technique in automotive software development, where prototype vehicles are tested predominantly with natural language test cases. To ensure that these test cases can identify faults, the test cases themselves must be of high quality. Testing DSLs can improve the quality of the test cases. To support the smooth introduction of Testing DSLs into industry, reusing the system-specific terminology and syntax of existing natural language test case specifications is recommended. Consequently, highly similar domain-specific instructions used in the action and expected result descriptions of those specifications must be identified and clustered. This is a key activity in automating the development of Testing DSLs, as it enables the application of grammar inference approaches. However, with an average of 400 to 500 test cases per specification, this is a time-consuming task when performed manually. We present a clustering approach based on the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm to automatically cluster similar instructions. Because of the special structure of the instructions used in our industrial test case specifications, we also developed a specific distance function, which the DBSCAN algorithm requires. Additionally, we determined appropriate values for DBSCAN's parameters MinPts and Eps. Our evaluation on three industrial test case specifications shows that our approach is suitable for automatically clustering instructions from those specifications: the clusters created manually by experts and the automatically created clusters show almost perfect agreement (κ > 0.81). Furthermore, we show how to use the Multiple Sequence Alignment (MSA) approach to automatically derive grammar suggestions from the clustered instructions.
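
To make the clustering step concrete, the following is a minimal sketch of DBSCAN applied to test case instructions. It is not the authors' implementation: the abstract does not specify their distance function, so a token-based Jaccard distance is used here as a stand-in, and the example instructions, the function names, and the values chosen for Eps and MinPts are all hypothetical.

```python
import re

NOISE = -1  # label for instructions that belong to no cluster


def tokens(instruction):
    """Lowercased word/number tokens of one instruction, as a set."""
    return set(re.findall(r"[a-z0-9_]+", instruction.lower()))


def jaccard_distance(a, b):
    """Stand-in distance (assumption): 1 - |A & B| / |A | B| on token sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def dbscan(items, eps, min_pts, distance):
    """Plain DBSCAN; returns one cluster label per item (NOISE for outliers)."""
    n = len(items)
    # Eps-neighbourhood of every item; each item counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if distance(items[i], items[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: expand
    return labels


# Hypothetical instructions, not taken from the industrial specifications.
instructions = [
    "Set ignition to ON",
    "Set ignition to OFF",
    "Set ignition to START",
    "Check that warning lamp is active",
    "Check that warning lamp is inactive",
    "Check that warning lamp is blinking",
    "Wait 5 seconds",
]

# Eps and MinPts are illustrative values only.
labels = dbscan(instructions, eps=0.5, min_pts=3, distance=jaccard_distance)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

In this sketch the two groups of near-identical instructions form one cluster each, and the single unrelated instruction is labelled as noise. Each resulting cluster would then be a candidate input for a grammar inference step such as the MSA-based one the abstract mentions.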


Keywords: Automotive, Test case templates, DSL, DBSCAN, Distance function, MSA.

Cite as:

Katharina Juhnke, Alexander Nikic, Matthias Tichy, “Clustering Natural Language Test Case Instructions as Input for Deriving Automotive Testing DSLs”, Journal of Object Technology, Volume 20, no. 3 (June 2021), pp. 5:1-14, doi:10.5381/jot.2021.20.3.a5.


The JOT Journal   |   ISSN 1660-1769   |   DOI 10.5381/jot   |   AITO   |   Open Access   |    Contact