When do Software Complexity Metrics Mean Nothing? – When Examined out of Context

Abstract

This paper focuses on a familiar phenomenon: code metrics such as lines of code are extremely context dependent, and their distributions differ from project to project. We apply visual inspection, as well as statistical reasoning and testing, to show that such metric values are so sensitive to context that their measurement in one project offers little predictive power regarding their measurement in another project. On the positive side, we show that context bias can be neutralized, at least for the majority of metrics that we considered, by what we call Log Normal Standardization (LNS). Concretely, the LNS transformation is obtained by shifting (subtracting the mean) and scaling (dividing by the standard deviation) the log of a metric value. Thus, we conclude that LNS-transformed metric values are to be preferred over plain values, especially when comparing modules from different projects. Conversely, the LNS transformation suggests that the “context bias” of a software project with respect to a specific metric can be summarized with two numbers: the mean of the logarithm of the metric value, and the standard deviation of that logarithm.
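
The LNS transformation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `lns_transform`, the use of the natural logarithm, the choice of the population standard deviation, and the two sample projects are all assumptions made for illustration. The sketch also assumes strictly positive metric values, since the logarithm is undefined at zero.

```python
import math
import statistics


def lns_transform(values):
    """Sketch of Log Normal Standardization for one metric in one project.

    Takes the natural log of each value, then subtracts the mean of the
    logs and divides by their standard deviation. Returns the transformed
    values together with the two summary numbers (mean and standard
    deviation of the logs) that characterize the project's context bias.

    Assumptions (not taken from the paper): natural log, population
    standard deviation, strictly positive inputs.
    """
    if any(v <= 0 for v in values):
        raise ValueError("metric values must be strictly positive")
    logs = [math.log(v) for v in values]
    mu = statistics.mean(logs)
    sigma = statistics.pstdev(logs)
    if sigma == 0:
        raise ValueError("all values are equal; cannot scale by zero")
    return [(x - mu) / sigma for x in logs], mu, sigma


# Hypothetical lines-of-code samples from two made-up projects.
# Project B's modules are 100 times larger than project A's.
project_a = [10, 20, 40, 80, 160]
project_b = [1000, 2000, 4000, 8000, 16000]

z_a, mu_a, sigma_a = lns_transform(project_a)
z_b, mu_b, sigma_b = lns_transform(project_b)
```

In this made-up example the raw values of the two projects do not overlap at all, yet the transformed values coincide up to floating-point error, because a constant size factor becomes a constant shift in log space and is removed by subtracting the mean. The pairs `(mu_a, sigma_a)` and `(mu_b, sigma_b)` are the two numbers that summarize each project's context bias for this metric.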

Keywords

Metrics

Cite as:

Joseph (Yossi) Gil, Gal Lalouche, “When do Software Complexity Metrics Mean Nothing? – When Examined out of Context”, Journal of Object Technology, Volume 15, no. 1 (February 2016), pp. 2:1-25, doi:10.5381/jot.2016.15.1.a2.