A Parallel Algorithm for API Mining of C# Code

Authors

  • Frank Herrmann, Ostbayerische Technische Hochschule Regensburg, Innovationszentrum für Produktionslogistik und Fabrikplanung
  • Robert Horkovics-Kovats, M. Sc., Capgemini SE, Bahnhofstraße 30, 90402 Nürnberg
  • Dr. Eldar Sultanow, Capgemini SE, Bahnhofstraße 30, 90402 Nürnberg

DOI:

https://doi.org/10.26034/lu.akwi.2022.3454

Keywords:

Parallel algorithm, API mining, C# code

Abstract

Conformance analysis is a static code analysis (SCA) technique for software quality assurance. Its core problem is that the tools do not learn automatically from errors that have already occurred. To address this, the present work evaluated machine learning (ML): it applied a scientifically sound and practically proven unsupervised learning approach and analyzed the results. The evaluation showed that applying the approach to a different programming language requires only a language-specific API mining tool. Such a tool scans lines of code in parallel and normalizes them for machine learning. The system was implemented for C#, since many industrial projects are developed in that language. For functional validation, a case study showed that the learned rules had a positive effect on software quality. Specifically, the maintenance overhead of a code smell in an example project was reduced by a factor of 30 by extracting a learned association into a shared method. The runtime of the algorithm was evaluated empirically on eight open-source repositories; parallelization can be expected to improve runtime by 45.16% on average. The application also revealed limitations: many associations are useless, rule evaluation depends on a subjective factor, and the economics of the tool are therefore not transparent. Nevertheless, this work demonstrates that an ML-based SCA tool is feasible as a complementary quality assurance measure in software engineering.
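
The pipeline outlined in the abstract (scan code in parallel, normalize the API calls, learn associations) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a regular expression is an adequate stand-in for a real C# parser, that normalization means reducing each call to its method name, and that counting frequent call pairs stands in for the unsupervised learning step. All names (`CALL_PATTERN`, `mine_calls`, `mine_parallel`, `frequent_pairs`, `SAMPLES`) are hypothetical.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Assumption: a method call looks like ".Name(" in the C# source text.
# A real tool would use a proper C# parser instead of a regex.
CALL_PATTERN = re.compile(r"\.([A-Za-z_]\w*)\s*\(")

# Hypothetical C# method bodies, one string per method.
SAMPLES = [
    "var r = File.OpenText(path); Console.WriteLine(r.ReadLine()); r.Dispose();",
    "var r = File.OpenText(p); r.ReadLine(); r.Dispose();",
    "Console.WriteLine(text);",
]

def mine_calls(source):
    """Normalize one method body to the set of method names it calls."""
    return frozenset(CALL_PATTERN.findall(source))

def mine_parallel(sources, workers=4):
    """Mine every method body concurrently; result order matches the input."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(mine_calls, sources))

def frequent_pairs(transactions, min_support):
    """Count how often two calls occur in the same method body."""
    counts = Counter()
    for calls in transactions:
        counts.update(combinations(sorted(calls), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

if __name__ == "__main__":
    transactions = mine_parallel(SAMPLES, workers=2)
    for pair, support in sorted(frequent_pairs(transactions, 2).items()):
        print(pair, support)
```

A pair such as `OpenText` and `Dispose` that recurs across method bodies is the kind of association that could be extracted into a shared method. The thread pool only illustrates the structure: in CPython, threads do not speed up CPU-bound work, so a process pool would be the natural substitute for real workloads.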

Published

2022-12-24

Section

Practice