Propagating frugal user feedback through closeness of code dependencies to improve IR-based traceability recovery
Traceability recovery captures trace links between different software artifacts (e.g., requirements and code) that cover the same part of a system's functionality. These trace links provide important support for developers in software maintenance and evolution tasks. Information Retrieval (IR) is now the mainstream technique for semi-automatic approaches that recover candidate trace links based on textual similarities among artifacts. The performance of IR-based traceability recovery is evaluated by how highly relevant traces rank in the generated lists of candidate links. Unfortunately, this performance is greatly hindered by the vocabulary mismatch problem between different software artifacts. To address this issue, a growing body of enhancement strategies based on user feedback has been proposed; these adjust the calculated IR values of candidate links after the user verifies some of the links. However, the improvement brought by such strategies requires a large amount of user feedback, which can be infeasible in practice. In this paper, we propose to improve IR-based traceability recovery by propagating a small amount of user feedback through a closeness analysis of call and data dependencies in the code. Specifically, our approach first iteratively asks users to verify a small set of candidate links. The collected frugal feedback is then combined with the quantified functional similarity of each code dependency (called closeness) and the generated IR values to improve the ranking of unverified links. An empirical evaluation on nine real-world systems with three mainstream IR models shows that our approach outperforms five baseline approaches while using only a small amount of user feedback.
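To make the idea of propagating feedback along code dependencies more concrete, the following is a minimal illustrative sketch, not the formulation used in the paper. It assumes that IR values and closeness scores both lie in [0, 1], and that a verified link raises (or, if rejected, lowers) the IR value of unverified links whose code element shares a dependency with the verified one, in proportion to the closeness of that dependency. The function names, the `boost` parameter, the multiplicative update rule, and the example artifact names are all hypothetical.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]        # (requirement id, code class)
Dependency = Tuple[str, str]  # undirected pair of code classes


def propagate_feedback(
    ir_scores: Dict[Link, float],
    closeness: Dict[Dependency, float],
    feedback: Dict[Link, bool],
    boost: float = 0.5,
) -> Dict[Link, float]:
    """Adjust IR values of unverified links using verified neighbours.

    Hypothetical rule: for each verified link (req, cls), every unverified
    link (req, neighbour) where `neighbour` shares a dependency with `cls`
    is scaled by 1 + sign * boost * closeness, with sign = +1 for a
    confirmed link and -1 for a rejected one.
    """
    adjusted = dict(ir_scores)
    for (req, cls), is_correct in feedback.items():
        sign = 1.0 if is_correct else -1.0
        for (src, dst), weight in closeness.items():
            if src == cls:
                neighbour = dst
            elif dst == cls:
                neighbour = src
            else:
                continue  # dependency does not involve the verified class
            link = (req, neighbour)
            if link in adjusted and link not in feedback:
                adjusted[link] *= 1.0 + sign * boost * weight
    return adjusted


def rank_unverified(
    scores: Dict[Link, float], feedback: Dict[Link, bool]
) -> List[Link]:
    """Return unverified links sorted by descending score."""
    unverified = [link for link in scores if link not in feedback]
    return sorted(unverified, key=lambda link: scores[link], reverse=True)


# Toy data (invented for illustration only).
ir_scores = {("R1", "Cart"): 0.60, ("R1", "Order"): 0.30, ("R1", "Logger"): 0.35}
closeness = {("Cart", "Order"): 0.8}   # Cart and Order are closely coupled
feedback = {("R1", "Cart"): True}      # the user confirmed a single link

adjusted = propagate_feedback(ir_scores, closeness, feedback)
ranking = rank_unverified(adjusted, feedback)
```

In this toy setting, a single confirmed link is enough to move `("R1", "Order")` above `("R1", "Logger")`, because `Order` is closely coupled to the confirmed class `Cart` while `Logger` has no dependency on it. This is the kind of effect that makes a small amount of feedback go further.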