Assessing Semistructured Merge in Version Control Systems: A Replicated Experiment
Guilherme Cavalcanti,
Paola Accioly, and
Paulo Borba.
In Proceedings of the
9th International Symposium on Empirical Software Engineering and Measurement (ESEM 2015).
PREPRINT
PRESENTATION
Abstract:
Context: To reduce the integration effort arising from conflicting changes resulting from collaborative software development tasks, unstructured merge tools try to automatically solve part of the conflicts via textual similarity, whereas structured and semistructured merge tools try to go further by exploiting the syntactic structure of the involved artifacts.
Objective: In this study, aiming at increasing the existing body of evidence and assessing results for systems developed under an alternative version control paradigm, we replicate an experiment conducted by
Apel et al. to compare the unstructured and semistructured approaches with respect to the occurrence of conflicts each one reports.
Method: We used both semistructured and unstructured merge on a sample 2.5 times bigger than that of the original study regarding the number of projects and 18 times bigger regarding the number of merge scenarios, and we compared the occurrence of conflicts.
Results: Similar to the original study, we observed that semistructured merge reduces the number of conflicts in 55% of the scenarios of the new sample. However, the observed average conflict reduction of 62% in these scenarios is far higher than what has been observed before. We also bring new evidence that the use of semistructured merge can reduce the occurrence of conflicting merge scenarios by half.
Conclusions: Our findings reinforce the benefits of exploiting the syntactic structure of the artifacts involved in code integration. Besides, the reductions observed in the number and size of conflicts suggest that the use of semistructured merge, when compared to the unstructured approach, might decrease integration effort without compromising correctness.
Replication Design
The study design is composed of a
mining step, which differs from the original study since we explore DVCS repositories instead of CVCS ones; and an
execution step, which is similar to the original study, since we use the tool and scripts provided by the original authors. In particular, in the mining step, we built tools that mine DVCS repositories to collect a number of merge scenarios. Subsequently, in the execution step, we use a prototype of the semistructured approach to run the selected merge scenarios with both merge approaches, and R scripts to collect metrics on the number of textual conflicts, conflicting lines of code, conflicting files and semantic conflicts.
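As a rough illustration of the two steps, the sketch below shows one way merge scenarios could be collected from a Git repository and how textual conflicts could be counted in a merged file. It is not the tooling used in the study: the function names (`parse_merge_parents`, `collect_scenarios`, `count_conflicts`) are hypothetical, Git is assumed as the DVCS, a merge scenario is assumed to be the triple of base, left and right revisions of a two-parent merge commit, and conflicts are assumed to be delimited by Git's standard conflict markers. The study's own metric definitions may differ.

```python
import subprocess
from typing import List, NamedTuple, Tuple


class MergeScenario(NamedTuple):
    merge: str  # the merge commit itself
    left: str   # first parent
    right: str  # second parent
    base: str   # common ancestor reported by `git merge-base`


def parse_merge_parents(rev_list_output: str) -> List[Tuple[str, str, str]]:
    """Parse `git rev-list --merges --parents` output into (merge, left, right)."""
    triples = []
    for line in rev_list_output.splitlines():
        hashes = line.split()
        # Keep two-parent merges only; merges with 3+ parents are skipped.
        if len(hashes) == 3:
            triples.append((hashes[0], hashes[1], hashes[2]))
    return triples


def git(repo: str, *args: str) -> str:
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(["git", "-C", repo, *args],
                            check=True, capture_output=True, text=True)
    return result.stdout


def collect_scenarios(repo: str, rev: str = "HEAD") -> List[MergeScenario]:
    """Mining step (sketch): list merge commits and look up their base."""
    output = git(repo, "rev-list", "--merges", "--parents", rev)
    scenarios = []
    for merge, left, right in parse_merge_parents(output):
        # Raises CalledProcessError if the parents share no common ancestor.
        base = git(repo, "merge-base", left, right).strip()
        scenarios.append(MergeScenario(merge, left, right, base))
    return scenarios


def count_conflicts(merged_text: str) -> Tuple[int, int]:
    """Return (number of conflicts, conflicting lines) in a merged file.

    Assumption: marker lines themselves are not counted as conflicting lines.
    """
    conflicts = 0
    conflicting_lines = 0
    inside = False
    for line in merged_text.splitlines():
        if line.startswith("<<<<<<<"):
            inside = True
            conflicts += 1
        elif line.startswith(">>>>>>>"):
            inside = False
        elif inside and not line.startswith(("=======", "|||||||")):
            conflicting_lines += 1
    return conflicts, conflicting_lines
```

Calling `collect_scenarios("/path/to/clone")` on a local clone would yield one record per two-parent merge commit; each record's base, left and right revisions could then be checked out and fed to both merge tools, with `count_conflicts` applied to every merged file.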
Data, Scripts and Tools
Evaluation Results
--
GuilhermeCavalcanti - 2015-04-17