Understanding Merge Conflicts Characteristics (Website under construction)
Abstract
Previous studies show that merge conflicts frequently occur in collaborative development environments and often impair developers' productivity, since merging contributions is a demanding and tedious task. However, the structure of the changes that lead to conflicts has not yet been studied. Understanding the underlying structure of conflicts and the syntactic language elements involved might shed light on how to better avoid and resolve them. In this paper we therefore derive a catalog of conflict patterns, expressed in terms of the code changes that lead to conflicts. To assess the occurrence of such patterns in open-source systems, we conducted an empirical study that reproduces 56819 merges from 128 GitHub projects. We focus on conflicts reported by a semistructured merge tool, which avoids a large number of spurious conflicts often reported by the typical unstructured tools still used in practice. We found that most merge conflicts happen because developers independently edit the same lines of the same methods. Furthermore, we noticed that copying and pasting pieces of code, or even entire files, across different repositories is a common practice. We also discuss how our results reveal the need for new research studies and suggest potential improvements to tools that support collaborative software development.
If you have any questions please contact:
Paola Accioly - prga at cin.ufpe.br
Paulo Borba - phmb at cin.ufpe.br
Guilherme Cavalcanti - gjcc at cin.ufpe.br
Paper
Link to our paper
This is a post-peer-review, pre-copyedit version of an article published in The Empirical Software Engineering Journal. The final authenticated version is available online at: https://doi.org/10.1007/s10664-017-9586-1
Results -- Graphs and Tables (HTML generated by our R scripts)
Conflict Pattern Results Here
Normalized Conflict Results Here
Data
All data collected during our experiment is available below.
Conflict Patterns collected
Normalized Patterns collected
Tools
The tools used to run our experiment, including the mining and the execution steps, are available
here.
Sample Systems
Description of our sample
The Conflict Pattern Catalog
Below you can see one example instance of each pattern.
Different edits to the same area of the same method or constructor
Acronym:
EditSameMC
Example:
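As a rough illustration of this pattern (not the example from the study), the sketch below performs a line-by-line three-way merge of a method body and emits conflict markers where both contributions changed the same line in different ways. The class name EditSameMcSketch, the method names, and the sample method total are hypothetical, and the sketch assumes all three versions have the same number of lines, which real merge tools do not require.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: all names here are illustrative, not taken from the study's tools.
final class EditSameMcSketch {

    /**
     * Three-way merge of a method body, one line at a time.
     * Assumption: base, left and right have the same number of lines,
     * i.e. each contribution only edits lines in place.
     */
    static List<String> mergeBody(List<String> base, List<String> left, List<String> right) {
        if (base.size() != left.size() || base.size() != right.size()) {
            throw new IllegalArgumentException("sketch only handles in-place line edits");
        }
        List<String> merged = new ArrayList<>();
        for (int i = 0; i < base.size(); i++) {
            String b = base.get(i), l = left.get(i), r = right.get(i);
            if (l.equals(r)) {
                merged.add(l);                 // untouched, or both made the same edit
            } else if (l.equals(b)) {
                merged.add(r);                 // only right changed this line
            } else if (r.equals(b)) {
                merged.add(l);                 // only left changed this line
            } else {
                merged.add("<<<<<<< left");    // both changed it differently: conflict
                merged.add(l);
                merged.add("=======");
                merged.add(r);
                merged.add(">>>>>>> right");
            }
        }
        return merged;
    }

    /** True when the merged body contains at least one conflict separator. */
    static boolean hasConflict(List<String> merged) {
        return merged.contains("=======");
    }

    public static void main(String[] args) {
        List<String> base  = List.of("int total(int a, int b) {", "    return a + b;", "}");
        List<String> left  = List.of("int total(int a, int b) {", "    return a + b + 1;", "}");
        List<String> right = List.of("int total(int a, int b) {", "    return Math.addExact(a, b);", "}");
        mergeBody(base, left, right).forEach(System.out::println);
    }
}
```

Here both contributions rewrite the return statement of the same method, so the merged body keeps both variants between conflict markers. If only one side had edited that line, its edit would be taken without a conflict.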
Methods or constructors added with the same signature and different bodies
Acronym:
SameSignatureCM
Example:
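A hypothetical illustration of this pattern (not the example from the study): each version of a class is modeled as a map from method signature to body text, and a conflict is reported for every signature that is absent from the base version but was added by both contributions with different bodies. The class name SameSignatureCmSketch, the map-based model, and the sample signatures are assumptions made only for this sketch.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a class version is modeled as signature -> body text.
final class SameSignatureCmSketch {

    /**
     * Returns the signatures that both left and right added (they are absent
     * from base) with bodies that differ between the two contributions.
     */
    static Set<String> conflictingAdditions(Map<String, String> base,
                                            Map<String, String> left,
                                            Map<String, String> right) {
        Set<String> conflicts = new TreeSet<>();
        for (Map.Entry<String, String> entry : left.entrySet()) {
            String signature = entry.getKey();
            boolean addedByBoth = !base.containsKey(signature) && right.containsKey(signature);
            if (addedByBoth && !entry.getValue().equals(right.get(signature))) {
                conflicts.add(signature);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, String> base = Map.of("int size()", "return count;");
        Map<String, String> left = Map.of(
                "int size()", "return count;",
                "boolean isEmpty()", "return count == 0;");
        Map<String, String> right = Map.of(
                "int size()", "return count;",
                "boolean isEmpty()", "return size() < 1;");
        System.out.println(conflictingAdditions(base, left, right));
    }
}
```

Both contributions introduce isEmpty() independently, so neither body can be chosen automatically. Identical additions, or an addition made by only one side, are not reported by this sketch.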
Different edits to the same field declaration
Acronym:
EditSameFd
Example:
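A hypothetical illustration of this pattern (not the example from the study): the sketch treats a field declaration as a single unit and applies the usual three-way rule, so a conflict arises only when both contributions changed the declaration and ended up with different text. The class name EditSameFdSketch, the method mergeField, and the sample field are assumptions, as is the choice to compare whole declarations as strings.

```java
import java.util.Optional;

// Hypothetical sketch: a field declaration is compared as one piece of text.
final class EditSameFdSketch {

    /**
     * Three-way merge of one field declaration.
     * Returns the merged declaration, or an empty Optional when both
     * contributions edited the declaration in different ways (a conflict).
     */
    static Optional<String> mergeField(String base, String left, String right) {
        if (left.equals(right)) {
            return Optional.of(left);   // untouched, or identical edits
        }
        if (left.equals(base)) {
            return Optional.of(right);  // only right edited the field
        }
        if (right.equals(base)) {
            return Optional.of(left);   // only left edited the field
        }
        return Optional.empty();        // different edits to the same field
    }

    public static void main(String[] args) {
        String base  = "private int timeout = 30;";
        String left  = "private int timeout = 60;";   // left changes the initializer
        String right = "private long timeout = 30;";  // right changes the type
        System.out.println(mergeField(base, left, right).orElse("CONFLICT"));
    }
}
```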
Field declarations added with the same identifier and different types or modifiers
Acronym:
AddSameFd
Example:
Different edits to the modifier list of the same type declaration (class, interface, annotation or enum types)
Acronym:
ModifierList
Example:
Different edits to the same implements declaration
Acronym:
ImplementsList
Example:
Different edits to the same extends declaration
Acronym:
ExtendsList
Example:
Different edits to the same annotation method default value
Acronym:
DefaultValueA
Example:
Sample systems