Abstract: Software Product Line (SPL) testing is considered a challenging task, mainly because of the diversity of products that can be generated from an SPL. To deal with this problem, several techniques for specifying and deriving product-specific functional test cases have been proposed. However, there is little empirical evidence of the benefits and drawbacks of these techniques. To provide this kind of evidence, we conduct studies that compare two design techniques for black-box manual tests: a generic technique that we have observed in an industrial test execution environment, and a product-specific technique whose functional test cases could be derived using any SPL technique that considers variations in functional tests. We evaluate their impact from the point of view of the test execution process, obtaining results that indicate that executing product-specific test cases is faster and generates fewer errors.
In the link below you can download all the material used to perform both experiments. This material includes the RGMS products with instructions on how to run them, the test suites used (written in Portuguese), the Testwatcher tool, and the training and dry-run material (also written in Portuguese).
Test Environment and the database installer we use
Here you can download the data that we collected during the experiments. This material includes the sheets generated by Testwatcher and the CRs reported by each subject.
Here we provide the R scripts and the data files used to run our data analysis. We also show some results and graphics that we could not present in the JUCS paper due to lack of space.
Below we show the box plot for each experiment.
After the descriptive analysis we ran the hypothesis test. Because we used the Latin square design, we created an effect model for our response variable (execution time). This model states that the response variable is the sum of the effects of the factors considered by our experiment (Latin square replica, subjects, features and technique) plus the residual. With this effect model we can run an ANOVA test to check whether the tendency observed in the descriptive graphics is statistically significant. Before running the ANOVA, however, we first needed to check whether our data satisfied the ANOVA assumptions.
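The effect model described above can be written compactly as follows. This is only a sketch of the additive structure stated in the text: the symbols, the indices and the overall-mean term are notation assumed here for illustration, not taken from the paper.

```latex
% y_{ijkl}: execution time observed for replica i, subject j, feature k, technique l
% \mu: overall mean (assumed term); \rho, \sigma, \phi, \tau: effects of
% replica, subject, feature and technique; \varepsilon: residual
y_{ijkl} = \mu + \rho_i + \sigma_j + \phi_k + \tau_l + \varepsilon_{ijkl}
```

The ANOVA then tests whether the technique effects are all zero, which corresponds to no difference between the mean execution times of the two techniques.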
First, we checked the assumption of homogeneity of variances, that is, that the variance of the data is the same across groups. Below we show the Box-Cox plot, which indicates, at the 95% confidence level, that our model residuals maintain a constant variance. We can see this because the interval that maximizes the function (the part of the curve above the 95% line) contains the value 1, which means that no transformation of the response variable is needed.
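To make the "interval contains 1" check concrete, here is a minimal Python sketch of a Box-Cox profile log-likelihood and its approximate 95% interval. It is a simplification: it uses an intercept-only model rather than the full effect model, the execution times are hypothetical values made up for illustration, and the function names are ours, not those of the analysis scripts.

```python
import math

def boxcox_loglik(y, lam):
    """Box-Cox profile log-likelihood for an intercept-only model (a simplification)."""
    n = len(y)
    if abs(lam) < 1e-12:
        z = [math.log(v) for v in y]             # limiting case lambda = 0
    else:
        z = [(v ** lam - 1.0) / lam for v in y]  # Box-Cox transform
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n    # maximum-likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in y)

def boxcox_interval(y, lambdas, cutoff=1.9207):
    """Grid values whose log-likelihood lies within `cutoff` of the maximum.

    cutoff is half the 0.95 quantile of a chi-square with 1 degree of freedom
    (3.8415 / 2), which gives an approximate 95% confidence interval.
    """
    ll = [boxcox_loglik(y, lam) for lam in lambdas]
    best = max(ll)
    inside = [lam for lam, v in zip(lambdas, ll) if v >= best - cutoff]
    return min(inside), max(inside)

# Hypothetical execution times in minutes, for illustration only.
times = [12.1, 15.3, 9.8, 14.2, 11.7, 13.5, 10.9, 16.0]
grid = [i / 10 for i in range(-20, 21)]          # lambda from -2.0 to 2.0
lo, hi = boxcox_interval(times, grid)
no_transform_needed = lo <= 1.0 <= hi
print(f"approximate 95% interval for lambda: [{lo}, {hi}]")
print("interval contains 1:", no_transform_needed)
```

If the interval contains 1, the untransformed response is compatible with the data, which is the reading applied to the Box-Cox plot above.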
The second assumption that we examined was whether the residuals follow a normal distribution. We ran the Shapiro-Wilk hypothesis test to examine this property. It tests the null hypothesis that the data set follows a normal distribution; if it yields a high p-value, we cannot reject this hypothesis. At the 5% significance level (95% confidence), we could not reject the null hypothesis in either experiment. In the first experiment the p-value was 0.1456 and in the second one it was 0.4659.
The last property that we wanted to investigate was whether our model was additive, that is, whether there was no interaction between our control factors. So we ran the Tukey Test of Additivity, which tests the null hypothesis that the model is additive. Once again, we obtained high p-values for both experiments (0.5743 in the first one and 0.7976 in the second one), hence we cannot reject the null hypothesis that our model is additive.
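The decision rule applied in the normality and additivity checks can be sketched in a few lines of Python. The p-values are the ones reported above; the dictionary layout and the function name are assumptions made for this illustration.

```python
ALPHA = 0.05  # 5% significance level, i.e. 95% confidence

# p-values reported in the text for the two assumption checks
reported = {
    ("Shapiro-Wilk normality", "experiment 1"): 0.1456,
    ("Shapiro-Wilk normality", "experiment 2"): 0.4659,
    ("Tukey additivity", "experiment 1"): 0.5743,
    ("Tukey additivity", "experiment 2"): 0.7976,
}

def rejects_null(p_value, alpha=ALPHA):
    """True when the p-value is small enough to reject the null hypothesis."""
    return p_value < alpha

for (test, experiment), p in reported.items():
    verdict = "reject" if rejects_null(p) else "cannot reject"
    print(f"{test}, {experiment}: p = {p:.4f}, {verdict} the null hypothesis")
```

For these two checks the null hypothesis is the assumption we want to keep (normal residuals, additive model), so failing to reject it is the desired outcome.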
Finally, we ran the ANOVA test to examine whether the technique factor had a significant impact on the execution time. This time the null hypothesis stated that there was no significant difference between the mean execution times achieved with the generic technique (GT) and with the product-specific technique (ST). Again we used the 5% significance level (95% confidence), and in both experiments the p-value (0.0001 in the first one and 0.0109 in the second one) allowed us to reject the null hypothesis. Our conclusion is that, within the scope of our studies, there is a significant difference between the GT and the ST mean execution times. In addition, ST showed smaller values than GT.
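As an illustration of how the technique effect is tested in this kind of design, the following Python sketch computes the ANOVA sums of squares and the F statistic for two replicated 2x2 Latin squares. The observations are hypothetical values invented for this example and are not our experimental data; the variable and function names are also ours. The actual analysis was done with the R scripts provided above.

```python
from collections import defaultdict

# Hypothetical observations: (replica, subject, feature, technique, time in minutes).
# Each subject runs each feature once and each technique once (2x2 Latin squares).
obs = [
    (1, "s1", "F1", "GT", 30.0), (1, "s1", "F2", "ST", 22.0),
    (1, "s2", "F1", "ST", 24.0), (1, "s2", "F2", "GT", 33.0),
    (2, "s3", "F1", "GT", 28.0), (2, "s3", "F2", "ST", 21.0),
    (2, "s4", "F1", "ST", 20.0), (2, "s4", "F2", "GT", 29.0),
]

def factor_ss(rows, index, grand_mean):
    """Sum of squares explained by the factor stored at position `index`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[index]].append(row[-1])
    return sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
               for v in groups.values())

times = [row[-1] for row in obs]
grand = sum(times) / len(times)
ss_total = sum((t - grand) ** 2 for t in times)

# Subjects are unique across replicas, so the subject sum of squares
# already contains the replica effect.
ss_subject = factor_ss(obs, 1, grand)
ss_feature = factor_ss(obs, 2, grand)
ss_technique = factor_ss(obs, 3, grand)
ss_resid = ss_total - ss_subject - ss_feature - ss_technique

n_subjects = len({row[1] for row in obs})
df_technique = 1                                   # two techniques
df_resid = (len(obs) - 1) - (n_subjects - 1) - 1 - df_technique
f_technique = (ss_technique / df_technique) / (ss_resid / df_resid)

print(f"SS technique = {ss_technique:.3f}, SS residual = {ss_resid:.3f}")
print(f"F({df_technique}, {df_resid}) = {f_technique:.1f}")
```

The p-value is the upper-tail probability of this F statistic under an F distribution with (1, df_resid) degrees of freedom; the Python standard library does not provide it, so a statistics package such as R or SciPy would be needed for that step.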
In case of any problem, please contact one of the following:
Attachment  Size  Date  Comment

1st_Exp_Suites.zip  81.1 K  2012-03-27 21:36
1st_exp_data.zip  193.4 K  2013-08-29 16:33
2nd_exp_data.zip  131.1 K  2013-08-29 16:59
3rddata.zip  149.0 K  2012-05-15 10:56
4th_exp_data.zip  118.2 K  2012-03-27 21:51
5th_exp_data.zip  999.4 K  2012-03-27 21:56
Data.zip  965.6 K  2012-05-15 10:19
How_to_run_RGMS.pdf  349.3 K  2012-03-01 19:40
Instructions.zip  1306.4 K  2013-08-28 17:29
TestEnvironment.zip  8199.5 K  2013-08-28 17:54
TestSuites.zip  61.9 K  2013-08-28 19:36
analysis_data.zip  2.9 K  2013-08-29 17:21
boxcoxs.png  46.0 K  2013-08-06 22:36
boxplots.png  21.6 K  2013-08-06 22:34
data_collected_2_.zip  987.9 K  2013-08-06 19:23
dotplots.png  82.5 K  2013-08-06 22:35
prga_msc_dissertation_corrected.pdf  869.2 K  2012-06-01 10:40
Data_Analysis.zip  2.0 K  2012-03-01 19:06  Data Analysis
Data_Collected.zip  17.1 K  2012-03-01 19:03  Data Collected
hsqldb2.2.5.zip  8006.2 K  2012-03-01 19:34  hsqldb
RGMS_Products.zip  8017.6 K  2012-03-01 19:37  RGMS P1 and P2
TestWatcher.zip  12.3 K  2012-03-01 19:11  TestWatcher