Hi devs,
As part of the STAMP research project, we’ve developed a new tool (Descartes, based on
Pitest) to measure the quality of tests. It generates a mutation score for your tests,
indicating how good the tests are. Technically, Descartes performs some extreme mutations
on the code under test (e.g. removing the content of void methods, returning true for
methods returning a boolean, etc. - see
https://github.com/STAMP-project/pitest-descartes). If the
tests continue to pass, it means they don’t kill the mutant, and thus the mutation
score decreases.
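To make the idea concrete, here’s a small hypothetical sketch (the class and tests are illustrative, not taken from xwiki code) of the void-method mutation: Descartes would empty the body of deposit(), and only a test that actually inspects the resulting state kills that mutant.

```java
// Hypothetical example of Descartes' "empty void method" extreme mutation.
class Account {
    private int balance;

    void deposit(int amount) {
        // Descartes' mutant would replace this body with: { }
        balance += amount;
    }

    int getBalance() {
        return balance;
    }
}

public class Demo {
    // A weak test: it calls deposit() but never checks the balance,
    // so it would still pass against the emptied mutant (mutant survives,
    // mutation score drops).
    static boolean weakTest(Account account) {
        account.deposit(10);
        return true;
    }

    // A stronger test: against the emptied mutant the balance stays 0,
    // so this check fails and the mutant is killed.
    static boolean strongTest(Account account) {
        account.deposit(10);
        return account.getBalance() == 10;
    }

    public static void main(String[] args) {
        System.out.println(weakTest(new Account()));   // prints true
        System.out.println(strongTest(new Account())); // prints true on the original code
    }
}
```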
So in short:
* Jacoco/Clover: measure how much of the code is tested
* Pitest/Descartes: measure how good the tests are
Both provide a percentage value.
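As a sketch of the arithmetic behind that percentage (the numbers and the 60% threshold below are made up for illustration): the mutation score is the share of generated mutants that the test suite detects (kills), and the build-failure idea is simply a comparison against a threshold.

```java
public class MutationScore {
    // Mutation score: percentage of generated mutants killed by the tests.
    static double score(int killedMutants, int totalMutants) {
        return 100.0 * killedMutants / totalMutants;
    }

    public static void main(String[] args) {
        // e.g. 70 of 100 mutants killed -> 70.0%
        double current = score(70, 100);
        System.out.println(current); // prints 70.0

        // Threshold strategy (same spirit as the jacoco one): fail the
        // build when the score drops below the recorded threshold.
        double threshold = 60.0;
        System.out.println(current >= threshold); // prints true
    }
}
```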
I’m proposing to compute the current mutation scores for xwiki-commons and xwiki-rendering
and fail the build when new code is added that lowers the mutation score below the
recorded threshold (exactly the same strategy as our jacoco threshold).
I consider this an experiment to push the limits of software engineering a bit further.
I don’t know how well it’ll work. I propose to do the work and test this for
2-3 months, and at that point we can decide whether to keep it (i.e. whether the gains
it brings are more important than the problems it causes).
Here’s my +1 to try this out.
Some links:
* pitest:
http://pitest.org/
* descartes:
https://github.com/STAMP-project/pitest-descartes
* http://massol.myxwiki.org/xwiki/bin/view/Blog/ControllingTestQuality
* http://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
If you’re curious, you can see a screenshot of a mutation score report at
http://massol.myxwiki.org/xwiki/bin/download/Blog/MutationTestingDescartes/…
Please cast your votes.
Thanks
-Vincent