+1
Thanks,
Alex
On Mar 15, 2018 12:26, "Thomas Mortagne" <thomas.mortagne(a)xwiki.com>
wrote:
+1
On Thu, Mar 15, 2018 at 9:30 AM, Vincent Massol <vincent(a)massol.net> wrote:
Hi devs,
As part of the STAMP research project, we’ve developed a new tool
(Descartes, based on Pitest) to measure the quality of tests. It generates
a mutation score for your tests, which indicates how good they are.
Technically, Descartes performs some extreme mutations on the code under
test (e.g. remove the content of void methods, return true for methods
returning a boolean, etc. See
https://github.com/STAMP-project/pitest-descartes). If the tests continue
to pass, the mutant has not been killed, and the mutation score decreases.
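To make the idea concrete, here is a minimal hand-written sketch of what an extreme mutation looks like and what "killing" it means. This is not how Descartes is implemented (it mutates bytecode through Pitest); the class and method names (MutationDemo, isAdult, etc.) are made up for illustration.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration only: Descartes generates mutants automatically,
// here the mutant is written by hand to show the principle.
class MutationDemo {
    // Original code under test (made-up example).
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Extreme mutant: the whole body is replaced by "return true".
    static boolean isAdultMutant(int age) {
        return true;
    }

    // Weak test: only checks an input for which the expected answer is true.
    static boolean weakTestPasses(IntPredicate impl) {
        return impl.test(30);
    }

    // Stronger test: also checks an input for which the answer must be false.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(10);
    }

    public static void main(String[] args) {
        // The weak test still passes on the mutant: the mutant survives.
        System.out.println("weak test passes on mutant: "
            + weakTestPasses(MutationDemo::isAdultMutant));
        // The strong test fails on the mutant: the mutant is killed.
        System.out.println("strong test passes on mutant: "
            + strongTestPasses(MutationDemo::isAdultMutant));
    }
}
```

Both tests give 100% line coverage of isAdult, but only the second one detects the mutant, which is the difference the mutation score is meant to capture.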
So in short:
* Jacoco/Clover: measure how much of the code is tested
* Pitest/Descartes: measure how good the tests are
Both provide a percentage value.
I’m proposing to compute the current mutation scores for xwiki-commons
and xwiki-rendering, and to fail the build when new code is added that
reduces the mutation score below that threshold (exactly the same as our
jacoco threshold and strategy).
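As a rough sketch of the check this implies, assuming the usual definition of the mutation score as killed mutants over generated mutants, expressed as a percentage. In practice the build plugin would do this; MutationGate and its methods are hypothetical names used only for illustration.

```java
// Hypothetical sketch of a threshold gate, not the actual build configuration.
final class MutationGate {
    // Mutation score in percent: killed mutants over all generated mutants
    // (assumed definition).
    static double score(int killed, int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0 * killed / total;
    }

    // The build passes only if the score has not dropped below the threshold.
    static boolean passes(int killed, int total, double thresholdPercent) {
        return score(killed, total) >= thresholdPercent;
    }

    public static void main(String[] args) {
        double threshold = 80.0; // made-up value for the example
        System.out.println(passes(8, 10, threshold)); // score at the threshold
        System.out.println(passes(7, 10, threshold)); // new code lowered the score
    }
}
```

The threshold would be set to the score measured today, so existing code passes and only changes that lower the score break the build.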
I consider this an experiment to push the limits of software engineering
a bit further. I don’t know how well it’ll work. I propose to do the work
and test this over 2-3 months. At that time we can decide whether to keep
it (i.e. whether the gains it brings are more important than the problems
it causes).
http://massol.myxwiki.org/xwiki/bin/download/Blog/MutationTestingDescartes/report.png
Please cast your votes.
Thanks
-Vincent
--
Thomas Mortagne