On 05/31/2011 04:54 PM, Vincent Massol wrote:
Hi devs,
Here's the situation:
* We've been having a hard time releasing on time and the main issue is test
stability. We lag by at least a week and we even release with failing tests, causing
regressions.
* It's not the role of the release manager to fix tests before releasing.
* It's not normal that some people spend their time fixing issues caused by others while
those others simply carry on with their own work. Everyone needs to help.
Here's what I propose as a drastic and temporary measure till we get better:
1) It's forbidden to commit anything till all tests are passing (unit AND functional
tests), unless the commit is about fixing tests. In the exceptional case where a
committer absolutely needs to commit even though tests are failing, he needs to ask for
permission explicitly.
-0. Running all tests takes a lot of time; we can't wait 2 hours after
each change to check that it works before committing.
-1 for forcing everyone to stop their work when somebody else broke some
tests.
+1 for a different strategy: Whoever commits something should check the
next build, and if something fails, he should stop what he's doing and
work on fixing the problem. The build should not stay broken for more
than 6 working hours.
2) When tests are failing, everyone should stop what
they're doing and help stabilize again. We synchronize on IRC.
+1, but for this to work we have to reach a stable state by default,
meaning that a failed build should be a rare exception.
3) Flickering tests can be marked as @Ignore and a
JIRA issue created to stabilize the build.
+1
4) The Release Manager creates a release branch 1 week
before the release to let everyone stabilize the build.
+0, the one-week period should not be strict. If all tests are passing, then
there's no need for such a long stabilization period. If releases should
happen on Monday, then the branch could be created Friday morning.
In the long term we need to improve our CI so that functional test builds finish faster.
One idea is: more agents, with the functional tests spread across several agents.
+1.
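To make the idea of spreading functional tests over several agents a bit more concrete,
here is a rough sketch in Java. It is only an illustration, not a description of how our
CI is set up: the suite names and durations are made up, and the greedy "longest suite
first, to the least-loaded agent" rule is just one simple way to balance the work.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AgentPartition {

    /**
     * Greedy split: take suites from longest to shortest and give each one
     * to the agent with the smallest total duration so far.
     */
    static List<List<String>> partition(Map<String, Integer> suiteMinutes, int agents) {
        if (agents < 1) {
            throw new IllegalArgumentException("need at least one agent");
        }
        List<List<String>> plan = new ArrayList<>();
        for (int i = 0; i < agents; i++) {
            plan.add(new ArrayList<>());
        }
        int[] load = new int[agents];

        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(suiteMinutes.entrySet());
        // Longest first; List.sort is stable, so ties keep the map's iteration order.
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        for (Map.Entry<String, Integer> suite : sorted) {
            // Pick the agent with the least work assigned so far.
            int target = 0;
            for (int i = 1; i < agents; i++) {
                if (load[i] < load[target]) {
                    target = i;
                }
            }
            plan.get(target).add(suite.getKey());
            load[target] += suite.getValue();
        }
        return plan;
    }

    public static void main(String[] args) {
        // Hypothetical suites and durations in minutes, for illustration only.
        Map<String, Integer> suites = new LinkedHashMap<>();
        suites.put("ui-tests", 60);
        suites.put("rest-tests", 40);
        suites.put("ldap-tests", 30);
        suites.put("storage-tests", 20);
        System.out.println(partition(suites, 2));
    }
}
```

With these made-up numbers, two agents would finish in 80 minutes instead of the 150
minutes a single agent needs to run the same suites one after the other.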
Here's my +1 to apply this now for master
(3.2-SNAPSHOT leading to 3.2M1), which means not committing anything more till we have all
functional tests passing.
Thanks
-Vincent
--
Sergiu Dumitriu
http://purl.org/net/sergiu/