Vincent,
I thought that your proposal was about enforcing the rules you explained
in your previous thread, not mainly about keeping the CI system working
properly for its purpose, which is to check the stability and quality of
XWiki.
This seems really different to me. From our recent talk on IRC, I
understand that those working on these maintenance tasks would like to
share the job. But this does not seem to me like a weekly job that anyone
can undertake easily.
I should admit that I do not have enough knowledge of the CI system
currently, not because I do not want to, but because time is lacking. This
is also why I have done very little work recently, since I need to invest
more time in the test part of XWiki to be able to work on something more
important. Is there any document sharing your past experience of the build
system that would help the Build Manager do his job? I feel completely
incompetent for such a job currently.
For that part of the job, I am fully +1: we need a well-maintained CI
system.
For the part about hassling others to fix their last commit, this should
simply not happen, and this is the reason for my previous mail. Enforcing
the rules should simply be the job of all of us once we have voted for
them. As for tests that start flickering due to changes in the environment
or whatever, this is like any other issue: it should be added to JIRA,
probably with a high priority. The release manager could then track down
the person responsible and ask for help if these were not fixed or were
ignored for the release.
I hope I have expressed my opinion better, and sorry for my lack of
detailed knowledge of the CI system.
Denis
On Thu, Jun 9, 2011 at 14:27, Vincent Massol <vincent(a)massol.net> wrote:
Hi Denis,
On Jun 9, 2011, at 2:11 PM, Denis Gervalle wrote:
Hi committers,
Even if I completely understand and agree with the goal pursued, I
really dislike this way of solving it. It is the responsibility of all
of us to keep the build stable at all times. It should be the concern
of every recent committer to check whether their recent commit breaks
the CI, and to fix it ASAP when needed.
Nobody has suggested changing this and this is still the rule.
If someone is not ready to do so
in the hours following their commit, they should simply refrain
from making that commit until they can follow it into the CI. I always
follow these rules (even if most of the time, I commit only stuff that I
have in production on my side), and I should admit that I would have
committed more without them. But this is the necessary trade-off between
evolvability and stability.
Again this has always been the rule and will continue to be the rule.
Having someone enforce such rules is admitting that some of us
need someone else to remind them of the best practices.
Not at all. The reason you don't see the need is because you've not been
active in helping to maintain our build I guess. I'll list some build
issues that happen frequently:
* Agents stop working. For example, this morning there were 2 jobs stuck
because of a failure to start FF. This happens from time to time. Someone
needs to investigate and at the very least kill the job to free the agent.
* New versions of Jenkins fixing bugs. Someone needs to check and upgrade
the build when a new version has interesting stuff for us and when it fixes
some of our issues.
* The most important ones: flickering tests. I can tell for sure that you
haven't written UI tests or you'd have introduced flickering tests for sure
:) I have myself introduced several. Of course we always test locally before
committing and I even check that it works on Jenkins. But flickering tests
don't fail immediately: they'll run fine for 50 or 100 iterations and
suddenly start to fail. It's not always easy to find the culprit for a
flickering test.
* Various issues with filesystem locks, permissions on agents, memory
settings, etc that make the build fail
BTW there are also other build tasks such as:
* Upgrade versions of the Maven plugins we use when there are new versions
* Fix TODOs in pom.xml which are workarounds waiting for Maven issues to be
fixed
You seem to think these are done automagically. The reason you think this
is because people like Thomas or myself (and even others but I think Thomas
and I have been the most active on this) are doing them. Personally it's not
my role to be the Build Manager and I'm fed up with doing it. I want to
share the workload with everyone.
In order to address these problems you cannot just rely on the good will of
everyone to work on them. You need someone responsible. Rather than having a
single person responsible all the time, I'm proposing that committers take
turns doing that.
For me, this is not the philosophy behind an open-source project, where
everyone should do their best for the wellness of the project.
Ok so tell me the last time you helped fix failing builds? :) IMO it's a
long time ago which means you don't do the best for the wellness of the
project (according to what you said) ... :)
(Surely you agree that the specific issues I've listed above are nobody's
fault).
So I cannot accept that there
is a real need for this, and I really hope that every one of us will
understand the need to move their cursor towards the stability of the
build.
That's exactly what I'm trying to achieve. Ideally everyone would take care
of the build but that doesn't work. Right now the goal is to create a
task force to stabilize the build again, i.e. make sure it's stable without
any false positives. Again the goal of the Build Manager is NOT to fix all
issues himself/herself but to ping others to help/fix the issues and in
general to increase the awareness of having a stable build.
So please, guys, take your responsibilities without the need for a build
policeman.
It hasn't worked. Hence this new proposal. At some point we may find that
the build manager has nothing to do anymore which will be great. When that
happens we'll be able to remove that role.
Sorry, but I am -1 on playing the policeman (but if I need to, I will do
my duty), and I vote -0, just because I do not consider myself active
enough to veto.
Let's hope I've provided more info on why we need a build manager for the
time being ;)
Thanks
-Vincent
Denis
On Thursday, June 9, 2011, Thomas Mortagne <thomas.mortagne(a)xwiki.com> wrote:
> On Thu, Jun 9, 2011 at 09:20, Vincent Massol <vincent(a)massol.net> wrote:
>>
>> On Jun 9, 2011, at 9:08 AM, Thomas Mortagne wrote:
>>
>>> On Wed, Jun 8, 2011 at 19:40, Vincent Massol <vincent(a)massol.net> wrote:
>>> Hi committers,
>>>
>>> We're having a hard time stabilizing our build (especially the
>>> functional test part, see my previous mail entitled "[VOTE] Important:
>>> Strategy to fix failing tests and stability"). Now I believe that it's
>>> going to be hard to enforce it and thus I'd like to propose a variation:
>>>
>>> * The Build Manager has the *responsibility* to get the build fixed
>>> ASAP whenever it's failing. His priority #1 during the week becomes
>>> monitoring the Build
>>> * By "Build" we mean the CI Build on ci.xwiki.org and by "failing" we
>>> mean anything that makes the build fail: tests, compilation, clirr, etc.
>>> * Every week we have a different Build Manager chosen amongst the
>>> Committers
>>
>> A week seems a bit short but on the other hand it will seem pretty
>> long for the Build Manager himself I'm sure ;)
>>
>>> * In order to fix build issues the Build Manager has several
>>> possibilities:
>>> - find out who caused the build to break and ask that person to fix
>>> it. That person cannot refuse that and must consider it his/her
>>> priority to fix it (or roll back the change that caused the build to
>>> fail)
>>> - roll back the change that caused the build to fail
>>> - fix it himself/herself
>>> - find someone knowledgeable in the failing domain and get him/her to
>>> fix the build.
>>> * At the end of the week the Build Manager hands over his duty to the
>>> next Build Manager by contacting him/her.
>>> * We create a Build Manager Roster page on dev.xwiki.org to log past
>>> Build Managers (and possibly future ones if some have expressed the
>>> wish to be the Build Manager for a specific week).
>>> * All committers must perform this duty and take turns
>>>
>>> Since I've started doing this this week, I propose to take this role
>>> for the current week. I'm also proposing to log Caleb as having been the
>>> Build Manager for the past week since he's done a lot to stabilize the
>>> build.
>>>
>>> If the vote is passed I'll log this on the Committership page as a
>>> Committer duty (I'll also cross reference it from the Build page).
>>>
>>> Here's my +1
>>
>> +1
>>
>> What do you think about designating the people who broke the build the
>> most for the following week?
>
> An interesting idea...
>
> However:
> 1) it's hard for flickering tests to find out the culprit
> 2) it's not so much a problem of breaking the build often, it's more a
> problem of not fixing it immediately when broken
>
> Sure, my real proposal was actually "designate the people most painful
> for the Build Manager as build manager" but I wanted to find a better
> metric :)
>
>>
>> However I agree that in the Roster we could log information for the
>> past week about who broke the build, how many flickers were fixed, etc
>>
>> Thanks
>> -Vincent
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs