On Apr 2, 2011, at 7:31 PM, Sergiu Dumitriu wrote:
Hi devs,
Since we're moving to git / GitHub, it's time to re-evaluate the
development / git usage strategy.
= Short version:
1/ Where do developers commit:
A. Always in the master branch
B. In feature branches in the official repo, merged into the master
branch when ready
C. In their personal forks, requesting pulls in the official repo with
a mandatory code review from another developer
I've heard people do C but I don't like it too much, for the following
reasons:
* It means finding a developer to review every time. IMO either that
will be very hard to do, or the developer will simply approve the
change without really reviewing the code
* It means the master branch will be slow to get the latest changes,
so our CI tests won't run on them as often (only close to release
dates), which is bad and will make code integration harder
* Side note: Even though Git encourages developers to keep their work
local without sharing it, we should **NOT** do this. Developers
should continue to commit every day and not keep code on their
machines, since keeping code local prevents code integration and code
reviews. It's easy to casually review 15 lines of code; it's very
hard to review 300 lines of code!
* Since mails (and diff mails) will continue to be sent, code reviews
will still be done as before. This is lazy code reviewing and is IMO
much better than forced code reviews, which I don't see working.
Forced reviews would be a good strategy if we were building software
to send a man to Mars, for example, but for XWiki I favor a lazy code
reviewing approach (what we've been doing).
2/ How to move code from development to release:
A. Commit and release from the master branch
B. Develop in feature branches, merge them in the master when ready,
release from master
C. Develop in feature branches, merge them in a development branch
(master) for polishing, merge them in a release branch when done-done
D. Develop in feature branches, merge them often in the development
branch (master) for snapshot testing, move into the stabilization
(pre-release) branch for polishing, move into the release branch when
done-done.
E. Develop in feature branches, merge them in a development branch for
snapshot testing, move into a stabilization branch for the Next
release, which becomes the Current release branch after the previous
release is done, and which becomes a Maintenance branch after the new
release is performed. Only bugfixes go in the Current release branch.
I vote 1C and 2D.
I vote for doing
http://nvie.com/posts/a-successful-git-branching-model/
= Long version
The common practice with Subversion is to have as few branches as
possible, usually a trunk and a few maintenance branches, or
development+stable+maintenance. This is a consequence of the perceived
difficulty of merging changes between branches in svn, and the high
cost of keeping multiple branches checked out.
On the other hand, the git philosophy is to use branches as much as
possible. Two core elements are "feature branches" and "forks".
A feature branch is a branch where one feature is being developed,
separated from the trunk and all the other features. While working on
it, the developer "rebases" the branch on top of the trunk to keep his
branch up to date with the trunk, and at the end "merges" the feature
branch into the trunk. This way in-development features are kept out
of the main trunk, while changes can still be committed someplace
public (no local uncommitted code anymore).
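A minimal sketch of that feature-branch lifecycle, written as the git commands a developer might run. The branch name `feature-x`, the remote name `origin` and the helper `feature_branch_commands` are illustrative assumptions, not part of the proposal. The function only builds the command list; it does not run git.

```python
from typing import List


def feature_branch_commands(feature: str, trunk: str = "master",
                            remote: str = "origin") -> List[List[str]]:
    """Return the git commands for one feature-branch lifecycle.

    Hypothetical helper: it only builds argument lists, it never runs git.
    """
    return [
        # Start the feature branch from the current trunk.
        ["git", "checkout", "-b", feature, trunk],
        # Publish the branch so the work does not live only on one machine.
        ["git", "push", "-u", remote, feature],
        # Keep the feature branch up to date: replay it on top of the trunk.
        ["git", "fetch", remote],
        ["git", "rebase", f"{remote}/{trunk}"],
        # When the feature is ready, update the local trunk and merge into it.
        ["git", "checkout", trunk],
        ["git", "merge", "--ff-only", f"{remote}/{trunk}"],
        ["git", "merge", "--no-ff", feature],
        # Publish the merged trunk.
        ["git", "push", remote, trunk],
    ]


if __name__ == "__main__":
    for cmd in feature_branch_commands("feature-x"):
        print(" ".join(cmd))
```

Note that rebasing a branch that was already pushed rewrites its history, so publishing it again needs a forced push. A team that shares feature branches may prefer to merge the trunk into the branch instead.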
A central element of GitHub is the ability to "fork" a repository.
This means that a user clones a project in a personal repository where
he can commit changes. He can later ask the maintainer of the original
to "pull" those changes back into the original repository. This is the
preferred way of contributing patches on GitHub.
== Commit/Development-related strategies
A. One central repository, one trunk (subversion-like)
Developers clone the official trunks repository, prepare commits
locally, then push back to the official repository. It's the same
strategy that we're using now, except that we can also have an offline
local repository.
B. One central repository, feature branches
Developers clone the official trunks repository, prepare commits
locally in feature branches, then push back to the official repository
in feature branches as well. When a feature is considered stable, it
is merged into the master branch. Small bugfixes and improvements go
directly in the master branch.
B1. Also use specific helper branches
Security fixes also go into a "security" branch so that users can
cherry-pick them into older tags to build a custom patched version.
Retired features can go into a "retired" branch so that users can
re-include that feature in a custom build if they need it.
C. One aggregated repository, pulling from developers
Developers fork the official repositories, work on their fork (in
feature branches as well), then make pull requests for integrating
their work into the trunk. The rule would be that another developer
has to do the pull after a code review (mandatory code reviews). This
means more bugs spotted before committing, but also more work/time
needed from the committers.
We can relax the rule so that obvious bugfixes can be pulled by the
same developer making the pull request.
Personally, I prefer C, since it ensures better quality: at least two
pairs of eyes see each line of code.
I don't like it much because it means less integration, and
integration at the last minute. I prefer 15 pairs of eyes to 2 pairs
of eyes.
I like the idea of having a "develop" branch for the future release,
to which all developers push. They can have temporary feature branches
but these need to be as temporary as possible, and pushed asap to the
"develop" branch. The idea is that of CI, which has to be done ASAP.
This is really really key.
== Integration/Release-related strategies
Currently, we're developing on the trunk, and we're releasing from it
during short breaks from live development.
Except for RCs
This is highly dangerous, and imposes a certain rhythm, with fast
bursts of development right after a release, and imposed slowdown as
the next release approaches (no work on new features after the last
milestone).
Actually it's highly dangerous only if developers don't provide tests
(which is what is happening right now in lots of cases for different
reasons). This is what we need to fight. Providing isolation has never
been a solution. It's quite the opposite: it's bad, and it costs way
more when you need to integrate your work.
It's funny how with git we seem to forget all the development best
practices we've learnt over the years. I find myself going back to
2000, that was the last time I had this kind of discussion ;)
Short releases from a development branch are in line with agile
development, but personally I find it too dangerous.
Most big projects always keep the main development at least one branch
away from the release branch.
One example is the Linux kernel. While a kernel release lasts about 3
months, like our own releases, almost all of the code that goes into a
release has been developed before the merge window opens. This means
that after a kernel version is released, Linus opens a two-week merge
window during which he accepts pull requests for existing, working,
complete code. The next ~10 weeks are spent testing the new kernel and
integrating bugfixes, while developers prepare the features for the
next kernel version. This ensures that a released kernel has as few
bugs as possible. They can afford to do that since there are hundreds
or thousands of contributors. Still, this is entirely opposite to our
way of working: after a release we barely start writing the code to go
in the new release, and we get code in at the last minute (especially
me).
I definitely don't like to be compared to the Linux kernel and
wouldn't like to be like them. I also think the comparison is not
correct.
I really dislike this way of working, pushing integration to the end.
I'm against working in this manner.
Another example that I'd like to present is the new proposed strategy
for Mozilla Firefox:
http://mozilla.github.com/process-releases/draft/development_overview/
Basically, they propose using 4 branches, from development to release,
where code enters on the lowest branch, and moves up towards a release
as it stabilizes and becomes release-ready. They use 6-week release
cycles, and only stable-enough features get promoted from one branch
to the next when a new cycle starts. This process ensures quality as
well.
IMO we only need 1 develop branch + very temporary feature branches
(only when needed for complex features).
What we do need to work on is more testable code and more tests.
I'd like to move closer to one of these two strategies, so that our
releases are more polished. The mechanism for ensuring quality that
we're currently using is to have an "investigation" phase during the
previous release, which is supposed to help define the exact goals, so
that during the current release the development should go smoothly
towards that "ideal goal". Unfortunately, this doesn't work that well.
Without the code in place, investigations may miss important
details/limitations that will shift the development in another
direction. Or it can happen that the time is too short to fully
implement something, so we can either release a very "in progress"
feature, or decide near the end that there's not enough time to
implement everything and focus on polishing what's already available
to have a "partial" feature, but polished enough not to reek of low
quality.
The main problem here is that we're mixing feature- and time-based
releases, with mandatory features that must find their way into a
release, and a fixed deadline to make the release. This means that
features have 8-10 weeks to be fully implemented, polished, tested,
validated. And that doesn't always happen.
So, here are some possible integration strategies:
A. Master development (like now)
All development is done in the master branch, from which we branch a
few hours/days before the release, so that the master remains clear
for development.
B. Feature branches
All new feature development is done in a separate branch for each
feature, and we merge it in the trunk once it's considered done (or
very close). When a release date comes, we release with the completed
features, whatever those are. We don't force a merge of an incomplete
feature just because it's in the roadmap if it's not stable enough.
C. Feature+Development+Release branches
All development is done in feature branches, but they get merged on
the master branch more often to have test builds; the release branch
is separate and it integrates features when they are considered ready.
This has the advantage over B. that automated builds expose all the
development features.
D. Feature+Development+Stable+Release branches
This is similar to the new Mozilla strategy. Developers merge their
work in the Development branch very often. Users and other developers
can contribute here as well, and preview the upcoming features. When
they are close to finalization, they are also merged to the stable
branch, where UX, QA and feature owners can test and improve the
feature, preparing it for release. Once it's considered ready, it is
merged into the release branch, where QA does a final thorough test.
Releases happen from this branch.
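To make the promotion path in D concrete, here is a small sketch that models each branch as the list of features merged into it. The branch names, the `PIPELINE` constant and the `promote` helper are assumptions made for illustration only; the proposal does not fix any of them.

```python
from typing import Dict, List

# Branch order for option D, from least to most stable.
# The names are illustrative; the proposal does not prescribe them.
PIPELINE: List[str] = ["development", "stable", "release"]


def promote(branches: Dict[str, List[str]], feature: str) -> str:
    """Merge `feature` into the first pipeline branch that lacks it.

    Returns the name of the branch the feature was merged into.
    Raises ValueError if the feature is already in the release branch.
    """
    for name in PIPELINE:
        if feature not in branches[name]:
            branches[name].append(feature)
            return name
    raise ValueError(f"{feature!r} is already in the release branch")


if __name__ == "__main__":
    branches: Dict[str, List[str]] = {name: [] for name in PIPELINE}
    promote(branches, "new-editor")   # merged often into development
    promote(branches, "rest-api")     # another feature enters development
    promote(branches, "new-editor")   # close to finalization: into stable
    print(branches)
```

In this model a feature only reaches the release branch after it has passed through development and stable, which is the property D is meant to guarantee.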
E. Feature+Development+Next+Release
This is similar to D, with the exception that done features go into
the next release, while the current release is staging. When the
release is done, Release moves into Bugfix, Next becomes Release, and
we create a new Next branch and start pulling in it. This would work
well if we had very short release cycles (2 weeks), but it's not worth
the effort for our current 3-month releases, since a feature would
stagnate too much before being released. And it would also work if we
had more beta testers.
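The branch rotation in E can be sketched the same way. The role names and the `rotate_after_release` helper are hypothetical. The sketch also assumes that the new Next branch starts from the content of the new Release branch, which the description above does not spell out.

```python
from typing import Dict, List


def rotate_after_release(branches: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the branch roles after a release is done (option E).

    `branches` maps the role names "next", "release" and "bugfix" to the
    features each branch contains. The input is left unmodified.
    """
    return {
        # The branch that was just released now only receives bugfixes.
        "bugfix": list(branches["release"]),
        # The branch that collected done features becomes the release.
        "release": list(branches["next"]),
        # Assumption: the fresh Next branch starts from the new release.
        "next": list(branches["next"]),
    }


if __name__ == "__main__":
    before = {"next": ["a", "b"], "release": ["a"], "bugfix": []}
    print(rotate_after_release(before))
```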
We can also impose windows, like 2-4 weeks for a feature to move into
the next branch.
We could also make faster releases, skipping milestones and going to
6-week releases.
This means that it would take longer for a feature to make its way
from idea to release. One release for investigation, one or more for
the main development, and one for integration and stabilization.
But this also means that releases will be more solid, polished, with
fewer bugs, and closer to the user needs.
Personally I prefer option D, although it's a bit overkill with our
current limited manpower. We need more contributors and committers!
This I agree with :)
As for a change strategy, we can continue the way we're doing now,
gradually switching to feature branches and release/pre-release
branches during the following release.
http://nvie.com/posts/a-successful-git-branching-model/ sounds like a
reasonable, in-between, strategy.
WDYT?
Thanks
-Vincent
PS: It seems my worry about using git is starting to materialize: it's
that developers develop stuff on their own repo locally without
pushing fast enough every day (several times per day), thus delaying
integration to the last moment. If this happens then the whole git
move would have been very bad for xwiki. I hope we're not going there.