On Mar 12, 2012, at 10:49 AM, Vincent Massol wrote:
I'm not going to reply point by point below, but
here are some things I need to clear up before I can change my vote (btw, we need more votes on
this thread too):
* I need to check how I would fix my stat tool to exclude duplicates. I want to be
confident about this. I'd also like to talk to Ohloh to see if they plan to do this and,
if not, why not.
Just to be clear, this stat tool is supposed to be what we're going to display on
xwiki.org and I'd rather have it show correct stats.
Thanks
-Vincent
* I'd like to check what other projects do on this
topic because Git is clearly not made to copy history. It's so hard to do that there
must be some reason.
Thanks
-Vincent
On Mar 9, 2012, at 11:51 PM, Denis Gervalle wrote:
> On 9 March 2012 at 16:59, "Vincent Massol" <vincent(a)massol.net> wrote:
>>
>>
>> On Mar 2, 2012, at 10:06 AM, Denis Gervalle wrote:
>>
>>> On Wed, Feb 29, 2012 at 08:19, Vincent Massol <vincent(a)massol.net> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Feb 28, 2012, at 12:17 PM, Thomas Mortagne wrote:
>>>>
>>>>> Hi devs,
>>>>>
>>>>> Since I plan to move some stuff from platform to commons I would like
>>>>> to know what you think of the history in this case.
>>>>>
>>>>> Pros including history:
>>>>> * can access easily the whole history of a moved file.
>>>>
>>>
>>> This is really an important matter, especially for those joining the
>>> project. When you follow XWiki from "outside", and not in a continuous
>>> manner, the history is of great value to understand why things are the way
>>> they are, and what you may or may not do when moving forward.
>>
>> The history is not lost. If you do a join (all active repos) you still
>> have it.
>
> I do not know what you mean by joining all repos, but I would be surprised
> to see the IDE find its way between them. I even wonder how it could be
> possible.
>
>>
>>>>> But sometimes changing packages etc. makes too much difference for Git
>>>>> to see it's actually the same file, so you lose it anyway.
>>>>
>>>
>>> If you simply change the package name, and nothing else, it is really
>>> unlikely to happen.
>>>
>>>
>>>>>
>>>>> Cons including history:
>>>>> * doubles the history, which makes tools like Ohloh report wrong
>>>>> information
>>>>
>>>
>>> Sure, the stats will be broken, but what does it matter? This is not
>>> cheating, just a misfeature in Ohloh, since the commits are identical,
>>> something they could detect. IMO, it is up to the statistical tools
>>> to improve on that.
>>
>> Can you tell me how to implement this because right now my GitHub tool
>> doesn't do that and I don't know how to do it?
>
> If I had to implement it, I would probably use some hashing method to be
> able to recognize similar commits, since there is effectively no link between
> them. But my main point is that it is the statistics that are broken, not the
> way we use Git.
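
The hashing idea above could be sketched roughly as follows. This is only an illustration, not how Ohloh or the xwiki.org stat tool works: the `Commit` record, `fingerprint` and `deduplicate` are hypothetical names, and the sketch assumes that a copied commit keeps its author, author date, message and changed lines while its SHA and file paths may differ. Git's own `git patch-id` command serves a related purpose.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    # Hypothetical minimal record; a real tool would fill it from `git log -p`.
    repo: str
    sha: str
    author_email: str
    author_time: int  # seconds since the epoch
    message: str
    diff: str  # unified diff text of the commit


def fingerprint(commit: Commit) -> str:
    """Hash only the fields assumed to survive a history copy."""
    # Keep added/removed lines, skip the '+++'/'---' file headers so that a
    # changed path does not alter the result. (A removed line whose content
    # itself starts with '--' is skipped too; acceptable for a sketch.)
    changed = [
        line.strip()
        for line in commit.diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
    payload = "\n".join(
        [commit.author_email, str(commit.author_time), commit.message.strip(), *changed]
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def deduplicate(commits):
    """Return the commits with duplicates removed, keeping the first seen."""
    seen = {}
    for commit in commits:
        seen.setdefault(fingerprint(commit), commit)
    return list(seen.values())
```

A stat tool could then count `len(deduplicate(all_commits))` instead of the raw total across repositories. Whether the chosen fields are stable enough in practice is exactly the kind of thing that would need checking on real history.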
>
>>
>>>>> * it's a lot easier to move without history
>>>>
>>>
>>> There should be some tools to improve that point, or we may write one,
>>> once and for all. So this is not a real con either.
>>
>> It's really hard to copy history in Git. It's almost impossible to do it
>> right. You have to remember the full history and it's just too hard.
>
> I would be really disappointed to have to conclude that. There are probably
> some edge cases, but most of the time there is a clever workaround. You have
> to talk to Sergiu :-)
>
>>
>>>>> WDYT ?
>>>>>
>>>>> Even if it was looking a bit weird to me at first I'm actually +1 to
>>>>> not move the history in this case.
>>>>
>>>> +1, FTR I'd be -0, close to -1 to move it. If/when the source repository
>>>> is removed for one reason or another, then we might want to import its
>>>> history somewhere.
>>>>
>>>
>>> Seems we are really opposite on this one, since I am close to -1 to not
>>> move it.
>>
>> Sorry but that's the current practice :) It's also the easiest one.
>
> Until we had Git, there was no better way. This does not mean that we
> should not improve our practice. By the way, it was not my thread; if
> Thomas asked, it means that the current practice was not so current.
>
>>
>>> Statistics are really less valuable IMO; they are of small interest compared
>>> to code history, which I have used a lot, especially when I joined the
>>> project and followed it only sporadically.
>>
>> I can say exactly the same thing as you said above. It's just a question
>> of tools since the history is not lost. It's still there in our active
>> repos.
>
> There is absolutely no link between these histories. It is not only a
> question of tools. Moreover, having to query all active repositories to
> get a proper history completely defeats the purpose of having separate
> repositories.
>
> I do not see the comparison with my remark above. Git has been made for
> versioning, not for statistics; it is not my fault.
>
>>
>>>> So the general rule for me is: Copy history when the source repository is
>>>> removed/deleted/not used anymore.
>>
>> How many times have you done this? I believe 0 times since I don't think
>> you'd be so much in favor if you had tried it. I suggest you try it a few
>> times on your own projects first :) It's really hard to do it right and
>> very time consuming.
>
> When I copied the security component from contrib, I did so. I
> hope that I am not alone. And, frankly, it was not so hard compared to the
> advantage you get.
>
>>
>>> You never know what will happen to a repository in the future, so this
>>> rule is somewhat of a bet on the future, no more. And remembering that we
>>> may lose history if we make some change in the old repository is for me
>>> like hoping you will remember my birthday ;)
>>
>> I don't agree with this at all. Again we're not losing history. If a
>> repo is removed then its history is copied; I agree about that.
>
> I would like to know how you do that after the fact?
>
>>
>>>>> Eduard was proposing to include in the first commit of the new
>>>>> repository the id of the last commit containing the files (basically
>>>>> the id of the parent of the commit deleting the files) in the old
>>>>> repository so that it's easier to find it. I'm +1 for this.
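
Eduard's proposal could look something like the sketch below. The function name, the message layout and the placeholder values are all hypothetical; the only Git fact relied on is that `git rev-parse <deleting-commit>^` prints the id of the parent of the commit that deleted the files in the old repository.

```python
def first_commit_message(summary, source_repo, source_commit, paths):
    """Build a commit message that points back to the old repository.

    source_commit is the id of the last commit that still contains the files,
    i.e. the parent of the commit that deleted them in source_repo.
    """
    lines = [
        summary,
        "",
        f"Moved from {source_repo} without history.",
        f"Last commit containing the files there: {source_commit}",
        "",
        "Moved paths:",
    ]
    # One path per line, sorted so the message is stable across runs.
    lines += [f"  {path}" for path in sorted(paths)]
    return "\n".join(lines) + "\n"
```

The resulting text would be passed to `git commit -F <file>` for the first commit in the new repository, so that someone looking for older history knows which repository and which commit to start from.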
>>>>
>>>
>>> But you lose all the benefits of the IDE tools that bring up the history of
>>> a selection automatically and that are really useful.
>>
>> A huge majority of xwiki's history is already lost to IDEs (when we moved
>> from SVN) even though the SVN history was moved. Even Git itself doesn't
>> follow the history when you move stuff around. Said differently it's always
>> possible to find the history but the IDE and "standard" tools don't follow
>> it.
>
> It does so far better since we moved to Git and it is really a valuable
> tool. Do you mean that because the history may be broken in a few cases,
> we should not try to have it as complete as possible?
>
>>
>>> Moreover, if the history is rewritten due to a change in structure later,
>>> the hash may be broken.
>>
>> Not sure I understand this one.
>
> In Git, nothing is fully permanent, that is all I say.
>
>>
>> You should really measure the cost of what you propose Denis. It's really
>> hard to do.
>
> Prove to me that it costs more than what newcomers pay to enter the
> project. Maybe you do not value history so much because, through your own
> experience of the project, you have a good knowledge of what happened in the
> past. When I dig into some code, I always find history valuable to understand
> why that piece of code is not written the way I might have expected and why I
> should not go that way.
>
> If Thomas concludes it is too hard to be done, and not just some developer's
> laziness, I would understand; but I do not agree that it should not be done
> just because it breaks statistics or we think it is too hard. This is why I
> suggest a tool that does it once and for all. I would be really disappointed
> in Git if we had to conclude this.
>
> Thanks,
>
> Denis
>
>>
>> Thanks
>> -Vincent
>>
>>> So having a broken history makes the task harder for those who want to
>>> participate. A great value compared to the statistics IMO.