On Mon, Jan 9, 2012 at 11:44, Vincent Massol <vincent(a)massol.net> wrote:
On Jan 9, 2012, at 11:36 AM, Denis Gervalle wrote:
On Mon, Jan 9, 2012 at 11:23, Vincent Massol <vincent(a)massol.net> wrote:
>
> On Jan 9, 2012, at 11:09 AM, Denis Gervalle wrote:
>
>> On Mon, Jan 9, 2012 at 10:07, Vincent Massol <vincent(a)massol.net> wrote:
>>
>>> +1 with the following caveats:
>>>
>>> * We need to guarantee that a migration cannot corrupt the DB.
>>
>>
>> The evolution of the migration mechanism was a first step in that
>> direction, since accessing a DB with an inappropriate XWiki core could
>> have corrupted it.
>>
>>
>>> For example imagine that we change a document id but this id is also
>>> used in some other tables and the migration stops before it's changed
>>> in the other tables. The change needs to be done in transactions for
>>> each doc being changed across all tables.
>>
>>
>> That would be nice, but MySQL does not support transactions on ISAM
>> tables. I use a single transaction for the whole migration process, so
>> on systems that support it (Oracle?), the migration will either go
>> through completely or not at all. But I cannot make it any safer than
>> MySQL allows.
>
> I think we should have one transaction per document update instead. We've
> had this problem in the past when upgrading very large systems. The
> migration was never going through in one go for some reason which I have
> forgotten, so we had needed to use several transactions so that the
> migration could be restarted when it failed and so that it could
> complete.
>
> It should work fine on MySQL with InnoDB, which we recommend (see
> http://platform.xwiki.org/xwiki/bin/view/AdminGuide/InstallationMySQL).
>
> Thanks
> -Vincent

I have been on MyISAM myself for a long time, since there are other
drawbacks to using InnoDB, and I have not experienced many corruption
issues so far. So you can expect others to have a similar setup.

This could be done easily if you want it so. Just note that all the other
migrations are single-transaction based AFAICS.

> I'm pretty sure this isn't the case. See R4359XWIKI1459DataMigration and
> R6079XWIKI1878DataMigration for example.
>
> Thanks
> -Vincent

Even if those use several transactions for executing some stuff, I really
do not see a partial commit there between "rows".
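
To make the per-document transaction idea concrete, here is roughly the
shape I would give it. This is only an untested sketch assuming plain
Hibernate; isAlreadyMigrated() and migrateDocument() stand for
hypothetical helpers, not for the actual migration API:

import java.util.List;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public abstract class PerDocumentMigrator
{
    private final SessionFactory sessionFactory;

    protected PerDocumentMigrator(SessionFactory sessionFactory)
    {
        this.sessionFactory = sessionFactory;
    }

    /**
     * Convert each document id in its own transaction, so that a failure
     * (or a ctrl-c) only rolls back the current document, and a restart
     * simply skips the documents already converted.
     */
    public void migrateAll(List<Long> oldIds)
    {
        for (Long oldId : oldIds) {
            Session session = this.sessionFactory.openSession();
            Transaction tx = session.beginTransaction();
            try {
                if (!isAlreadyMigrated(session, oldId)) {
                    migrateDocument(session, oldId);
                }
                tx.commit();
            } catch (RuntimeException e) {
                tx.rollback();
                throw e;
            } finally {
                session.close();
            }
        }
    }

    /** Hypothetical: was this document converted in a previous run? */
    protected abstract boolean isAlreadyMigrated(Session session, long oldId);

    /** Hypothetical: update the id of one document across all tables. */
    protected abstract void migrateDocument(Session session, long oldId);
}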
>>> Said differently, the migrator should be allowed to be ctrl-c-ed at
>>> any time; you can then safely restart XWiki and the migrator will just
>>> carry on from where it was.
>>>
>>
>> The migrator will restart where it left off, but the granularity is the
>> document. I process the updates document by document, updating all
>> tables for each one. If there is some issue during the migration, let's
>> say on MySQL, and it is restarted, it will start again, skipping the
>> documents that have been converted previously. So any corruption would
>> be limited to a single document.
>>
>>
>>> * OR we need to have a configuration parameter for deciding whether to
>>> run this migration or not, so that users run it only when they decide
>>> to, thus ensuring that they've done the proper backups and saving of
>>> DBs.
>>>
>>
>> This is true using the new migration procedure, but it is not as
>> flexible as you seem to expect. Supporting two hashing algorithms is
>> not a feature but an increased risk of causing corruption, in my view.
>> Now, if you use a recent core that uses the new ids while migrations
>> are not activated, and you access an old DB, you will simply be unable
>> to access the database: you will receive a "DB requires migration"
>> exception.
>>
>> Anyway, migrations are disabled by default and should be enabled by an
>> administrator in xwiki.cfg. The release notes will mention the need to
>> run them and, of course, to make a backup beforehand. And you are
>> always supposed to have a backup when you upgrade, or you are not a
>> system admin ;)
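>>
>> (For reference, enabling them should amount to something like the
>> following in xwiki.cfg; the parameter name is from memory, so please
>> double-check against the default file:)
>>
>> xwiki.store.migration=1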
>>
>>
>>> I prefer the first option but we need to guarantee it.
>>>
>>
>> We will never be able to guarantee it, but I have done my best to make
>> it as secure as possible.
>>
>>
>>>
>>> Thanks
>>> -Vincent
>>>
>>> On Jan 7, 2012, at 10:39 PM, Denis Gervalle wrote:
>>>
>>>> Now that the database migration mechanism has been improved, I would
>>>> like to go ahead with my patch to improve document ids.
>>>>
>>>> Currently, ids are the plain string hashcode of a locally serialized
>>>> document reference, including the language for translated documents.
>>>> The likelihood of duplicates with Java's string hashing algorithm is
>>>> really high.
>>>>
>>>> What I propose is:
>>>>
>>>> 1) use MD5 hashing, which distributes values particularly well.
>>>> 2) truncate the hash to its first 64 bits, since the XWD_ID column is
>>>> a 64-bit long.
>>>> 3) use a better string representation as the source of the hash.
>>>>
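>>>> For 1) and 2), the computation I have in mind is something like this
>>>> (an untested sketch; the class name and its final location are only
>>>> illustrative):
>>>>
>>>> import java.nio.charset.Charset;
>>>> import java.security.MessageDigest;
>>>> import java.security.NoSuchAlgorithmException;
>>>>
>>>> public final class DocumentIdComputer
>>>> {
>>>>     /**
>>>>      * Hash the serialized reference with MD5 and keep only the first
>>>>      * 64 bits, since the XWD_ID column is a 64-bit long.
>>>>      */
>>>>     public static long computeId(String serializedReference)
>>>>     {
>>>>         try {
>>>>             MessageDigest md5 = MessageDigest.getInstance("MD5");
>>>>             byte[] hash = md5.digest(
>>>>                 serializedReference.getBytes(Charset.forName("UTF-8")));
>>>>             long id = 0;
>>>>             for (int i = 0; i < 8; i++) {
>>>>                 id = (id << 8) | (hash[i] & 0xFF);
>>>>             }
>>>>             return id;
>>>>         } catch (NoSuchAlgorithmException e) {
>>>>             // Every compliant JVM provides MD5.
>>>>             throw new IllegalStateException(e);
>>>>         }
>>>>     }
>>>> }
>>>>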
>>>> Based on previous discussions, points 1) and 2) have already been
>>>> agreed on, and this vote is in particular about the string used for
>>>> 3). I propose it in 2 steps:
>>>>
>>>> 1) before locales are fully supported in document references, use
>>>> this format:
>>>>
>>>> <lengthOfLastSpaceName>:<lastSpaceName><lengthOfDocumentName>:<documentName><lengthOfLanguage>:<language>
>>>>
>>>> where language would be an empty string for the default document, so
>>>> it would look like 7:mySpace5:myDoc0: and its French translation
>>>> would be 7:mySpace5:myDoc2:fr (see the sketch after this list)
>>>> 2) when locales are included in references, we will replace the
>>>> implementation with a reference serializer that produces the same
>>>> kind of representation, but that includes all spaces (not only the
>>>> last one), to be prepared for the future.
>>>>
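>>>> Here is a small sketch of that computation (illustrative only; the
>>>> real serializer will work on document references rather than plain
>>>> strings):
>>>>
>>>> /**
>>>>  * Build the length-prefixed local key, e.g. 7:mySpace5:myDoc0: for a
>>>>  * default document and 7:mySpace5:myDoc2:fr for its French
>>>>  * translation. Length prefixes guarantee uniqueness without any
>>>>  * escaping, and keep the key reversible.
>>>>  */
>>>> public static String localUid(String space, String doc, String language)
>>>> {
>>>>     StringBuilder key = new StringBuilder();
>>>>     key.append(space.length()).append(':').append(space);
>>>>     key.append(doc.length()).append(':').append(doc);
>>>>     key.append(language.length()).append(':').append(language);
>>>>     return key.toString();
>>>> }
>>>>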
>>>> While doing so, I also propose to fix the cache key issue by using
>>>> the same representation, but prefixed with
>>>> <lengthOfWikiName>:<wikiName>, so the previous examples will have the
>>>> following keys in the document cache: 5:xwiki7:mySpace5:myDoc0: and
>>>> 5:xwiki7:mySpace5:myDoc2:fr
>>>>
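>>>> The cache key is then just the wiki prefix plus the same local key,
>>>> reusing the hypothetical localUid() helper sketched above:
>>>>
>>>> // e.g. 5:xwiki7:mySpace5:myDoc0: for the "xwiki" wiki
>>>> public static String cacheKey(String wiki, String space, String doc,
>>>>     String language)
>>>> {
>>>>     return wiki.length() + ":" + wiki + localUid(space, doc, language);
>>>> }
>>>>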
>>>> Using such a key (compared to the usual serialization) has the
>>>> following advantages:
>>>> - it ensures uniqueness of the reference without requiring a complex
>>>> escaping algorithm, which is unneeded here.
>>>> - it is potentially reversible.
>>>> - it is faster than the usual serialization.
>>>> - it supports languages.
>>>> - it is independent of the usual serialization, which may evolve on
>>>> its own; our representation will thus stay stable over time, which is
>>>> really important since it is the base of the hashing algorithm used
>>>> for the document ids stored in the database.
>>>>
>>>> I would like to introduce this as early as possible, which means as
>>>> soon as we are confident with the recently introduced migration
>>>> mechanism. Since the migration of ids will convert 32-bit hashes into
>>>> 64-bit ones, the risk of collision is really low, and to be careful,
>>>> I have written a migration algorithm that supports such collisions
>>>> (unless they cause a circular reference collision, but this is really
>>>> unexpected). However, changing ids again later, if we change our
>>>> mind, will be much riskier and the migration difficult to implement,
>>>> so it is really important that we agree on the way we compute these
>>>> ids, once and for all.
>>>>
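>>>> To give an idea of the collision handling (a much simplified sketch,
>>>> not the actual algorithm; the three abstract helpers are
>>>> hypothetical):
>>>>
>>>> import java.util.HashSet;
>>>> import java.util.Set;
>>>>
>>>> public abstract class CollisionAwareIdMigrator
>>>> {
>>>>     private final Set<Long> inProgress = new HashSet<Long>();
>>>>
>>>>     /**
>>>>      * If the new id of a document is still used as the current id of
>>>>      * another document, convert that other document first to free
>>>>      * the id. A cycle in this chain is a circular reference
>>>>      * collision, which is really unexpected.
>>>>      */
>>>>     public void migrate(long oldId)
>>>>     {
>>>>         if (!this.inProgress.add(oldId)) {
>>>>             throw new IllegalStateException("Circular reference collision");
>>>>         }
>>>>         long newId = computeNewId(oldId);
>>>>         if (newId != oldId && existsDocumentWithId(newId)) {
>>>>             migrate(newId);
>>>>         }
>>>>         updateAllTables(oldId, newId);
>>>>         this.inProgress.remove(oldId);
>>>>     }
>>>>
>>>>     protected abstract long computeNewId(long oldId);
>>>>     protected abstract boolean existsDocumentWithId(long id);
>>>>     protected abstract void updateAllTables(long oldId, long newId);
>>>> }
>>>>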
>>>> Here is my +1,
>>>>
>>>> --
>>>> Denis Gervalle
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs