Hi devs,
I know this subject may seem to have already been voted on and discussed in
http://xwiki.markmail.org/thread/fsd25bvft74xwgcx
But following the remarks and the discussion in that thread, I have
largely improved the proposed changes.
This is an important matter, so I prefer to summarize it here to be sure we
all really agree on this.
To summarize the current situation, we have:
- document id
  - used in the document table, rcs, attachment...
  - simple 32-bit string hashcode of a locally serialized document
    reference, including the language for translated documents
  - stored in a 64-bit field
- object id
  - used in the object and property tables, but also in the statistics
    tables
  - simple 32-bit string hashcode of the concatenation of the document
    reference, the class reference and the object number
  - stored in a 32-bit field (except in Oracle, where the mapping is
    32-bit but the storage is larger)
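
To illustrate why a 32-bit string hashcode is fragile as an id, here is a
minimal sketch. It assumes the current ids behave like Java's
String.hashCode() applied to a serialized name; the class and method names
are hypothetical and are not the actual XWiki code.

```java
final class HashCollisionSketch {
    // Hypothetical stand-in for the current id computation: a 32-bit
    // String.hashCode() widened to long. Widening adds no entropy, so the
    // 64-bit document id field still holds only 32 bits of information.
    static long legacyStyleId(String serializedName) {
        return serializedName.hashCode();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known String.hashCode() collision, and
        // the collision survives any common prefix.
        long first = legacyStyleId("Space.Aa");
        long second = legacyStyleId("Space.BB");
        System.out.println(first == second); // prints "true"
    }
}
```

Two distinct document names ending up with the same id is exactly the kind
of ambiguity the proposal below is meant to make very unlikely.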
The vote is about:
- document id
  - use the lower 64 bits of an MD5 hashcode
  - the base key for the hashcode is the document reference serialized
    using a LocalUidEntityReferenceSerializer
  - the result is appended with the current locale for translated
    documents, until locales are integrated in references
  - format for an original document: 5:space8:document
  - format for a translation: 5:space8:document2:fr
- object id
  - use the lower 64 bits of an MD5 hashcode
  - the base key for the hashcode is the BaseObjectReference serialized
    using a LocalUidEntityReferenceSerializer
  - current format would be: 5:space8:document12:xspace.class[0]
  - if my proposal in the object reference thread is adopted:
    5:space8:document18:6:xspace5:class[0]
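
The proposed id computation can be sketched as follows. This is only an
illustration under stated assumptions: localUid() is a hypothetical
stand-in for the LocalUidEntityReferenceSerializer that reproduces the
length-prefixed format shown above, the key is encoded as UTF-8, and
"lower 64 bits" is taken to mean the last 8 bytes of the digest read as a
big-endian long. The real implementation may differ on each of these
points.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5IdSketch {
    // Hypothetical helper: serializes each name as "<length>:<name>", which
    // keeps the key unambiguous whatever characters the names contain.
    static String localUid(String... names) {
        StringBuilder key = new StringBuilder();
        for (String name : names) {
            key.append(name.length()).append(':').append(name);
        }
        return key.toString();
    }

    // Assumption: the id is the last 8 bytes of the 16-byte MD5 digest,
    // interpreted as a big-endian signed 64-bit integer.
    static long md5Lower64(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest, 8, 8).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        String original = localUid("space", "document");
        String translation = localUid("space", "document", "fr");
        System.out.println(original + " -> " + md5Lower64(original));
        System.out.println(translation + " -> " + md5Lower64(translation));
    }
}
```

Because the locale is appended to the key before hashing, a translation
gets an id unrelated to the id of its original document.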
Changing only the document id would not really help: document references
are used in object ids, so unambiguous documents could still receive
ambiguous objects. I therefore do not advise splitting the change. Moreover,
this is a really sensitive change in the database, so it is better not to
multiply such changes. I think the upcoming 4.0 is a really good time to
introduce this change, so I propose to introduce it in version 40000 of the
database (4.0M1 release).
However, I would like to use it internally earlier, so please settle on
this thread and the previous one before then.
It implies the following migration for existing instances:
- customer custom mappings have to be adapted before the migration,
  including dynamic ones, which may not be so easy; but custom mappings are
  rarely used and very rarely require any change in practice
- change XWikiDocument to provide the key required for IDs and, while at
  it, also use that key (non-local version) for the document cache
- refactor the BaseElement hierarchy to provide long IDs (no more
  integers) based on references (a generic way to have ids for any element)
- change the Hibernate mapping for all object ids
- provide dynamic schema updates using Liquibase to fix all object id
  types, including those in custom mappings and collections
- migrate in HQL the document id for the persisted classes:
  XWikiDocument, XWikiRCSNodeInfo, XWikiLink, XWikiAttachment,
  DeletedAttachment
- migrate in HQL the object id for the persisted classes: BaseObject,
  *Property, internal custom mapped classes, dynamic custom mapped classes
- migrate in HQL the object id for custom statistics classes derived from
  XWikiStats
- migrate in SQL the ids for all relational collections in the above
  migrated tables
To make this migration as safe as possible:
- Liquibase provides a safe way to change the schema.
- All id conversions are gathered from the database in a first single
  read-only transaction, and the new ids are computed.
- Potentially already migrated ids are detected, allowing the process to
  fail and be restarted.
- Ids are replaced using a safe algorithm that supports non-circular
  conflicts between old and new ids (very unlikely anyway, since we move
  from 32 to 64 bits).
- A single transaction is used for each id conversion, replacing the id in
  all related tables.
- Database-independent queries (HQL) are used as much as possible; only
  bulk updates on collections, which are not supported by Hibernate, are
  done in minimalistic SQL update statements.
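
To make the "non-circular conflict" point concrete, here is a rough sketch
of one way to order the conversions. It is not the code of my branch; the
class and method names are hypothetical. The idea assumed here is that if
the new id of one row is still in use as the old id of another row, that
other row is converted first; ids that already equal their new value are
treated as migrated and skipped, and a cycle makes the process fail.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class IdConversionOrderSketch {
    // Returns {oldId, newId} pairs in an order where each newId is free at
    // the time its conversion is applied.
    static List<long[]> order(Map<Long, Long> oldToNew) {
        // Two rows mapped to the same new id is a real collision that no
        // ordering can solve.
        if (new HashSet<>(oldToNew.values()).size() != oldToNew.size()) {
            throw new IllegalStateException("two ids map to the same new id");
        }
        List<long[]> ordered = new ArrayList<>();
        Set<Long> done = new HashSet<>();
        for (Long oldId : oldToNew.keySet()) {
            visit(oldId, oldToNew, done, new HashSet<>(), ordered);
        }
        return ordered;
    }

    private static void visit(Long oldId, Map<Long, Long> oldToNew,
            Set<Long> done, Set<Long> inProgress, List<long[]> ordered) {
        if (done.contains(oldId)) {
            return;
        }
        Long newId = oldToNew.get(oldId);
        if (newId.equals(oldId)) {
            done.add(oldId); // already migrated, nothing to replace
            return;
        }
        if (!inProgress.add(oldId)) {
            throw new IllegalStateException("circular conflict on id " + oldId);
        }
        if (oldToNew.containsKey(newId)) {
            // The target id is still used by another row: move that row first.
            visit(newId, oldToNew, done, inProgress, ordered);
        }
        inProgress.remove(oldId);
        done.add(oldId);
        ordered.add(new long[] {oldId, newId});
    }
}
```

For example, with the conversions 1 -> 2 and 2 -> 3, the sketch converts
2 -> 3 first, so that id 2 is free when 1 -> 2 is applied. Each returned
pair would then be applied in its own transaction across all related
tables.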
Some help with testing the migration on different environments would be
appreciated! (I test thoroughly on MySQL.)
I will commit my branch on platform soon.
Here is my +1.
--
Denis Gervalle
SOFTEC sa - CEO
eGuilde sarl - CTO