[xwiki-dev] [Proposal] Document history storage
Artem Melentyev
melenartem at ya.ru
Wed Jul 25 15:18:05 CEST 2007
Hi.
Ludovic Dubost wrote:
> I think we don't need history to be compatible but we need a migration
> path (a script to migrate the previous history).
My current implementation can be migrated via the package plugin.
> I'm more and more thinking we should get rid of RCS as the versioning
> system.
Me too. JRCS is not extensible, and there are no real alternatives to it.
In my code I tried to remove the dependency on JRCS wherever possible, so
it is easy to replace JRCS with something else.
I mainly use jrcs.diff. jrcs.rcs is used only by the package plugin
([de]serialization of the whole archive to/from a string) for compatibility.
> In the P2P XWiki Project we have been talking about implementing
> a "XWiki Patch" notion because we need it to send it over the P2P
> network for replication. This "XWiki Patch" could be the new minimal set
> of information we need for a version.
>
> Now I think we also need a table of versions to hold some key meta data
> directly available (not as diff) so that we can display it in the
> history page quickly.
I store the version, date, comment and author in the history table (xwikircs,
XWikiRCSNodeInfo), so the history page (?viewer=history) loads without
loading any diffs (node contents).
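A minimal sketch of that split between metadata and content, in Java. The class NodeInfoSketch, its fields and the loader callback are hypothetical and only illustrate the idea; they are not the real XWikiRCSNodeInfo API.

```java
import java.lang.ref.SoftReference;
import java.util.function.Function;

/** Hypothetical stand-in for one row of the history table (not the real XWikiRCSNodeInfo). */
final class NodeInfoSketch {
    final String version;
    final long dateMillis;
    final String author;
    final String comment;
    // The diff is held softly, so the GC may drop it; it is then reloaded on demand.
    private SoftReference<String> content = new SoftReference<>(null);

    NodeInfoSketch(String version, long dateMillis, String author, String comment) {
        this.version = version;
        this.dateMillis = dateMillis;
        this.author = author;
        this.comment = comment;
    }

    /** Enough for one line of the history page; never touches the diff. */
    String historyLine() {
        return version + " | " + author + " | " + comment;
    }

    /** Calls the caller-supplied loader only when the diff is not cached. */
    String getContent(Function<String, String> loader) {
        String cached = content.get();
        if (cached == null) {
            cached = loader.apply(version);
            content = new SoftReference<>(cached);
        }
        return cached;
    }
}
```

Rendering the history page would call only historyLine(), so no diff is loaded until some code asks for getContent().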
> We could decide to store either the patch (less
> space) or the full XML version in this table (more space but very safe
> and faster).
>> 2) Fetching strategy.
>>
>> Now I load all version infos at once and version contents (diffs) one by
>> one on demand (fetching strategy #2).
>>
>> I see the following possible fetching strategies for history storage:
>>
>> 1. Load all content at once
>> This is as bad as the old history storage
> Currently we have a lazy fetching strategy already except when we need a
> specific version we need to load the full RCS file to be able to
> retrieve it.
Yes. In the other strategies the cache is lazy^2 :)
And they load only the necessary content.
>> 2. Load each content on demand and cache it (RCSNodeInfo contains a
>> soft reference to RCSNodeContent)
>> (code: for each needed version do getContent(context) )
>> - Many SQL requests the first time.
>>
>> 3. Load the list of needed contents per request
>> (hql: from NodeContent where version>=1.2)
>> One SQL request per HTTP request, but on every request.
>>
>> 4. Cache the list of latest nodes (from some node to the latest node). Make
>> only the needed requests and recache.
>> (cache = softref to SortedMap<version, RCSNodeContent>,
>> if not found in cache, fetch by hql (where version>=1.2 and
>> version<=2.3) )
>> I think it is the best fetching strategy concerning performance.
>>
>> 5. Something else?
>>
>> What fetching strategy is best for history storage?
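Strategy 4 could look roughly like the Java sketch below. Ver, NodeContentStore and TailCache are hypothetical names, and loadRange stands in for the range query (version>=from and version<=to); the real storage classes may look quite different.

```java
import java.lang.ref.SoftReference;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical "major.minor" version key, ordered numerically. */
final class Ver implements Comparable<Ver> {
    final int major;
    final int minor;

    Ver(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Ver o) {
        return major != o.major ? Integer.compare(major, o.major)
                                : Integer.compare(minor, o.minor);
    }
}

/** Hypothetical stand-in for the range query; both bounds are inclusive. */
interface NodeContentStore {
    NavigableMap<Ver, String> loadRange(Ver from, Ver to);
}

/** Softly cached tail of the history, from some version up to the latest one. */
final class TailCache {
    private final NodeContentStore store;
    private final Ver latest;
    private SoftReference<TreeMap<Ver, String>> ref = new SoftReference<>(null);

    TailCache(NodeContentStore store, Ver latest) {
        this.store = store;
        this.latest = latest;
    }

    /** Returns the contents of versions from..latest, querying only what is not cached. */
    NavigableMap<Ver, String> tailFrom(Ver from) {
        TreeMap<Ver, String> cached = ref.get();
        if (cached == null || cached.isEmpty()) {
            // Nothing cached (or the GC cleared it): fetch the whole requested tail.
            cached = new TreeMap<>(store.loadRange(from, latest));
            ref = new SoftReference<>(cached);
        } else if (from.compareTo(cached.firstKey()) < 0) {
            // Cache starts too late: fetch only the older part and recache the union.
            TreeMap<Ver, String> merged = new TreeMap<>(cached);
            merged.putAll(store.loadRange(from, cached.firstKey()));
            cached = merged;
            ref = new SoftReference<>(cached);
        }
        return cached.tailMap(from, true);
    }
}
```

With this shape, a request for versions already inside the cached tail costs no query, and a request for older versions costs one query for the missing range only.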
> We could decide to store the full document every 10 versions and store
> only the patch (RCS or new XWiki Patch) for each intermediary version.
> This would mean that to retrieve any version you need one full version +
> 10 nodes.
I will try to implement this now.
Implementation thoughts:
onsave: if (count % 50 == 0), save the full version
onload: load the nearest full version (by hql), or the latest node if none is found.
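The onsave/onload rule can be sketched in Java as plain arithmetic. This is only an illustration under assumptions the post does not state: versions are numbered 1, 2, 3, ... by a simple counter, diffs are reverse diffs as in RCS (so reconstruction walks back from a newer full version), and the latest node always holds the full content. CheckpointPlan and its methods are hypothetical names.

```java
/** Hypothetical planner for "full version every N saves"; not XWiki code. */
final class CheckpointPlan {
    /** Interval taken from the rule "count % 50 == 0". */
    static final int FULL_EVERY = 50;

    /** onsave: is version number count stored as a full version? */
    static boolean storeFull(int count) {
        return count % FULL_EVERY == 0;
    }

    /**
     * onload: the version to start reconstruction from, assuming reverse diffs.
     * This is the nearest full version at or after target, or the latest node
     * when no such full version exists yet.
     */
    static int startVersion(int target, int latest) {
        if (target < 1 || target > latest) {
            throw new IllegalArgumentException("target must be in 1..latest");
        }
        // Round target up to the next multiple of FULL_EVERY.
        int nextFull = ((target + FULL_EVERY - 1) / FULL_EVERY) * FULL_EVERY;
        return nextFull <= latest ? nextFull : latest;
    }

    /** Number of reverse diffs to apply after loading the start version. */
    static int diffsToApply(int target, int latest) {
        return startVersion(target, latest) - target;
    }
}
```

Under these assumptions no load ever needs more than FULL_EVERY - 1 diffs, whatever the length of the history.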
> It would be great to work on the new "XWiki Patch" system since it is
> needed for the P2P. What we discussed at the meeting was a language like:
>
> ins(content,6,'Hello') = insert in field 'content' at char 6 the
> text 'Hello'
> del(content,6,5) = delete 5 chars from field 'content' starting at char 6
> set(author,'XWiki.LudovicDubost') = set author field to XWiki.LudovicDubost
> setObjectProperty('XWiki.ArticleClass',0,'propname','propvalue')
> insObjectProperty('XWiki.ArticleClass',0,'propname',6,'propvalue')
Great. I will try to find some time to implement this, but not now.
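For illustration only, the three field operations from the quoted language (ins, del, set) could be applied to a document modelled as a map of field names to strings, as in the Java sketch below. PatchOps is a hypothetical class, and zero-based character offsets are an assumption, since the proposal does not say how positions are counted. The object property operations are left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interpreter for ins/del/set on a document held as field -> text. */
final class PatchOps {
    /** ins(field,pos,text): insert text into the field at a zero-based offset. */
    static void ins(Map<String, String> doc, String field, int pos, String text) {
        String old = doc.getOrDefault(field, "");
        doc.put(field, old.substring(0, pos) + text + old.substring(pos));
    }

    /** del(field,pos,len): delete len chars from the field starting at pos. */
    static void del(Map<String, String> doc, String field, int pos, int len) {
        String old = doc.getOrDefault(field, "");
        doc.put(field, old.substring(0, pos) + old.substring(pos + len));
    }

    /** set(field,value): replace the whole field value. */
    static void set(Map<String, String> doc, String field, String value) {
        doc.put(field, value);
    }

    /** Small helper that builds a document with one content field. */
    static Map<String, String> newDoc(String content) {
        Map<String, String> doc = new HashMap<>();
        doc.put("content", content);
        return doc;
    }
}
```

Here ins followed by del with the same position and the length of the inserted text restores the original field, which is the property a patch needs in order to be reversible.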
--
Artem Melentyev