Hi.
Ludovic Dubost wrote:
I think we don't need history to be compatible but
we need a migration
path (a script to migrate the previous history).
My current implementation is
migratable via the package plugin.
I'm more and more thinking we should get rid of
RCS as the versioning
system.
Me too. JRCS is not extensible and there are no real alternatives.
In my code I tried, whenever possible, to get rid of the dependency on JRCS,
so it is easy to replace JRCS with something else.
I mainly used jrcs.diff; jrcs.rcs is used only by the package plugin
(to [de]serialize the whole archive to/from a string) for compatibility.
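To make that concrete, the idea is roughly the following (only a sketch; the
interface and method names are made up, not the real code):

interface DiffEngine {
    // compute a delta that transforms 'original' into 'revised'
    String diff(String original, String revised);

    // apply a previously computed delta to 'original'
    String patch(String original, String delta);
}
// One implementation would delegate to jrcs.diff, another could use a future
// "XWiki Patch" engine; the storage code only ever sees this interface.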
In the P2P XWiki Project we have been talking about
implementing
an "XWiki Patch" notion because we need to send it over the P2P
network for replication. This "XWiki Patch" could be the new minimal set
of information we need for a version.
Now I think we also need a table of versions to hold some key metadata
directly available (not as a diff) so that we can display it in the
history page quickly.
I store version, date, comment, and author in the history table
(xwikircs,
XWikiRCSNodeInfo), so the history page (?viewer=history) loads without
loading any diffs (node contents).
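Roughly it looks like this (just a sketch; the field names here are my
shorthand, not necessarily the exact fields of the real class):

import java.util.Date;

// one row per version, holding only the metadata the history page needs
class XWikiRCSNodeInfo {
    String version;   // e.g. "1.3"
    Date date;        // save date
    String author;    // e.g. "XWiki.SomeAuthor"
    String comment;   // save comment
    // the heavy part (the diff or the full XML) lives in a separate
    // XWikiRCSNodeContent row and is loaded only on demand
}
// so the history page only needs something like
// "from XWikiRCSNodeInfo where ..." and never touches the content rows.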
We could decide to store either the patch (less
space) or the full XML version in this table (more space but very safe
and faster).
2) Fetching strategy.
Now I load all version infos at once and version contents (diffs) one by
one on demand (fetching strategy #2).
I see the following possible fetching strategies for history storage:
1. Load all content at once.
This is as bad as the old history storage.
Currently we already have a lazy fetching strategy,
except that when we need a
specific version we have to load the full RCS file to be able to
retrieve it.
Yes. In the other strategies the cache is lazy^2 :)
And they load only the necessary content (a rough sketch of strategy 4 is
below, after the list).
> 2. Load one content on demand and cache it (RCSNodeInfo holds a
> soft reference to RCSNodeContent)
> (code: for each needed version do getContent(context))
> - Many SQL requests the first time.
>
> 3. Load the list of needed contents per request
> (hql: from NodeContent where version >= 1.2)
> One SQL request per HTTP request, but on every request.
>
> 4. Cache the list of latest nodes (from some node up to the latest node). Make
> only the needed requests and re-cache.
> (cache = soft reference to SortedMap<version, RCSNodeContent>;
> if not found in the cache, fetch by hql (where version >= 1.2 and
> version <= 2.3))
> I think it is the best fetching strategy concerning performance.
>
> 5. Something else?
>
> What fetching strategy is best for history storage?
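For strategy 4, I mean roughly the following (only a sketch; the class name,
the version type and loadRangeFromStore are placeholders, not the real code):

import java.lang.ref.SoftReference;
import java.util.SortedMap;
import java.util.TreeMap;

class NodeContentCache {
    // soft reference so the whole cached range can be dropped under memory pressure
    private SoftReference<SortedMap<String, String>> cacheRef;

    String getContent(String version) {
        SortedMap<String, String> cache = (cacheRef == null) ? null : cacheRef.get();
        if (cache != null && cache.containsKey(version)) {
            return cache.get(version);                 // cache hit: no SQL at all
        }
        // cache miss: fetch the whole needed range in one HQL query, e.g.
        //   from XWikiRCSNodeContent where version >= :version and version <= :latest
        SortedMap<String, String> fetched = loadRangeFromStore(version);
        cacheRef = new SoftReference<SortedMap<String, String>>(fetched);
        return fetched.get(version);
    }

    // placeholder for the real Hibernate/HQL range fetch
    private SortedMap<String, String> loadRangeFromStore(String fromVersion) {
        return new TreeMap<String, String>();
    }
}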
We could decide to store the full document every 10
versions and store
only the patch (RCS or new XWiki Patch) for each intermediate version.
This would mean that to retrieve any version you need one full version +
10 nodes.
I will try to implement this now.
Implementation thoughts:
on save: if (count % 50 == 0) save a full version
on load: load the nearest full version (by hql), or the latest node if none is found.
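In code it would look something like this (only a sketch; the constant, the
Node class and the helper methods stand in for the real storage code):

import java.util.ArrayList;
import java.util.List;

class VersionStore {
    static final int FULL_EVERY = 50;       // store a full copy every Nth version

    static class Node {
        int version;
        boolean isFull;
        String content;                      // full XML if isFull, otherwise a diff
    }

    // on save: every Nth version store the full document, otherwise store a diff
    void addVersion(int version, String fullXml, String diffFromPrevious) {
        if (version % FULL_EVERY == 0) {
            store(version, fullXml, true);
        } else {
            store(version, diffFromPrevious, false);
        }
    }

    // on load: find the nearest full node at or below the wanted version
    // (hql: where isFull = true and version <= :wanted order by version desc),
    // then apply the intermediate diffs up to the wanted version
    String getVersion(int wanted) {
        Node base = loadNearestFullNode(wanted);   // or the latest node if none is found
        String doc = base.content;
        for (Node diff : loadDiffsBetween(base.version, wanted)) {
            doc = applyDiff(doc, diff.content);
        }
        return doc;
    }

    // stand-ins for the real Hibernate/HQL and diff code
    void store(int version, String content, boolean isFull) { }
    Node loadNearestFullNode(int wanted) { return new Node(); }
    List<Node> loadDiffsBetween(int from, int to) { return new ArrayList<Node>(); }
    String applyDiff(String doc, String diff) { return doc; }
}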
It would be great to work on the new "XWiki
Patch" system since it is
needed for the P2P. What we discussed at the meeting was a language like:
ins(content,6,'Hello') = insert in field 'content' at char 6 the
text 'Hello'
del(content,6,5) = delete 5 char from field content starting at char 6
set(author,'XWiki.LudovicDubost') = set author field to XWiki.LudovicDubost
setObjectProperty('XWiki.ArticleClass',0,'propname','propvalue')
insObjectProperty('XWiki.ArticleClass',0,'propname',6,'propvalue')
Great. I will try to find some time to implement this, but not now.
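Just to check that I read the examples above correctly, ins/del on a text
field could be as simple as this (a minimal illustration, not an agreed design):

class FieldPatch {
    // ins(field, 6, "Hello"): insert "Hello" at character offset 6
    static String ins(String field, int offset, String text) {
        return field.substring(0, offset) + text + field.substring(offset);
    }

    // del(field, 6, 5): delete 5 characters starting at offset 6
    static String del(String field, int offset, int length) {
        return field.substring(0, offset) + field.substring(offset + length);
    }
}
// set / setObjectProperty / insObjectProperty would do the same kind of thing
// on document metadata and object properties instead of the content string.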
--
Artem Melentyev