Hi.
Ludovic Dubost wrote:
I think we don't need history to be compatible but
we need a migration
path (a script to migrate the previous history).
My current implementation is
migratable via the package plugin.
I'm more and more thinking we should get rid of
RCS as the versioning
system.
Me too. JRCS is not extensible and there are no real alternatives.
In my code I tried, whenever possible, to get rid of the dependency on JRCS,
so it is easy to replace JRCS with something else.
I mainly used jrcs.diff; jrcs.rcs is used only by the package plugin
(to [de]serialize the whole archive to/from a string) for compatibility.
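To make that concrete, the idea is roughly the following (only a sketch; the
interface and method names are made up, not the real code):

interface DiffEngine {
    // compute a delta that transforms 'original' into 'revised'
    String diff(String original, String revised);

    // apply a previously computed delta to 'original'
    String patch(String original, String delta);
}
// One implementation would delegate to jrcs.diff, another could use a future
// "XWiki Patch" engine; the storage code only ever sees this interface.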
In the P2P XWiki Project we have been talking about
implementing
an "XWiki Patch" notion because we need to send it over the P2P
network for replication. This "XWiki Patch" could be the new minimal set
of information we need for a version.
Now I think we also need a table of versions to hold some key metadata
directly available (not as a diff) so that we can display it in the
history page quickly.
I store version, date, comment, and author in the history table
(xwikircs,
XWikiRCSNodeInfo), so the history page (?viewer=history) loads without
loading any diffs (node contents).
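Roughly it looks like this (just a sketch; the field names here are my
shorthand, not necessarily the exact fields of the real class):

import java.util.Date;

// one row per version, holding only the metadata the history page needs
class XWikiRCSNodeInfo {
    String version;   // e.g. "1.3"
    Date date;        // save date
    String author;    // e.g. "XWiki.SomeAuthor"
    String comment;   // save comment
    // the heavy part (the diff or the full XML) lives in a separate
    // XWikiRCSNodeContent row and is loaded only on demand
}
// so the history page only needs something like
// "from XWikiRCSNodeInfo where ..." and never touches the content rows.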
We could decide to store either the patch (less
space) or the full XML version in this table (more space but very safe
and faster).
2) Fetching strategy.
Now I load all version infos at once and version contents (diffs) one by
one on demand (fetching strategy #2).
I see the following possible fetching strategies for history storage:
1. Load all content at once.
This is as bad as the old history storage.
Currently we already have a lazy fetching strategy,
except that when we need a
specific version we have to load the full RCS file to be able to
retrieve it.
Yes. In the other strategies the cache is lazy^2 :)
And they load only the necessary content (a rough sketch of strategy 4 is
below, after the list).
> 2. Load one content on demand and cache it (RCSNodeInfo holds a
> soft reference to RCSNodeContent)
> (code: for each needed version do getContent(context))
> - Many SQL requests the first time.
>
> 3. Load the list of needed contents per request
> (hql: from NodeContent where version >= 1.2)
> One SQL request per HTTP request, but on every request.
>
> 4. Cache the list of latest nodes (from some node up to the latest node). Make
> only the needed requests and re-cache.
> (cache = soft reference to SortedMap<version, RCSNodeContent>;
> if not found in the cache, fetch by hql (where version >= 1.2 and
> version <= 2.3))
> I think it is the best fetching strategy concerning performance.
>
> 5. Something else?
>
> What fetching strategy is best for history storage?
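For strategy 4, I mean roughly the following (only a sketch; the class name,
the version type and loadRangeFromStore are placeholders, not the real code):

import java.lang.ref.SoftReference;
import java.util.SortedMap;
import java.util.TreeMap;

class NodeContentCache {
    // soft reference so the whole cached range can be dropped under memory pressure
    private SoftReference<SortedMap<String, String>> cacheRef;

    String getContent(String version) {
        SortedMap<String, String> cache = (cacheRef == null) ? null : cacheRef.get();
        if (cache != null && cache.containsKey(version)) {
            return cache.get(version);                 // cache hit: no SQL at all
        }
        // cache miss: fetch the whole needed range in one HQL query, e.g.
        //   from XWikiRCSNodeContent where version >= :version and version <= :latest
        SortedMap<String, String> fetched = loadRangeFromStore(version);
        cacheRef = new SoftReference<SortedMap<String, String>>(fetched);
        return fetched.get(version);
    }

    // placeholder for the real Hibernate/HQL range fetch
    private SortedMap<String, String> loadRangeFromStore(String fromVersion) {
        return new TreeMap<String, String>();
    }
}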
We could decide to store the full document every 10
versions and store
only the patch (RCS or new XWiki Patch) for each intermediate version.
This would mean that to retrieve any version you need one full version +
10 nodes.
I will try to implement this now.
Implementation thoughts:
on save: if (count % 50 == 0) save a full version
on load: load the nearest full version (by hql), or the latest node if none is found.
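In code it would look something like this (only a sketch; the constant, the
Node class and the helper methods stand in for the real storage code):

import java.util.ArrayList;
import java.util.List;

class VersionStore {
    static final int FULL_EVERY = 50;       // store a full copy every Nth version

    static class Node {
        int version;
        boolean isFull;
        String content;                      // full XML if isFull, otherwise a diff
    }

    // on save: every Nth version store the full document, otherwise store a diff
    void addVersion(int version, String fullXml, String diffFromPrevious) {
        if (version % FULL_EVERY == 0) {
            store(version, fullXml, true);
        } else {
            store(version, diffFromPrevious, false);
        }
    }

    // on load: find the nearest full node at or below the wanted version
    // (hql: where isFull = true and version <= :wanted order by version desc),
    // then apply the intermediate diffs up to the wanted version
    String getVersion(int wanted) {
        Node base = loadNearestFullNode(wanted);   // or the latest node if none is found
        String doc = base.content;
        for (Node diff : loadDiffsBetween(base.version, wanted)) {
            doc = applyDiff(doc, diff.content);
        }
        return doc;
    }

    // stand-ins for the real Hibernate/HQL and diff code
    void store(int version, String content, boolean isFull) { }
    Node loadNearestFullNode(int wanted) { return new Node(); }
    List<Node> loadDiffsBetween(int from, int to) { return new ArrayList<Node>(); }
    String applyDiff(String doc, String diff) { return doc; }
}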
It would be great to work on the new "XWiki
Patch" system since it is
needed for the P2P. What we discussed at the meeting was a language like:
ins(content,6,'Hello') = insert in field 'content' at char 6 the
text 'Hello'
del(content,6,5) = delete 5 char from field content starting at char 6
set(author,'XWiki.LudovicDubost') = set author field to XWiki.LudovicDubost
setObjectProperty('XWiki.ArticleClass',0,'propname','propvalue')
insObjectProperty('XWiki.ArticleClass',0,'propname',6,'propvalue')
Great. I will try to find some time to implement this, but not now.
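Just to check that I read the examples above correctly, ins/del on a text
field could be as simple as this (a minimal illustration, not an agreed design):

class FieldPatch {
    // ins(field, 6, "Hello"): insert "Hello" at character offset 6
    static String ins(String field, int offset, String text) {
        return field.substring(0, offset) + text + field.substring(offset);
    }

    // del(field, 6, 5): delete 5 characters starting at offset 6
    static String del(String field, int offset, int length) {
        return field.substring(0, offset) + field.substring(offset + length);
    }
}
// set / setObjectProperty / insObjectProperty would do the same kind of thing
// on document metadata and object properties instead of the content string.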
--
Artem Melentyev