Re: [xwiki-dev] [Proposal] Document history storage

2 Mar 2007

To clarify one misunderstanding: the attachments themselves are not stored,
just each attachment's name and version number. AFAIK, the attachment history
is stored separately.
I know that it is not very efficient to store the complete document, with
content and objects, even for a small change like an added comment.
But this is how it is done now too, and this change aims to do
better, not to be perfect.
On 3/1/07, Ludovic Dubost <ludovic(a)xwiki.com> wrote:
...

 Hi,
 I think it's a good idea to have a versions table. One thing I'm not
 sure of is whether this table should hold the master information or just
 be a cache for the information stored in the revision. If it is a cache, it
 need not hold all the info, just the most important parts.
 What I'm worried about is the volume of information when there are many
 changes. Suppose we get comment spam of 500 comments. The JRCS
 revision system will only add the actual spam. If you keep the
 archived info in the table system, you get 500 times the size of the
 document. And how will you export the whole document, including archives?
 Would you use RCS, or would you have the whole history inside an XML
 field inside the document?
 One downside of RCS is that you need to parse the whole RCS document to
 get a single version. But we could solve this by cutting the RCS file into
 chunks of 50 versions so that we get faster retrieval. It's true that
 this is a little painful to code.
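The chunking idea above can be sketched roughly as follows. This is only an illustration of the indexing, under assumptions: versions are treated as opaque items numbered sequentially from 1, and the RCS delta format is not modeled at all (a real implementation would need each chunk to be restorable on its own). The names `chunk_index`, `split_into_chunks`, and `load_version` are hypothetical.

```python
CHUNK_SIZE = 50  # chunk size suggested in the message


def chunk_index(version, chunk_size=CHUNK_SIZE):
    """Return the 0-based chunk that holds a 1-based version number."""
    if version < 1:
        raise ValueError("versions are numbered from 1")
    return (version - 1) // chunk_size


def split_into_chunks(versions, chunk_size=CHUNK_SIZE):
    """Split the full, ordered history into lists of at most chunk_size items."""
    return [versions[i:i + chunk_size]
            for i in range(0, len(versions), chunk_size)]


def load_version(chunks, version, chunk_size=CHUNK_SIZE):
    """Fetch one version by touching a single chunk, not the whole history."""
    chunk = chunks[chunk_index(version, chunk_size)]
    return chunk[(version - 1) % chunk_size]
```

With 120 versions this gives three chunks (50, 50, and 20 entries), and fetching version 51 only needs the second chunk.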
 The cache table with the most important metadata (version, date, author,
 comment) would give us what we need for getting information about
 contributors and number of contributions, and for retrieving comments at
 edit time.
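A minimal sketch of such a metadata cache, using Python's `sqlite3` module purely for illustration. The table name `version_cache`, the column names, and the `contributions` helper are assumptions made for this example, not XWiki's actual schema.

```python
import sqlite3


def make_cache(conn):
    """Create a hypothetical cache table holding only the key metadata."""
    conn.execute("""
        CREATE TABLE version_cache (
            doc_name     TEXT NOT NULL,
            version      TEXT NOT NULL,
            edit_date    TEXT NOT NULL,
            author       TEXT NOT NULL,
            edit_comment TEXT,
            PRIMARY KEY (doc_name, version)
        )""")


def contributions(conn, doc_name):
    """Return (author, number of versions) pairs for one document."""
    rows = conn.execute(
        "SELECT author, COUNT(*) FROM version_cache "
        "WHERE doc_name = ? "
        "GROUP BY author ORDER BY COUNT(*) DESC, author",
        (doc_name,))
    return rows.fetchall()
```

The point of the sketch is that the contributor query never touches the archived content, so it stays cheap however large the history grows.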
 Ludovic
 Sergiu Dumitriu wrote:
  Hi,
 Some time ago, there was a discussion about how the document
 history could be stored in a better way.
 Right now, the complete history is stored as one field in the xwikidoc
 table. From my PoV, this has some major disadvantages:
 - loading an older version requires parsing all the history -> memory
 inefficiency
 - as the documents grow older, loading a document will take a lot of
 time -> time inefficiency
 - queries on archives cannot return just one version, but they match
 the whole document (somewhere in the history, there was a version
 containing "search term")
 The blocking issue with storing old versions in a different table was,
 at that time, the fact that a document archive should contain all the
 information needed for completely restoring the document: content,
 metadata, and objects.
 I don't think that is actually an issue. We are archiving document
 versions, but we're joining all versions in one large string. Why
 don't we archive the complete version, but one version per row?
 So, the archive table should look like:
 - document name
 - version number
 - language (for translations)
 - content
 - archived metadata (one field, or the same fields as in xwikidoc)
 - archived objects (one field)
 - attachment names and versions
 This is not the same as storing the version the way a normal document is
 stored, with separate objects and properties, but at least it provides a
 better storage and retrieval mechanism, and it somewhat separates the
 parts of a wiki document: content, metadata, objects.
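The proposed one-row-per-version layout could look roughly like this. It is a sketch only, using Python's `sqlite3` module; the table name `doc_archive`, the column names, and both helper functions are hypothetical and chosen to mirror the field list above.

```python
import sqlite3

# One row per archived version; fields follow the list in the proposal.
SCHEMA = """
CREATE TABLE doc_archive (
    doc_name    TEXT NOT NULL,
    version     TEXT NOT NULL,
    language    TEXT NOT NULL DEFAULT '',  -- '' for the default language
    content     TEXT NOT NULL,
    metadata    TEXT,                      -- archived metadata, one field
    objects     TEXT,                      -- archived objects, one field
    attachments TEXT,                      -- attachment names and versions
    PRIMARY KEY (doc_name, version, language)
)
"""


def load_archived_version(conn, doc_name, version, language=""):
    """Load exactly one version without parsing the rest of the history."""
    return conn.execute(
        "SELECT content, metadata, objects, attachments FROM doc_archive "
        "WHERE doc_name = ? AND version = ? AND language = ?",
        (doc_name, version, language)).fetchone()


def versions_containing(conn, term):
    """Return the (doc_name, version) pairs whose content contains term."""
    return conn.execute(
        "SELECT doc_name, version FROM doc_archive "
        "WHERE instr(content, ?) > 0 ORDER BY doc_name, version",
        (term,)).fetchall()
```

Both disadvantages listed earlier are addressed in this sketch: loading an old version is a single-row lookup, and a search returns the individual matching versions rather than the whole document.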
 WDYT?

--
http://purl.org/net/sergiu
