Hi,
Some time ago, there was a discussion about how the document history could be stored in a better way.
Right now, the complete history is stored as one field in the xwikidoc table. From my PoV, this has some major disadvantages:
- loading an older version requires parsing all the history -> memory inefficiency
- as the documents grow older, loading a document will take a lot of time -> time inefficiency
- queries on archives cannot return just one version; they can only match the whole document (somewhere in the history, there was a version containing "search term")
The blocking issue with storing old versions in a different table was, at that time, the fact that a document archive should contain all the information needed to completely restore the document: content, metadata, objects.
I don't think that is actually an issue. We are archiving document versions, but we're joining all versions in one large string. Why don't we archive the complete version, but one version per row?
So, the archive table should look like:
- document name
- version number
- language (for translations)
- content
- archived metadata (one field, or the same fields as in xwikidoc)
- archived objects (one field)
- attachment names and versions
This is not the same as storing each version the way a normal document is stored, with separate objects and properties, but it at least provides a better storage and retrieval mechanism, and it separates the parts of a wiki document a bit: content, metadata, objects.
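To make the proposal a bit more concrete, here is a rough sketch in Java of what "one version per row" could look like. It is only an illustration: ArchivedVersion and ArchiveTable are hypothetical names, not existing XWiki classes, and the in-memory map merely stands in for a real table keyed by (document name, language, version). The metadata and objects are assumed to be serialized into one string field each, as in the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row of the proposed archive table: one archived version per row.
final class ArchivedVersion {
    final String documentName;
    final String version;
    final String language;          // "" is assumed to mean the default language
    final String content;
    final String metadata;          // archived metadata, serialized into one field
    final String objects;           // archived objects, serialized into one field
    final List<String> attachments; // attachment names and versions, e.g. "logo.png@1.2"

    ArchivedVersion(String documentName, String version, String language,
                    String content, String metadata, String objects,
                    List<String> attachments) {
        this.documentName = documentName;
        this.version = version;
        this.language = language;
        this.content = content;
        this.metadata = metadata;
        this.objects = objects;
        this.attachments = new ArrayList<>(attachments);
    }
}

// Hypothetical in-memory stand-in for the archive table.
final class ArchiveTable {
    // Key: (document name, language, version), mimicking a composite primary key.
    private final Map<List<String>, ArchivedVersion> rows = new LinkedHashMap<>();

    private static List<String> key(String documentName, String language, String version) {
        return Arrays.asList(documentName, language, version);
    }

    void insert(ArchivedVersion row) {
        rows.put(key(row.documentName, row.language, row.version), row);
    }

    // Loads exactly one version; no other part of the history is read or parsed.
    // Returns null when no such version was archived.
    ArchivedVersion load(String documentName, String language, String version) {
        return rows.get(key(documentName, language, version));
    }

    // Returns the individual versions whose content contains the term,
    // rather than a whole document that matched somewhere in its history.
    List<ArchivedVersion> search(String term) {
        List<ArchivedVersion> matches = new ArrayList<>();
        for (ArchivedVersion row : rows.values()) {
            if (row.content.contains(term)) {
                matches.add(row);
            }
        }
        return matches;
    }
}
```

With such a layout, load("Main.WebHome", "", "1.2") touches a single row, and search("search term") returns only the versions that actually contain the term. A real implementation would of course go through the database rather than a map.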
WDYT?
--
http://purl.org/net/sergiu