On 10/14/2010 12:53 PM, Paul Libbrecht wrote:
Caleb,
your analysis seems to match what I had observed very well.
Does it mean that such changes would also affect the XML serialization?
If we were to implement a "binary" database table, we would have the flexibility to decide whether we wish to keep using the XML format or not. The XML serializer, despite increasing the size by 30%, is already well streamlined for large content. The JRCS versioning store, on the other hand, is not prepared to handle large content, so with a binary database table we would have the option of donating a patch set to the JRCS people or choosing a different versioning system.
Caleb
paul
On 14 Oct 2010, at 15:24, Caleb James DeLisle wrote:
Hi,
I have some changes to the attachment system which will allow XWiki to handle much larger attachments without memory exhaustion. I have found that there are some places where I cannot make any changes because the code is not in XWiki but rather in JRCS.
XWiki versions attachments by creating a JRCS node for the XML version of each version of each attachment. This means that memory consumption improvements hit a hard wall at 2 * 2 * 1.3 * the size of the attachment: Base64 encoding for XML increases the size by 1.3 times, storage as a String (an array of 16-bit chars) doubles the size, and the need to copy the String doubles the size again.
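To make that wall concrete, a rough back-of-the-envelope figure: for a 500MB attachment that comes to about

    500 MB * 1.3 (Base64) * 2 (16-bit chars) * 2 (copy) = roughly 2.6 GB

of heap just to write a single version.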
The second issue is that the database and JDBC do not handle multiple hundreds of megabytes in a single query well. If I try to attach a 500MB attachment with attachment versioning disabled, my changes allow the attachment to be streamed to the database, but PostgreSQL is not able to save it. With 512MB of heap space I am able to attach a 256MB attachment, but the attachment cannot be loaded back from the database because JDBC lacks the necessary streaming functionality.
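For reference, the write side can be streamed with plain JDBC along these lines (only a sketch; the table and column names here are assumptions, not necessarily the real XWiki schema):

import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class StreamingWriteSketch
{
    /** Stream the attachment body into the content column without holding it all in memory. */
    public static void writeContent(Connection connection, long attachmentId,
        InputStream content, long length) throws SQLException
    {
        try (PreparedStatement ps = connection.prepareStatement(
            "UPDATE xwikiattachment_content SET xwa_content = ? WHERE xwa_id = ?")) {
            // setBinaryStream lets the driver pull the bytes as it sends them.
            ps.setBinaryStream(1, content, length);
            ps.setLong(2, attachmentId);
            ps.executeUpdate();
        }
    }
}

The read side is where this falls down: ResultSet.getBinaryStream() exists, but the driver may already have read the whole value into memory by the time it hands the stream back, so the heap ceiling remains.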
An option which I am now considering is adding a binary table to the database schema. The table would contain a composite id made of the id of the data and the part number of that entry, and a data column slightly smaller than 1MB (the default max_allowed_packet in MySQL). All interaction with this table would go through a storage engine which would accept InputStreams and OutputStreams; the storage mechanism would read and write the streams, tag them with an id, and break them up into parts to be sent to the database individually.
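To make that concrete, here is a rough sketch of what the write path could look like (the table name, column names, and chunk size below are only illustrative, not a final schema):

import java.io.IOException;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;

public class BinaryChunkStoreSketch
{
    // Hypothetical schema:
    //   CREATE TABLE binary_chunk (
    //       data_id     BIGINT     NOT NULL,
    //       part_number INTEGER    NOT NULL,
    //       content     MEDIUMBLOB NOT NULL,   -- each part kept just under 1MB
    //       PRIMARY KEY (data_id, part_number)
    //   );

    /** Kept slightly under MySQL's default max_allowed_packet of 1MB. */
    private static final int CHUNK_SIZE = 960 * 1024;

    /** Break the stream into numbered parts keyed by (dataId, partNumber), one INSERT each. */
    public static void write(Connection connection, long dataId, InputStream in)
        throws IOException, SQLException
    {
        String sql = "INSERT INTO binary_chunk (data_id, part_number, content) VALUES (?, ?, ?)";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            byte[] buffer = new byte[CHUNK_SIZE];
            int part = 0;
            int read;
            while ((read = fill(in, buffer)) > 0) {
                ps.setLong(1, dataId);
                ps.setInt(2, part++);
                ps.setBytes(3, read == buffer.length ? buffer : Arrays.copyOf(buffer, read));
                ps.executeUpdate();
            }
        }
    }

    /** Read until the buffer is full or the stream ends; return the number of bytes read. */
    private static int fill(InputStream in, byte[] buffer) throws IOException
    {
        int total = 0;
        int read;
        while (total < buffer.length
            && (read = in.read(buffer, total, buffer.length - total)) != -1) {
            total += read;
        }
        return total;
    }
}

Reading would be the mirror image: select the parts ordered by part_number and expose them as a single InputStream, so callers never hold more than one ~1MB chunk in memory at a time.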
WDYT?
Caleb
_______________________________________________
devs mailing list
devs@xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs