Hi,
I have some changes to the attachment system which will allow XWiki to handle much larger attachments without exhausting memory. I have found that there are some places where I cannot make any changes because the code is not in XWiki but rather in JRCS.
XWiki versions attachments by creating a JRCS node for the XML form of each version of each attachment. This means that memory consumption improvements hit a hard wall at roughly 2 * 2 * 1.3 times the size of the attachment: base-64 encoding for XML increases the size by about 1.3x, storage as a String (an array of 16-bit chars) doubles it, and the need to copy the String doubles it again.
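
To put numbers on that, here is a back-of-the-envelope in Java for a 500MB attachment (illustrative only, and it ignores every other allocation in the save path):

// Rough peak heap cost of holding one attachment version as base-64 XML
// in String form. The 4/3, x2, x2 factors are the base-64 overhead,
// the 16-bit chars, and the extra String copy described above.
public class VersioningCost
{
    public static void main(String[] args)
    {
        long attachment = 500L * 1024 * 1024;   // a 500MB attachment
        long base64     = attachment * 4 / 3;   // ~1.3x after base-64 encoding
        long asString   = base64 * 2;           // chars are 16 bits, so double
        long withCopy   = asString * 2;         // copying the String doubles it again
        System.out.println(withCopy / (1024 * 1024) + "MB");  // prints 2666MB
    }
}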
The second issue is that the database and JDBC do not handle hundreds of megabytes in a single query well. If I try to attach a 500MB attachment with attachment versioning disabled, my changes allow the attachment to be streamed to the database, but PostgreSQL is not able to save it. I am able to attach a 256MB attachment, but with 512MB of heap space that attachment cannot be loaded back from the database because JDBC lacks the necessary streaming functionality.
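
For reference, this is roughly the shape of the JDBC calls involved (the class and table names here are invented for illustration). The write side can be handed an InputStream, but on the read side whether getBinaryStream() actually streams or the driver buffers the entire value in memory before returning is up to the driver, and that buffering is where the heap runs out:

// Sketch only, not XWiki code.
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class AttachmentBlobIo
{
    /** Write side: hand the driver a stream instead of a byte[]. */
    public void save(Connection connection, long id, InputStream content, long length)
        throws SQLException
    {
        PreparedStatement statement = connection.prepareStatement(
            "INSERT INTO attachment_content (id, content) VALUES (?, ?)");
        statement.setLong(1, id);
        statement.setBinaryStream(2, content, length);
        statement.executeUpdate();
        statement.close();
    }

    /** Read side: ask for a stream back; the driver may still materialize it all. */
    public InputStream load(Connection connection, long id) throws SQLException
    {
        PreparedStatement statement = connection.prepareStatement(
            "SELECT content FROM attachment_content WHERE id = ?");
        statement.setLong(1, id);
        ResultSet results = statement.executeQuery();
        results.next();
        return results.getBinaryStream(1);
    }
}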
An option which I am now considering is adding a binary table to the database schema. The table would have a composite id made of the id of the data and the part number of that entry, plus a data column slightly smaller than 1MB (the default max_allowed_packet in MySQL). All interaction with this table would go through a storage engine that works with InputStreams and OutputStreams; the storage mechanism would tag each stream with an id, break it up into parts, and send each part to the database individually.
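
Roughly what I have in mind, as a sketch rather than a finished design (the names and exact column size are placeholders):

// Sketch only: a streaming store that splits content into <1MB parts.
// The table it would sit on top of, roughly:
//   CREATE TABLE xwikibinarycontent (
//     id      BIGINT  NOT NULL,   -- id of the data
//     part    INTEGER NOT NULL,   -- part number within that id
//     content BLOB    NOT NULL,   -- slightly under 1MB per row
//     PRIMARY KEY (id, part)
//   );
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public interface BinaryContentStore
{
    /** Read the stream, cut it into parts just under 1MB and send each part in its own query. */
    void save(long id, InputStream content) throws IOException;

    /** Load the parts for this id in part-number order and write them to the given stream. */
    void load(long id, OutputStream out) throws IOException;

    /** Remove every part belonging to this id. */
    void delete(long id) throws IOException;
}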
WDYT?
Caleb