Hi,
Today I noticed that something bad has happened to some of the attachments in my
XWiki. Here is a screenshot from one of the affected pages:
http://i.imgur.com/p6Xs7.png
As you can see, several attachments were uploaded, but only one is displayed in the
attachment tab.
The person who uploaded them says they were fine yesterday, but today they have
somehow disappeared.
Strangely, there is no trace of any operation on them after the upload.
I'm using XWiki Enterprise 2.5.32127 with a MySQL database (server version 5.1.47).
For more context: over the last few days my users have started adding more
attachments to their pages. A dump of the database is currently around 200 MB.
I also looked at the logs and found several interesting fragments (all of the log
snippets are from the time this was noticed):
2010-11-18 09:03:09,355 [http://apps.man.poznan.pl:28181/xwiki/bin/download/Documents/Proposals/2009…] ERROR web.XWikiAction - Connection aborted
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
2010-11-18 13:23:53,118 [http://localhost:28181/xwiki/bin/view/Projects/Opinion+Mining] WARN xwiki.MyPersistentLoginManager - Login cookie validation hash mismatch! Cookies have been tampered with
2010-11-18 13:23:53,119 [http://localhost:28181/xwiki/bin/view/Projects/Opinion+Mining] WARN xwiki.MyPersistentLoginManager - Login cookie validation hash mismatch! Cookies have been tampered with
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
2010-11-18 13:57:55,471 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content of attachment [2009BEinGRIDwow2greenCONTEXTREVIEW.PPT] for document [xwiki:Documents.Presentations]
org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@72be25d1
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:138)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
at org.apache.tika.Tika.parseToString(Tika.java:267)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:136)
at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Cannot remove block[ 4209 ]; out of range[ 0 - 3804 ]
at org.apache.poi.poifs.storage.BlockListImpl.remove(BlockListImpl.java:98)
at org.apache.poi.poifs.storage.RawDataBlockList.remove(RawDataBlockList.java:32)
at org.apache.poi.poifs.storage.BlockAllocationTableReader.<init>(BlockAllocationTableReader.java:99)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:164)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:74)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
... 13 more
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4006
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4006
2010-11-18 15:05:10,412 [http://apps.man.poznan.pl:28181/xwiki/bin/download/Documents/Presentations/…] ERROR web.XWikiAction - Connection aborted
Unfortunately, today the situation repeated itself with another group of users. The
scenario was the same: after the attachments were submitted and the page was edited a
few times, they were gone. Here is a snippet from the log from that period (there are
a lot of these warnings):
2010-11-19 10:43:37,199 [Lucene Index Updater] WARN util.PDFStreamEngine - java.io.IOException: Error: expected hex character and not :32
java.io.IOException: Error: expected hex character and not :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:74)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:79)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
at org.apache.tika.Tika.parseToString(Tika.java:267)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:136)
at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
at java.lang.Thread.run(Thread.java:662)
One more from another user:
2010-11-19 10:43:37,464 [Lucene Index Updater] WARN util.PDFStreamEngine - java.io.IOException: Error: expected hex character and not :32
java.io.IOException: Error: expected hex character and not :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:61)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:74)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:79)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
at org.apache.tika.Tika.parseToString(Tika.java:267)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:142)
at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
at java.lang.Thread.run(Thread.java:662)
2010-11-19 11:32:39,900 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content of attachment [2008BEinGRIDdesignconceptdiagramdoneinVisio.vsd] for document [xwiki:Documents.Diagrams]
org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@54ad9fa4
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:134)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
at org.apache.tika.Tika.parseToString(Tika.java:267)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:136)
at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: Found a chunk with a negative length, which isn't allowed
at org.apache.poi.hdgf.chunks.ChunkFactory.createChunk(ChunkFactory.java:120)
at org.apache.poi.hdgf.streams.ChunkStream.findChunks(ChunkStream.java:59)
at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:93)
at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:95)
at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:52)
at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:49)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:127)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
... 13 more
I'm counting on your help, since I don't know whether this is an XWiki issue or
whether I misconfigured something.
Regards,
Piotr
_______________________________________________
users mailing list
users(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/users
I think you could be facing two kinds of problems: one related to
memory availability (the one causing attachments to "disappear") and
another related to Lucene and some incompatibilities with
Microsoft Office files.
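To tell the two problems apart, it may help to tally the log entries by level and
logger, so that the Lucene/Tika parser warnings can be set aside from anything that
touches attachment storage. The following is only a minimal sketch, not an XWiki
tool: it assumes each entry sits on one line in the form
"<timestamp> [<thread>] <LEVEL> <logger> - <message>" (as the snippets above suggest
once unwrapped), and the names LOG_LINE, tally and sample are made up for
illustration.

```python
import re
from collections import Counter

# Assumed entry layout: "<timestamp> [<thread>] <LEVEL> <logger> - <message>"
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+"
    r"\[(?P<thread>.*?)\]\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+-\s+"
    r"(?P<message>.*)$"
)


def tally(lines):
    """Count log entries per (level, logger); stack-trace lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match.group("level"), match.group("logger"))] += 1
    return counts


# Sample entries adapted from the log snippets quoted in this thread.
sample = [
    "2010-11-18 13:23:53,118 [http://localhost:28181/xwiki/bin/view/Projects/Opinion+Mining] "
    "WARN xwiki.MyPersistentLoginManager - Login cookie validation hash mismatch!",
    "2010-11-18 13:57:55,471 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content",
    "at org.apache.tika.Tika.parseToString(Tika.java:267)",
    "2010-11-19 10:43:37,199 [Lucene Index Updater] WARN util.PDFStreamEngine - java.io.IOException:",
    "2010-11-19 11:32:39,900 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content",
]

counts = tally(sample)
for (level, logger), n in counts.most_common():
    print(level, logger, n)
```

On a real log you would pass the open file object instead of the sample list. Entries
that come from the "Lucene Index Updater" thread concern text extraction for search
indexing, so a tally like this mainly shows whether anything else is being logged
around the time the attachments vanish.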
Concerning the memory availability problem, please check
these two links: