Hi Ricardo,
On 10-11-19 19:37, Ricardo Rodriguez [eBioTIC.] wrote:
Hi Piotr,
Piotr Dziubecki wrote:
> Hi,
>
> Today I noticed that something bad has happened to some of the attachments in my
> XWiki. Here is a screenshot from one of the affected pages:
>
> http://i.imgur.com/p6Xs7.png
>
> As you can see, several attachments were uploaded, but only one is displayed in
> the attachment tab. The person who uploaded them says they were fine yesterday,
> but today they have somehow disappeared.
>
> Oddly, there is no trace of any operation on them after the upload.
>
> I'm using XWiki Enterprise 2.5.32127 with a MySQL database (server version 5.1.47).
>
> To add more context: over the last few days my users have started adding more
> attachments to their pages. The database dump is currently around 200 MB.
>
> I also looked at the logs and found several interesting fragments (all of the log
> snippets are from the time the problem was noticed):
>
> 2010-11-18 09:03:09,355 [http://apps.man.poznan.pl:28181/xwiki/bin/download/Documents/Proposals/2009…] ERROR web.XWikiAction - Connection aborted
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> 2010-11-18 13:23:53,118 [http://localhost:28181/xwiki/bin/view/Projects/Opinion+Mining] WARN xwiki.MyPersistentLoginManager - Login cookie validation hash mismatch! Cookies have been tampered with
> 2010-11-18 13:23:53,119 [http://localhost:28181/xwiki/bin/view/Projects/Opinion+Mining] WARN xwiki.MyPersistentLoginManager - Login cookie validation hash mismatch! Cookies have been tampered with
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> 2010-11-18 13:57:55,471 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content of attachment [2009BEinGRIDwow2greenCONTEXTREVIEW.PPT] for document [xwiki:Documents.Presentations]
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@72be25d1
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:138)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>     at org.apache.tika.Tika.parseToString(Tika.java:267)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:136)
>     at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
>     at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
>     at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Cannot remove block[ 4209 ]; out of range[ 0 - 3804 ]
>     at org.apache.poi.poifs.storage.BlockListImpl.remove(BlockListImpl.java:98)
>     at org.apache.poi.poifs.storage.RawDataBlockList.remove(RawDataBlockList.java:32)
>     at org.apache.poi.poifs.storage.BlockAllocationTableReader.<init>(BlockAllocationTableReader.java:99)
>     at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:164)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:74)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
>     ... 13 more
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 3999
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4006
> Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4006
> 2010-11-18 15:05:10,412 [http://apps.man.poznan.pl:28181/xwiki/bin/download/Documents/Presentations/…] ERROR web.XWikiAction - Connection aborted
>
>
>
> Unfortunately, today the situation repeated with another group of users, following
> the same scenario: after the attachments were uploaded and the page was edited a few
> times, the attachments were gone. Here is a snippet from the log for that period
> (there are many such warnings):
>
> 2010-11-19 10:43:37,199 [Lucene Index Updater] WARN util.PDFStreamEngine - java.io.IOException: Error: expected hex character and not :32
> java.io.IOException: Error: expected hex character and not :32
>     at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
>     at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
>     at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
>     at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
>     at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
>     at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
>     at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
>     at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:74)
>     at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
>     at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
>     at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
>     at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
>     at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
>     at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
>     at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
>     at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:79)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>     at org.apache.tika.Tika.parseToString(Tika.java:267)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:136)
>     at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
>     at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
>     at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
>     at java.lang.Thread.run(Thread.java:662)
>
>
> One more from another user:
>
> 2010-11-19 10:43:37,464 [Lucene Index Updater] WARN util.PDFStreamEngine - java.io.IOException: Error: expected hex character and not :32
> java.io.IOException: Error: expected hex character and not :32
>     at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
>     at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
>     at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
>     at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
>     at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
>     at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:61)
>     at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
>     at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:74)
>     at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
>     at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
>     at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
>     at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
>     at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
>     at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
>     at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
>     at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:79)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>     at org.apache.tika.Tika.parseToString(Tika.java:267)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:142)
>     at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
>     at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
>     at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
>     at java.lang.Thread.run(Thread.java:662)
> 2010-11-19 11:32:39,900 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content of attachment [2008BEinGRIDdesignconceptdiagramdoneinVisio.vsd] for document [xwiki:Documents.Diagrams]
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@54ad9fa4
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:134)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
>     at org.apache.tika.Tika.parseToString(Tika.java:267)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getContentAsText(AttachmentData.java:161)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.getFullText(AttachmentData.java:136)
>     at com.xpn.xwiki.plugin.lucene.IndexData.getFullText(IndexData.java:190)
>     at com.xpn.xwiki.plugin.lucene.IndexData.addDataToLuceneDocument(IndexData.java:146)
>     at com.xpn.xwiki.plugin.lucene.AttachmentData.addDataToLuceneDocument(AttachmentData.java:65)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.addToIndex(IndexUpdater.java:296)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.updateIndex(IndexUpdater.java:237)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runMainLoop(IndexUpdater.java:171)
>     at com.xpn.xwiki.plugin.lucene.IndexUpdater.runInternal(IndexUpdater.java:153)
>     at com.xpn.xwiki.util.AbstractXWikiRunnable.run(AbstractXWikiRunnable.java:99)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.IllegalArgumentException: Found a chunk with a negative length, which isn't allowed
>     at org.apache.poi.hdgf.chunks.ChunkFactory.createChunk(ChunkFactory.java:120)
>     at org.apache.poi.hdgf.streams.ChunkStream.findChunks(ChunkStream.java:59)
>     at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:93)
>     at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
>     at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
>     at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:95)
>     at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:52)
>     at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:49)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:127)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:132)
>     ... 13 more
>
>
> I'm counting on your help, since I don't know whether this is an XWiki issue or
> whether I misconfigured something.
>
> Regards,
> Piotr
> _______________________________________________
> users mailing list
> users(a)xwiki.org
> http://lists.xwiki.org/mailman/listinfo/users
>
>
>
I think you could be facing two kinds of problems: one related to
memory availability (the one causing attachments to "disappear") and
another related to Lucene and some incompatibilities with Microsoft
Office files.
Concerning the memory availability problem, please check
these two links:
http://www.xwiki.org/xwiki/bin/view/FAQ/Howtoincreasethemaximumattachmentsi…
http://www.xwiki.org/xwiki/bin/view/FAQ/HowToSolveAJavaHeapMemoryError
I've already done that. I'm storing 20 MB attachments
without any errors during upload.
I'm not sure whether these issues could lead to
corrupted attachments or only
to failures in the process, but I think they are worth taking into
account.
What scares me is that even when something goes wrong, there is no
visible warning and no transaction rollback.
The operation just stops halfway, which confuses users.
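
Since nothing is reported to the user, one way to look for a pattern is to tally the log entries per level and logger around the time of each incident. Below is a minimal Python sketch, assuming the log layout seen in the excerpts above (timestamp, bracketed thread name or URL, level, logger). The `summarize` helper and the regular expression are my own illustration, not part of XWiki, and the sample lines are shortened copies of entries quoted above.

```python
import re
from collections import Counter

# Start of a log entry as it appears in the excerpts above, e.g.
# "2010-11-18 13:57:55,471 [Lucene Index Updater] WARN lucene.AttachmentData - ..."
# (assumed layout; adjust the pattern if your log4j configuration differs)
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"  # timestamp
    r"\[(.*?)\]\s+"                                      # thread name or request URL
    r"(ERROR|WARN|INFO|DEBUG)\s+"                        # level
    r"(\S+)"                                             # logger
)

def summarize(lines):
    """Count log entries per (level, logger); stack-trace lines are skipped."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match:
            counts[(match.group(3), match.group(4))] += 1
    return counts

sample = [
    "2010-11-18 13:23:53,118 [http://localhost:28181/xwiki/bin/view/Projects/Opinion+Mining] "
    "WARN xwiki.MyPersistentLoginManager - Login cookie validation hash mismatch!",
    "2010-11-18 13:57:55,471 [Lucene Index Updater] WARN lucene.AttachmentData - error getting content",
    "org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from",
    "\tat org.apache.tika.Tika.parseToString(Tika.java:267)",
    "2010-11-19 10:43:37,199 [Lucene Index Updater] WARN util.PDFStreamEngine - java.io.IOException:",
]

if __name__ == "__main__":
    for (level, logger), n in summarize(sample).most_common():
        print(level, logger, n)
```

To run it against a real log, pass an open file object to `summarize` instead of `sample`. Note that this only counts what was logged; it cannot show whether an attachment was actually lost.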