Hello,
My ODT (OpenOffice) file attachments are detected as "zip bomb" files by
Tika, and I think these files are not properly indexed by Solr...
Do you know a way to exclude ODT files from Tika's zip bomb check?
Thanks.
Here is my log for ODT files:
2016-11-14 11:33:04,474 [XWiki Solr index thread] ERROR .DocumentSolrMetadataExtractor - Failed to retrieve the content of attachment [Attachment xwiki:MyFile.odt]
org.apache.tika.exception.TikaException: Zip bomb detected!
at org.apache.tika.sax.SecureContentHandler.throwIfCauseOf(SecureContentHandler.java:192)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:123)
at org.apache.tika.Tika.parseToString(Tika.java:527)
...
at org.xwiki.search.solr.internal.DefaultSolrIndexer.run(DefaultSolrIndexer.java:377)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.sax.SecureContentHandler$SecureSAXException: Suspected zip bomb: 100 levels of XML element nesting
at org.apache.tika.sax.SecureContentHandler.startElement(SecureContentHandler.java:234)
...
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
... 11 common frames omitted
Pascal B