Hello,
I'm using the REST API to send some attachments to my XWiki as a multipart
POST (pages/xxx/attachments). Posting works fine, and the attachments appear
and can be downloaded correctly. However, decoding special characters such as
é in attachment names fails. Uploading attachments via the XWiki interface
works well.
For example:
"méthodologie" is UTF-8 URI-escaped as "m%C3%A9thodologie" and then used in
the POST request. The filename appears as "méthodologie" in XWiki, which
indicates that XWiki uses a different charset to do the decoding (%C3 => Ã,
%A9 => ©).
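
The suspected mismatch can be reproduced outside XWiki. The following is a minimal Python sketch using only the standard library; it is not XWiki code, and the variable names are illustrative. It shows that decoding the UTF-8 escapes as ISO-8859-1 gives exactly the garbled name described above.

```python
from urllib.parse import quote, unquote

name = "m\u00e9thodologie"  # "méthodologie", written with an escape to avoid source-encoding issues

# Percent-encode the UTF-8 bytes of the name, as done before the POST.
escaped = quote(name, encoding="utf-8")
print(escaped)  # m%C3%A9thodologie

# Decode the same escapes as ISO-8859-1: each byte becomes one character.
misdecoded = unquote(escaped, encoding="iso-8859-1")
print(misdecoded)  # méthodologie

# Decode as UTF-8: the two bytes combine back into a single "é".
roundtrip = unquote(escaped, encoding="utf-8")
print(roundtrip == name)  # True
```

This only demonstrates the charset arithmetic; it does not show where in the request handling the ISO-8859-1 decoding takes place.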
However, my XWiki is set up to use UTF-8. Forcing ISO-8859-1 encoding produces
"m%E9thodologie", and the attachment name is then decoded correctly. Is this
intended? How should I encode characters that cannot be represented in
ISO-8859-1?
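
To make the limitation of the workaround concrete, here is a small Python sketch (standard library only; `escape_latin1` is a hypothetical helper, not part of any XWiki client). It produces the ISO-8859-1 escape for "méthodologie" and shows that a character outside that charset, such as the euro sign, cannot be escaped this way at all.

```python
from urllib.parse import quote


def escape_latin1(name: str) -> str:
    """Percent-encode a name using its ISO-8859-1 bytes (hypothetical helper)."""
    return quote(name, encoding="iso-8859-1")


print(escape_latin1("m\u00e9thodologie"))  # m%E9thodologie

# The euro sign (U+20AC) has no ISO-8859-1 byte, so strict encoding fails.
try:
    escape_latin1("prix\u20ac")
    representable = True
except UnicodeEncodeError:
    representable = False
print(representable)  # False
```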
Also, the Tika parser throws an error on files that contain more than
100 000 characters (PDFs with a lot of images). Skipping images in PDFs would
help a lot. I couldn't find a way to increase the limit via the XWiki
configuration files.
Caused by: org.apache.tika.sax.TaggedSAXException: Your document contained
more than 100000 characters, and so your requested limit has been reached.
To receive the full text of the document, increase your limit. (Text up to
the limit is however available).
Any help is appreciated.