Hi devs,
While trying to fix http://jira.xwiki.org/browse/XWIKI-9753 I found 4
plausible solutions (other than switching to something better than
HTMLCleaner):
1. Provide a PdfHTMLCleaner implementation which extends
DefaultHTMLCleaner but doesn't omit unknown tags
2. Change the DefaultHTMLCleaner so that it doesn't omit unknown tags
3. Instead of relying on org.htmlcleaner.DefaultTagProvider, use
org.htmlcleaner.ConfigFileTagProvider with a custom tag configuration
that allows all HTML5+SVG+MathML, which is a good idea if we want to
switch to HTML5 in the future
4. Add another custom key for HTMLCleanerConfiguration for switching
this unknown tags setting on or off
Which do you prefer?
1 is quick to implement, but might introduce some other regressions in
the PDF export if we're allowing all unknown tags
2 is a potentially dangerous change, which could break other things
3 is a clean fix, but involves some work
4 is also quick to implement, but it means adding another custom setting
that limits the possible choices for the HTMLCleaner backend
I vote for 4 as a quick fix, with 3 as a long-term goal. WDYT?
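
To make 4 a bit more concrete, here is a rough sketch of how such a key
could be read from the configuration parameters. The key name
"omitUnknownTags" and the UnknownTagsSetting helper are made up for
illustration and are not existing XWiki API; I'm also assuming the
parameters are available as a plain Map<String, String>.

```java
import java.util.Map;

/**
 * Sketch of option 4 only. The key name and this helper class are
 * hypothetical; nothing here is existing XWiki API.
 */
final class UnknownTagsSetting {
    /** Hypothetical parameter key for the cleaner configuration map. */
    static final String OMIT_UNKNOWN_TAGS = "omitUnknownTags";

    private UnknownTagsSetting() {
    }

    /**
     * Returns whether unknown tags should be omitted. Only an explicit
     * "false" (case-insensitive) turns omission off; a missing key or any
     * other value keeps the current behaviour, which is to omit them.
     */
    static boolean omitUnknownTags(Map<String, String> parameters) {
        return !"false".equalsIgnoreCase(parameters.get(OMIT_UNKNOWN_TAGS));
    }
}
```

The result would then be passed to CleanerProperties.setOmitUnknownTags()
when the cleaner is set up, so only callers that explicitly ask for it
(the PDF export, for instance) would see a change in behaviour.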
--
Sergiu Dumitriu
http://purl.org/net/sergiu/