Re: [xwiki-devs] [xwiki-notifications] r11859 - in sandbox/xwiki-plugin-officeimporter/src: main/java/com/xpn/xwiki/plugin/officeconverter main/resources test/java/com/xpn/xwiki/plugin/officeconverter test/resources

18 Aug 2008

On Aug 18, 2008, at 4:37 AM, Wang Ning wrote:
...
  Hi Vincent,
 On Mon, Aug 18, 2008 at 2:59 AM, Vincent Massol &lt;vincent(a)massol.net&gt;
 wrote:
  Hi Wang,
 I have some questions/remarks on the commit below:
 1) It's hard to see what you're modifying from the default tag infos
 (in default.xml). I think a better solution might be to extend
 DefaultTagProvider and use the constructor accepting an info
 provider,
 see http://htmlcleaner.sourceforge.net/doc/org/htmlcleaner
 HtmlCleaner.html#HtmlCleaner(org.htmlcleaner.ITagInfoProvider)
 . In this manner we just extend the default and not duplicate it.
 WDYT? Is this possible? Are you adding tags or redefining default
 one?
  I have seen this DefaultTagProvider, but I don't think it can provide
 what I want.
 I want to replace some tags with others. 
public class MyProvider extends DefaultTagProvided
{
   public TagInfo getTagInfo(String tagName)
   {
     if (tagName.equals("s") {
      return new TagInfo(....);
     } else {
       return super.getTagInfo(tagName)
   }
   }
}
or something like that.
I guess it looks better if we have only a few changes to make. If we
have a lot then the file solution might be better. In any case make
sure you separate and document your changes from the default in
default.xml.
...
  Such like replace <p> in <li>
 to <span>, replace deprecated tags with others. Like replace <s> to
 <del>. But the default.xml can define this.
  2) The following code is not required:
  +    private void removeHead(TagNode node)
 +    {
 +        logger.debug("remove the head tag of the html");
 +        TagNode head = node.findElementByName("head", true);
 +        head.removeFromTree();
 +    } 
 instead you can just use         props.setOmitHtmlEnvelope(true);  I have tried
props.setOmitHtmlEnvelope(true), but there are some
 errors. If input is
 <html>
 <head/>
 <body>
 <p>p1</p>
 <p>p2</p>
 </body>
 </html>
 the output TagNode is not <p>p1</p><p>p2</p> but only
<p>p2</p>. So I
 have to use removeHead() to remove the head tag. 
htmlcleaner seems quite buggy... Even with setOmitHtmlEnvelope to
false it still generate something strange with an extra line between
the 2 paragraphs of your example. Do you think you could report an
issue in the issue tracker of the htmlcleaner project so that they can
fix it in the next release?
...
   3) Could you
explain why we need convertAllAttribute2Lowercase?
 
 In xhtml, all the attributes value should be lowercase. 
Do you have a link to the XHTML spec for this? I can only find
http://www.w3.org/TR/xhtml1/#h-4.2
  but this is different.
I have only found this http://lists.xml.org/archives/xml-dev/200109/msg00718.html
I'd like to know what we gain by transforming in lowercase. Since it's
extra effort we need to rationalize it.
...
  And some
 attributes value in the conversion result of openoffice is uppercase,
 like algin="CENTER". I think lowercase will be better. But if
 xhtmlparser could handler uppercase attributes value, I will remove
 this method.
  4) Could you refactor your code to use the new
HTMLCleaner code I
 have
 put in xwiki-xml module? 
 I will do it today. 
cool
Thanks
-Vincent
...


 Feel free to provide patches for CleaningFilters.  [snip]
 --
 Thanks
 Wang Ning
 _______________________________________________
 devs mailing list
 devs(a)xwiki.org
 http://lists.xwiki.org/mailman/listinfo/devs

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [xwiki-devs] [xwiki-notifications] r11859 - in sandbox/xwiki-plugin-officeimporter/src: main/java/com/xpn/xwiki/plugin/officeconverter main/resources test/java/com/xpn/xwiki/plugin/officeconverter test/resources