RE: [xwiki-users] HTML DOM parser? (was: Xwiki.com API stability and Class/Object model)

12 Apr 2007

Pablo:
Yes, I found the JTidy package (as an alternative to another one in XWiki's
libs), but the HTML parser's constructor seemed to depend on some other
stuff that I don't think I can synthesize; I think there's yet another
one in the XWiki distribution, but I haven't looked at it yet (low
priority for now).
Of course, I can always use JavaScript, yuck (not that I dislike it per
se, but due to the hegemonic lusts of a certain corporation - the bane
of my professional existence, which I am ashamed to say is from my own
country - cross-browser compatibility rates just above un-anaesthetized
oral surgery on my personal list of preferences)...
I will investigate TagSoup, though; thanks.
brain[sic]
...
  -----Original Message-----
 From: Pablo Oliveira [mailto:pablo.oliveira@enst.fr]
 Sent: Thursday, April 12, 2007 9:07 AM
 To: xwiki-users(a)objectweb.org
 Subject: Re: [xwiki-users] Xwiki.com API stability and Class/Object model

 On Apr 06, THOMAS, BRIAN M (ATTSI) wrote:
         From: Sergiu Dumitriu [mailto:sergiu.dumitriu@gmail.com]
         Sent: Thursday, April 05, 2007 4:16 PM
         To: xwiki-users(a)objectweb.org
         Subject: Re: [xwiki-users] Xwiki.com API stability and Class/Object model

         On 4/4/07, THOMAS, BRIAN M (ATTSI) <bt0008(a)att.com> wrote:
                 The only reason I haven't already made a start of it is that I
                 haven't found an HTML DOM parser.  Is there one in the myriad of
                 libraries that come with XWiki?

         What do you mean by "HTML DOM parser"? You can use any DOM parser as
         long as it's well formed XML, and it should be.
         --
         http://purl.org/net/sergiu

  Unfortunately, it isn't:
  Nested exception: org.xml.sax.SAXParseException: The declaration for
  the entity "HTML.Version" must end with '>'.
  This exception is thrown regardless of which of the javadoc pages I
  use...

 Just my two cents:
 you might have a look at TagSoup
 (http://home.ccil.org/~cowan/XML/tagsoup/) or JTidy
 (http://jtidy.sourceforge.net/), which I think is already
 distributed as part of XWiki; those should help you when dealing
 with HTML that is not well-formed XML.
 Pablo
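
The SAXParseException quoted above is most likely what an XML parser reports when it reads an HTML 4 DTD: those DTDs put SGML-style "-- ... --" comments inside entity declarations, which XML does not allow. The sketch below tries to reproduce that situation using only the JDK's built-in DOM parser. The class name, the helper tryParse, and both sample strings are made up for illustration; they are not taken from the javadoc pages discussed in the thread, and the exact wording of the message may differ between parser versions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class HtmlDtdDemo {
    // Hypothetical sample: an entity declaration followed by an SGML-style
    // "-- ... --" comment before the closing '>', in the manner of the
    // HTML 4 DTDs. An XML parser expects '>' right after the quoted value.
    static final String SGML_STYLE =
        "<!DOCTYPE html [\n"
      + "<!ENTITY % HTML.Version \"-//W3C//DTD HTML 4.01 Transitional//EN\" -- note -->\n"
      + "]>\n"
      + "<html><body><p>hi</p></body></html>";

    // Hypothetical sample: the same markup without the DOCTYPE, well-formed XML.
    static final String WELL_FORMED = "<html><body><p>hi</p></body></html>";

    /** Returns null if the markup parsed into a DOM, else the parser's message. */
    static String tryParse(String markup) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("well-formed: " + tryParse(WELL_FORMED));
        System.out.println("SGML-style:  " + tryParse(SGML_STYLE));
    }
}
```

If this diagnosis is right, a tolerant parser such as TagSoup or JTidy sidesteps the problem because it does not treat the HTML DTD as an XML DTD at all.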
