[xwiki-users] HTML DOM parser? (was: Xwiki.com API stability and Class/Object model)
THOMAS, BRIAN M (ATTSI)
bt0008 at att.com
Thu Apr 12 19:03:53 CEST 2007
Pablo:
Yes, I found the JTidy package (as an alternative to another parser in
XWiki's libs), but the HTML parser's constructor seemed to depend on some
other stuff that I don't think I can synthesize. I think there's yet
another one in the XWiki distribution, but I haven't looked at it yet
(low priority for now).
Of course, I can always use JavaScript, yuck. Not that I dislike it per
se, but thanks to the hegemonic lusts of a certain corporation (the bane
of my professional existence, which I am ashamed to say is from my own
country), cross-browser compatibility rates just above un-anaesthetized
oral surgery on my personal list of preferences.
I will investigate TagSoup, though; thanks.
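For anyone following the thread, the underlying problem can be shown with
nothing but the JDK. The sketch below is only an illustration: the class
name StrictDomCheck and both sample strings are made up for this example
and are not taken from XWiki or from the javadoc pages in question. It
does not reproduce the exact "HTML.Version" error quoted below (which
presumably comes from the XML parser trying to read an SGML-style HTML
DTD); it shows the simpler case of an unclosed tag. The point is the same
either way: a stock XML DOM parser accepts only well-formed markup, so
ordinary HTML needs a tolerant parser such as TagSoup or JTidy in front
of it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
public class StrictDomCheck {

    /** Returns true if the JDK's XML DOM parser accepts the markup. */
    static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc =
                builder.parse(new InputSource(new StringReader(markup)));
            return doc.getDocumentElement() != null;
        } catch (SAXException e) {
            // Not well-formed XML: the strict parser gives up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up samples: the first closes every tag, the second does not.
        String wellFormed = "<html><body><p>Hello<br/></p></body></html>";
        String tagSoup = "<html><body><p>Hello<br></body></html>";
        System.out.println(parsesAsXml(wellFormed));
        System.out.println(parsesAsXml(tagSoup));
    }
}
```

A tolerant parser would be expected to repair the second sample into a
tree rather than reject it, which is the gap TagSoup and JTidy are meant
to fill.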
brain[sic]
> -----Original Message-----
> From: Pablo Oliveira [mailto:pablo.oliveira at enst.fr]
> Sent: Thursday, April 12, 2007 9:07 AM
> To: xwiki-users at objectweb.org
> Subject: Re: [xwiki-users] Xwiki.com API stability and
> Class/Object model
>
> On Apr 06, THOMAS, BRIAN M (ATTSI) wrote :
>
> > From: Sergiu Dumitriu [mailto:sergiu.dumitriu at gmail.com]
> > Sent: Thursday, April 05, 2007 4:16 PM
> > To: xwiki-users at objectweb.org
> > Subject: Re: [xwiki-users] Xwiki.com API stability and Class/Object
> > model
> >
> > On 4/4/07, THOMAS, BRIAN M (ATTSI) <bt0008 at att.com> wrote:
> >
> >
> > The only reason I haven't already made a start of it is that I
> > haven't found an HTML DOM parser. Is there one in the myriad of
> > libraries that come with XWiki?
> >
> > What do you mean by "HTML DOM parser"? You can use any DOM parser as
> > long as it's well formed XML, and it should be.
> >
> >
> > --
> > http://purl.org/net/sergiu
> >
> >
> > Unfortunately, it isn't:
> >
> > Nested exception: org.xml.sax.SAXParseException: The declaration for
> > the entity "HTML.Version" must end with '>'.
> >
> > This exception is thrown regardless of which of the javadoc pages I
> > use...
>
> Just my two cents:
> you might have a look at TagSoup
> (http://home.ccil.org/~cowan/XML/tagsoup/) or JTidy
> (http://jtidy.sourceforge.net/) which I think is distributed
> already as part of XWiki, those should help you when dealing
> with non xml-valid HTML.
>
> Pablo
>
>