Hi Asiri,
On Feb 24, 2009, at 7:26 AM, Asiri Rathnayake wrote:
Hi Vincent,
This is a bug in the XHTML parser. It should generate an embedded
document. This is true for any block element inside a table cell.
However, in order to get simpler XWiki syntax, we could modify the
XWiki Syntax Renderer to remove the embedded doc in case it contains
only a paragraph.
I will raise a JIRA issue for this.
Now that you asked about it, I might have been working around a
possible bug in rendering. But these are what I saw as solutions:
1. Wrap the paragraph inside <div class="xwiki-document">: this
results in enlarged table header elements.
why?
I'm talking with respect to the original Word document. This is a
problem with the OO server's HTML generation: because it generates a
paragraph inside each table cell / table header item, the generated
HTML looks enlarged when rendered in a browser. Also, since we strip
those <style> tags, the content gets even more enlarged.
I was asking why having <div class="xwiki-document"> didn't work
nicely since this is the correct behavior. We should get:
<td><div class="xwiki-document"><p>whatever</p></div></td>
I don't understand why this would not be represented the same as in OO.
To work around this problem I chose to strip any isolated paragraph
elements found inside table cells / table header items.
2. Remove the paragraph if it's an isolated one (only one paragraph
inside the 'th' element). If there is more than one paragraph, or
there are other elements (like lists), then wrap the content within
the 'th' element inside a <div class="xwiki-document">.
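To make the second approach concrete, here is a minimal sketch in Python
using the standard xml.etree.ElementTree module. It is not the
officeimporter code: the function name normalize_cell and the BLOCK_TAGS
set are hypothetical, and the sketch assumes the cell has already been
cleaned into well-formed XHTML. Note that it drops any attributes on the
stripped paragraph.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of block-level tags that trigger wrapping.
BLOCK_TAGS = {"p", "div", "ul", "ol", "table", "pre", "blockquote"}


def normalize_cell(cell):
    """Apply the 'second approach' to one <td>/<th> element, in place."""
    children = list(cell)

    # Already wrapped in an embedded document: nothing to do.
    if (len(children) == 1 and children[0].tag == "div"
            and children[0].get("class") == "xwiki-document"):
        return

    # Text sitting directly in the cell, next to its child elements.
    has_loose_text = bool((cell.text or "").strip()) or any(
        (child.tail or "").strip() for child in children
    )

    # Isolated paragraph: replace it with its own content.
    if len(children) == 1 and children[0].tag == "p" and not has_loose_text:
        para = children[0]
        cell.remove(para)
        cell.text = para.text
        cell.extend(list(para))
        return

    # Several paragraphs or other block elements: wrap everything.
    if any(child.tag in BLOCK_TAGS for child in children):
        wrapper = ET.Element("div", {"class": "xwiki-document"})
        wrapper.text = cell.text
        wrapper.extend(children)
        for child in children:
            cell.remove(child)
        cell.text = None
        cell.append(wrapper)
```

For example, <th><p>Name</p></th> becomes <th>Name</th>, while a cell
holding a paragraph followed by a list gets a single wrapping
<div class="xwiki-document">.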
I've been using the second approach because it yielded the best
results so far... Now, have I been working around a bug which should
be fixed in rendering? :)
I think so. In addition, you haven't fixed the problem in the general
case, for example when someone chooses HTML 4.01 syntax in wiki pages.
Even if the problem was not in the parser/renderer, you should still
have put the fix in the default HTML cleaner and not in the office
cleaner IMO, since I don't see the relationship with office import.
I don't think this is correct. If the user chooses HTML 4.01 syntax,
he knows what he is doing and he expects table cells / table header
items to appear large if he puts a <p> inside a <td> item or <th> item.
This is not about large or not large (look and feel is handled by the
CSS only), and we need to normalize the HTML in exactly the same manner.
But the story is different for OO-generated HTML, which puts a
paragraph element where there shouldn't be one.
I don't agree, since it's perfectly valid to have <p> inside cells and
this is not an OO problem.
That is why I believed that this particular issue belongs to the
officeimporter module and not the HTML cleaner module.
I still think the HTML parser should generate the following events:
beginCell, beginDocument, beginPara, onWord, endPara, endDocument,
endCell.
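As an illustration of that event sequence, here is a small Python sketch
that walks one cell and records the events in order. The event names come
from the list above; everything else (the cell_events function, the
(name, payload) tuples) is hypothetical and is not the XWiki rendering
API. It only handles <p> children and ignores the space events a real
parser would emit between words.

```python
import xml.etree.ElementTree as ET


def cell_events(cell):
    """Return the event list for one <td>/<th> element (illustrative only)."""
    events = [("beginCell", None)]
    paragraphs = [child for child in cell if child.tag == "p"]

    if paragraphs:
        # Block content inside a cell goes into an embedded document.
        events.append(("beginDocument", None))
        for para in paragraphs:
            events.append(("beginPara", None))
            for word in "".join(para.itertext()).split():
                events.append(("onWord", word))
            events.append(("endPara", None))
        events.append(("endDocument", None))
    else:
        # Inline-only content: no embedded document is needed.
        for word in "".join(cell.itertext()).split():
            events.append(("onWord", word))

    events.append(("endCell", None))
    return events


if __name__ == "__main__":
    cell = ET.fromstring("<td><p>whatever</p></td>")
    for name, payload in cell_events(cell):
        print(name, payload or "")
```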
I also still think that, as an optimization, the Wiki Syntax Renderer
should remove the embedded doc in case there's a single para in the
embedded doc.
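A rough Python sketch of that optimization, working on a list of
(name, payload) event tuples. This is an assumption about what "remove
the embedded doc" would mean at the event level (the single paragraph's
inline content is emitted directly in the cell); the function names are
hypothetical and this is not the actual renderer code.

```python
def _matching_end(events, start):
    """Index of the endDocument that closes the beginDocument at `start`."""
    depth = 0
    for index in range(start, len(events)):
        name = events[index][0]
        if name == "beginDocument":
            depth += 1
        elif name == "endDocument":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("unbalanced beginDocument/endDocument events")


def _is_single_para(body):
    """True if the body is exactly one paragraph with inline content only."""
    begins = [event for event in body if event[0].startswith("begin")]
    return (
        len(body) >= 2
        and len(begins) == 1
        and body[0][0] == "beginPara"
        and body[-1][0] == "endPara"
    )


def drop_single_para_document(events):
    """Unwrap embedded documents that contain only a single paragraph."""
    out = []
    i = 0
    while i < len(events):
        if events[i][0] == "beginDocument":
            end = _matching_end(events, i)
            body = events[i + 1:end]
            if _is_single_para(body):
                # Keep only the paragraph's inline content.
                out.extend(body[1:-1])
            else:
                # Keep the document, but simplify any nested ones.
                out.append(events[i])
                out.extend(drop_single_para_document(body))
                out.append(events[end])
            i = end + 1
        else:
            out.append(events[i])
            i += 1
    return out
```

An embedded document with two paragraphs, or with a list, passes through
unchanged, so only the simple one-paragraph case gets the shorter wiki
syntax.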
Thanks
-Vincent
http://xwiki.com
http://xwiki.org
http://massol.net