Hi,
Sorry for the delay, I've been busy with the Iaşi office setup and
other stuff. I read all the emails and gathered some observations that
I'll just paste here out of context.
"Different Syntaxes"
As a general observation, we should have support for loose document
metadata. We currently define all the properties inside the Java
class, like author, creationDate, contentDate, template, format...
For most of these I don't even know what they are for, or whether they
work. This way of defining metadata is bad, as it is hard to add a new
property, and once a property is added it will be there for all the
people to see, even if it was added for a client project with some
special requirements. We should provide a mechanism that allows adding
new metadata on the fly.
So, instead of adding a new XClass and adding XObjects to the
document, we can define a loose property "wikiSyntax".
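To make the idea concrete, here is a minimal sketch of what "loose" metadata could look like. It is written in Python only for brevity; the Document class, its method names and the "mediawiki" value are all made up for illustration and are not the existing XWiki API.

```python
class Document:
    """Hypothetical document: fixed core fields plus an open-ended metadata map."""

    def __init__(self, name, content=""):
        self.name = name
        self.content = content
        # Loose properties live here, so adding one needs no change to the class.
        self._metadata = {}

    def set_metadata(self, key, value):
        self._metadata[key] = value

    def get_metadata(self, key, default=None):
        # Documents that never set the property simply report the default.
        return self._metadata.get(key, default)


doc = Document("Main.WebHome", "1 Example")
doc.set_metadata("wikiSyntax", "mediawiki")  # illustrative value only
```

A property added for one client project is then visible only on the documents that actually set it.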
"Different Syntaxes => Two possible solutions:"
Why two exclusive options, instead of allowing both? You can set the default
(wiki-wide) syntax, the document syntax, and the segment syntax. If
we're talking about a farm, then there's also the farm-wide default
syntax that is used when creating a new wiki. I don't think we really
need this, but if it is not hard to implement, then we should do it.
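The lookup order I have in mind could be sketched like this (Python, just for illustration). The function name, the parameter names and the "xwiki" fallback are assumptions. Treating the farm default as a last-resort fallback at lookup time is a simplification, since it would really only be copied in when a new wiki is created.

```python
def resolve_syntax(segment=None, document=None, wiki=None, farm=None,
                   fallback="xwiki"):
    """Return the most specific syntax that is set.

    Order: segment, then document, then wiki-wide default, then farm default.
    The hard-coded fallback is a hypothetical last resort.
    """
    for candidate in (segment, document, wiki, farm):
        if candidate is not None:
            return candidate
    return fallback
```

For example, resolve_syntax(document="mediawiki", wiki="xwiki") picks the document-level setting, and a segment-level value would override both.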
"HTMLParser: parses HTML syntax"
For the WYSIWYG editor, we can use a small trick that will save us a
lot of time and effort.
What if the HTMLRenderer can work in a "debug" mode, in which it
leaves some debug markings that can be used for reverse engineering
the HTML? For example:
----------
1 Example
This is an *important* example of how a complicated thing can be done
in an *easy* manner:
* instead of trying to parse some generated HTML back into wiki
syntax, we ~~explicitly~~ tell it what the wiki source was?
** using some kind of debug markers
* this way, we'll have a simple means of obtaining the wiki code.
-----
Other examples: [link>Main.WebHome12]
http://www.autodetected.com
<b>not detected</b> {image:a.png}
----------
When rendering for viewing, this will turn into:
----------
<h1 id="HExample">Example</h1><span
class="sectionEditMarker">[edit]</span>
This is an <strong>important</strong> example of how a complicated
thing can be done in an <strong>easy</strong> manner:
<ul class="star"><li>instead of trying to parse some generated HTML
back into wiki syntax, we <em>explicitly</em> tell it what the wiki
source was?
<ul class="star"><li>using some kind of debug
markers</li></ul>
</li>
<li>this way, we'll have a simple means of obtaining the wiki code.</li>
</ul>
<hr/>
Other examples: <span class="createLink"><a
href="/xwiki/bin/view/Main/WebHome12">link</a></span> <a
href="http://www.autodetected.com">http://www.autodetected.com</a>
<b>not detected</b> <img src="..."/>
----------
When rendering for the WYSIWYG editor, this will turn into:
----------
<h1 id="HExample" xw:smarkup="1 ">Example</h1><span
class="sectionEditMarker" xw:skip="true">[edit]</span>
This is an <strong xw:smarkup="*"
xw:emarkup="*">important</strong>
example of how a complicated thing can be done in an <strong
xw:smarkup="*" xw:emarkup="*">easy</strong> manner:
<ul class="star" xw:ignore="true"><li xw:smarkup="* "
xw:emarkup="\n">instead of trying to parse some generated HTML back
into wiki syntax, we <em xw:smarkup="~~"
xw:emarkup="~~">explicitly</em> tell it what the wiki source was?
<ul class="star" xw:ignore="true"><li xw:smarkup=" ** "
xw:emarkup="\n\n">using some kind of debug markers</li></ul>
</li>
<li xw:smarkup="* " xw:emarkup="\n">this way, we'll have a simple
means of obtaining the wiki code.</li>
</ul>
<hr xw:source="-----"/>
Other examples: <span class="createLink" xw:ignore="true"><a
href="/xwiki/bin/view/Main/WebHome12" xw:smarkup="["
xw:emarkup=">Main.WebHome12]">link</a></span> <a
href="http://www.autodetected.com"
xw:smarkup="">http://www.autodetected.com</a> <b>not detected</b>
<img src="..." xw:source="{image:a.png}"/>
----------
So, whenever we see an xw:ignore, xw:skip, xw:source, xw:smarkup or
xw:emarkup attribute, we know there was wiki markup in there. The
HTMLParser would just look for these attributes, remove the xw:skip
tags (including the content), remove the xw:ignore tags (but not the
content), replace the start tag with the smarkup value, the end tag
with the emarkup value, or the entire element with the value of the
xw:source attribute. We use the same rendering engine, and we have a
simple back-parser that doesn't need any changes whenever we add or
modify something in the syntax.
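A rough sketch of such a back-parser, to show how little logic it needs. It uses Python's standard html.parser only as a stand-in for whatever parser we would really use; the class name BackParser, the helper html_to_wiki and the VOID_TAGS set are invented for the example. It assumes newlines are stored in the attributes as the two characters \n, as in the example above, and that unmarked tags are hand-written HTML to be kept as they are.

```python
from html.parser import HTMLParser

VOID_TAGS = {"hr", "img", "br"}  # elements that never get an end tag


def _decode(value):
    # Assumption: a newline is stored as backslash + "n" inside the attribute.
    return (value or "").replace("\\n", "\n")


class BackParser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []     # pieces of reconstructed wiki source
        self.stack = []   # text to emit when each open element closes
        self.muted = 0    # nesting depth inside an xw:skip / xw:source element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        void = tag in VOID_TAGS
        if self.muted:
            if not void:
                self.muted += 1
            return
        if "xw:source" in attrs:      # whole element replaced by its source
            self.out.append(_decode(attrs["xw:source"]))
            if not void:
                self.muted = 1
        elif "xw:skip" in attrs:      # drop the tag and its content
            if not void:
                self.muted = 1
        elif "xw:ignore" in attrs:    # drop the tag, keep its content
            if not void:
                self.stack.append("")
        elif "xw:smarkup" in attrs or "xw:emarkup" in attrs:
            self.out.append(_decode(attrs.get("xw:smarkup")))
            if not void:
                self.stack.append(_decode(attrs.get("xw:emarkup")))
        else:                         # hand-written HTML: pass through
            self.out.append(self.get_starttag_text())
            if not void:
                self.stack.append("</%s>" % tag)

    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)
        if tag not in VOID_TAGS:
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.muted:
            self.muted -= 1
        elif self.stack:
            self.out.append(self.stack.pop())

    def handle_data(self, data):
        if not self.muted:
            self.out.append(data)


def html_to_wiki(annotated_html):
    parser = BackParser()
    parser.feed(annotated_html)
    parser.close()
    return "".join(parser.out)
```

Note that nothing in it knows about headings, lists or links. All the syntax knowledge stays in the renderer, which is the point of the trick.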
We have to make sure that:
- markup that generates more than one HTML element must add xw:skip or
xw:ignore on the ones that are not needed for back-parsing
- we don't use xw:source on elements that can have other nested
elements (like the {table} macro)
- we properly escape all the quotes and newlines in the attributes
- we write these attributes when entering new stuff in the editor, and
update them as we change the document.
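For the escaping point, something along these lines might be enough on the renderer side. This is only a sketch in Python: encode_markup, decode_markup and annotated_tag are made-up names, and the backslash convention (a backslash is doubled, a newline becomes \n) is my assumption on top of the normal HTML attribute escaping.

```python
from html import escape, unescape


def encode_markup(markup):
    """Make wiki markup safe to store in a double-quoted HTML attribute."""
    # Double backslashes first, so that the "\n" written next is unambiguous.
    escaped = markup.replace("\\", "\\\\").replace("\n", "\\n")
    # Then escape &, <, > and both kinds of quotes for HTML.
    return escape(escaped, quote=True)


def decode_markup(value):
    """Inverse of encode_markup, for a raw (still HTML-escaped) attribute value."""
    raw = unescape(value)
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


def annotated_tag(tag, smarkup, emarkup, content):
    """Render one element with the debug attributes used by the back-parser."""
    return '<%s xw:smarkup="%s" xw:emarkup="%s">%s</%s>' % (
        tag, encode_markup(smarkup), encode_markup(emarkup),
        escape(content, quote=False), tag)
```

With this, annotated_tag("li", "* ", "\n", "item") gives a list item whose xw:emarkup holds the two characters \n, and decode_markup turns that back into a real newline.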
I know that these attribute names are not good, I'll think of better
ones when I'm less sleepy.
"Note: it would be better to capture them so that they are entered in
the "DOM". Is that possible?"
Yes, you can do almost anything in JavaScript, but the editor
will be much slower, since you'll have to listen to a lot of events,
and do a lot of processing. I think it would be better if we don't do
this.
"Now you have a good point, I don't see many more use cases :) and
thus I'd agree with you if we have another way of doing this
migration."
How about this use case:
A wiki dedicated to wiki engine comparison, like wikiconsultin, in
which fans of all the wiki engines could meet and debate. Now,
wouldn't it be nice to let the mediawiki fans write using the
mediawiki syntax, and the xwiki fans write using the xwiki syntax? And
all in the same wiki?
I'm +1 for the different syntaxes in the same wiki.
Does anyone know if WikiModel has syntax migration tools?
We can make one, if it doesn't have one. I remember that someone
(Stephane? Marius?) said something about wiki-to-wiki conversion,
though.
"HTMLParser: I think all parsers above need to support HTML since
the wiki syntaxes can be mixed with HTML"
I don't understand this. What does manually entered HTML have to do
with wiki parsing?
"One common storage syntax, multiple editing syntaxes"
I like this idea, but is it worth the effort? Wiki editing is my
favorite because it is fast. If we need to parse and render twice
for an edit, knowing that these are among the most time-consuming
steps, I think a high-load XWiki will be much slower.
"document requested for edition are available from the database in a
serialized format, for instance XHTML"
I'd say that XHTML is not the best choice. Moreover, I'd say that we
should store the document as it is, rather than in a "standard" markup.
"VelocityTextProcessor"
We'd have only this component for velocity processing? Do we need to
process velocity in another way? I can't think of any right now.
Sergiu