[xwiki-devs] [Discussion] Designing the new Rendering/Parsing component/API


Vincent Massol

11 Sep 2007

1:46 p.m.

Hi,

I've started working on designing the new Rendering/Parsing components and API for XWiki. The implementation will be based on WikiModel but we need some XWiki wrapping interfaces around it. Note that this is a prerequisite for the new WYSIWYG editor based on GWT (see http://www.xwiki.org/xwiki/bin/view/Design/NewWysiwygEditorBasedOnGwt).

I've updated http://www.xwiki.org/xwiki/bin/view/Design/WikiModelIntegration with the information below, which I'm pasting here so that we can have a discussion about it. I'll consolidate the results on that wiki page.

Componentize the Parsing/Rendering APIs
==================================

We need 4 main components:

* A Scripting component to manage scripting inside XWiki documents and to evaluate them.
* A Rendering component to manage rendering wiki syntax into HTML and other formats (PDF, RTF, etc)
* A Wiki Parser component to offer a typed interface to XWiki content so that it can be manipulated
* An HTML Parser component (for the WYSIWYG editor)

Different Syntaxes
===============

Two possible solutions:

1. Have a WikiSyntax Object (a simple class with one property: a combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki, Confluence, JSPWiki, etc) that users can attach to pages to tell the Renderers what syntax is used. If no such object is attached then it'll default to XWiki's default syntax (XWiki Legacy or Creole for example).
2. Have some special syntax, independent of the wiki syntaxes, to tell the Renderer that a given block of content should be rendered with a given syntax. Again there would be a default.

XWiki Interfaces
=============

* ScriptingEngineManager: Manages the different Scripting Engines, calling them in turn.
* ScriptingEngine
  o Method: evaluate(String content)
  o Implementation: VelocityScriptingEngine
  o Implementation: GroovyScriptingEngine
* RenderingEngineManager: Manages the different Rendering Engines, calling them in turn.
* RenderingEngine
  o Method: render(String content)
  o Implementation: XWikiLegacyRenderingEngine (current rendering engine)
  o Implementation: WikiModelRenderingEngine
* Parser: content parsing
  o HTMLParser: parses HTML syntax
  o WikiParser: parses wiki syntax
  o Implementation: WikiModelHTMLParser
  o Implementation: WikiModelWikiParser

Open Questions:

* Does WikiModel support a generic syntax for macros?
* Is the Rendering also in charge of generating PDF, RTF, XML, etc?
  o I think so, need to modify interfaces above to reflect this.
* The WikiParser needs to recognize scripts since this is needed for the WYSIWYG editor.

Use cases
========

* View page
  o ViewAction -- template -> ScriptingEngineManager.evaluate() -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML, PDF, RTF, etc
* Edit page in WYSIWYG editor
  o Uses the WikiParser to create a "DOM" of the page content and to render it accordingly. NOTE: This is required since rendering in the WYSIWYG editor is different from the final rendering. For example, macros need to be shown in a special way to make them visible, etc.
  o Changes done by the user are entered in HTML. Note: it would be better to capture them so that they are entered in the "DOM". Is that possible? If not, then the HTMLParser is used to convert from HTML to wiki syntax, but there is likely to be some loss in the conversion. The advantage is the ability to take any HTML content and generate wiki syntax from it.

This is my very early thinking but I wanted to make it visible to give everyone the chance to 1) know what's happening and 2) suggest ideas.

I'll refine this in the coming days and post again on this thread.

Thanks
-Vincent
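
For readers who want to see the shape of the proposal, the "XWiki Interfaces" list and the "View page" use case could be sketched in Java roughly as follows. Only the names ScriptingEngine, ScriptingEngineManager, RenderingEngine, RenderingEngineManager, evaluate() and render() come from the message; the register() method, the ViewPageSketch class and the two lambda engines are assumptions made up for illustration, and none of this is actual XWiki code.

```java
import java.util.ArrayList;
import java.util.List;

/** Evaluates the scripts embedded in some content and returns the resulting text. */
interface ScriptingEngine {
    String evaluate(String content);
}

/** Renders wiki syntax into an output format such as HTML. */
interface RenderingEngine {
    String render(String content);
}

/** Calls every registered scripting engine in turn, chaining their outputs. */
final class ScriptingEngineManager {
    private final List<ScriptingEngine> engines = new ArrayList<>();

    // register() is an assumption; the proposal does not say how engines are added.
    void register(ScriptingEngine engine) {
        engines.add(engine);
    }

    String evaluate(String content) {
        String result = content;
        for (ScriptingEngine engine : engines) {
            result = engine.evaluate(result);
        }
        return result;
    }
}

/** Calls every registered rendering engine in turn, chaining their outputs. */
final class RenderingEngineManager {
    private final List<RenderingEngine> engines = new ArrayList<>();

    void register(RenderingEngine engine) {
        engines.add(engine);
    }

    String render(String content) {
        String result = content;
        for (RenderingEngine engine : engines) {
            result = engine.render(result);
        }
        return result;
    }
}

/** Hypothetical "view page" flow: evaluate scripts first, then render. */
final class ViewPageSketch {
    static String view(String template) {
        ScriptingEngineManager scripting = new ScriptingEngineManager();
        // Stand-in for VelocityScriptingEngine: substitutes one hard-coded variable.
        scripting.register(content -> content.replace("$user", "Vincent"));

        RenderingEngineManager rendering = new RenderingEngineManager();
        // Stand-in for a real rendering engine: wraps the text in a paragraph.
        rendering.register(content -> "<p>" + content + "</p>");

        return rendering.render(scripting.evaluate(template));
    }
}
```

Keeping both engine types as single-method interfaces means a manager can chain any number of them without knowing which scripting language or wiki syntax each one handles.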


Jean-Vincent Drean

11 Sep

6:36 p.m.

...

Two possible solutions:

1. Have a WikiSyntax Object (a simple class with one property: a combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki, Confluence, JSPWiki, etc) that users can attach to pages to tell the Renderers what syntax is used. If no such object is attached then it'll default to XWiki's default syntax (XWiki Legacy or Creole for example).
2. Have some special syntax, independent of the wiki syntaxes, to tell the Renderer that a given block of content should be rendered with a given syntax. Again there would be a default.

3) A new "syntax" document meta ? WDYT ?

Vincent Massol

7:08 p.m.

On Sep 11, 2007, at 6:36 PM, Jean-Vincent Drean wrote:

...

3) A new "syntax" document meta ? WDYT ?

Yes this is like solution 1 in that you cannot mix syntaxes within a given document whereas solution 2 allows that. Not sure which is best. Thanks -Vincent

Jean-Vincent Drean

7:22 p.m.

...

Yes this is like solution 1 in that you cannot mix syntaxes within a given document whereas solution 2 allows that.

I don't see benefits from being able to mix different syntaxes within a document so I'd give +1 to solution 1 (with a meta). In fact I'm not sure I see the advantages of a syntax choice at the document level (I'd naturally have put it at the wiki level). Sorry if it has already been discussed on the list.

Vincent Massol

8:57 p.m.

On Sep 11, 2007, at 7:22 PM, Jean-Vincent Drean wrote:

...

Yes this is like solution 1 in that you cannot mix syntaxes within a given document whereas solution 2 allows that.

It hasn't been discussed AFAIR... so this is the right place and time to discuss it. I can think of one use case:

* I have an existing wiki (say xwiki.org) which is using the old wiki syntax (the current one) and I want to migrate it to the new syntax (say it's Creole for example). It would be nice if I could mix the 2 syntaxes during this migration.

Now you have a good point, I don't see many more use cases :) and thus I'd agree with you if we have another way of doing this migration.

Does anyone know if WikiModel has syntax migration tools?

Thanks
-Vincent

Erin Schnabel

14 Sep

4:16 p.m.

On 9/11/07, Jean-Vincent Drean <jv(a)xwiki.com> wrote:

...

Yes this is like solution 1 in that you cannot mix syntaxes within a given document whereas solution 2 allows that.

While it would seemingly be confusing, one reason I can think of is migration... pile of old documents with old syntax, want new documents to use new syntax... (what a headache that would be.. but if you have gobs of docs... ) -- 'Waste of a good apple' -Samwise Gamgee

Fabio Mancinelli

11 Sep

10:43 p.m.

On Sep 11, 2007, at 1:46 PM, Vincent Massol wrote:

...

1. Have a WikiSyntax Object (a simple class with one property: a combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki, Confluence, JSPWiki, etc) that users can attach to pages to tell the Renderers what syntax is used. If no such object is attached then it'll default to XWiki's default syntax (XWiki Legacy or Creole for example).
2. Have some special syntax, independent of the wiki syntaxes, to tell the Renderer that a given block of content should be rendered with a given syntax. Again there would be a default.

Regarding multiple syntaxes... Is it really a requirement to support them? I think that adding this feature could lead to confusion. For example, imagine an XWiki-Pedia as sketched in 1. A contributor could potentially need to edit 3 documents written in 3 different syntaxes. Solution 2. is even worse because it pushes this heterogeneity down to the document level. An alternative solution could be to have a model for the page that is "presented" in different syntaxes when the page is edited. In this way a document can be edited in whatever syntax the user feels comfortable with. However this raises several issues... A model that is the LCM of all the supported syntaxes is fine, but what happens when an element in the model is requested to be presented in a syntax that doesn't support it? The best option, probably, is to keep it simple by having and supporting only a single syntax for editing pages, with some "import" plugins that can translate a previously existing content to that syntax when a new page is created. Sorry if this was an already discussed topic or if I missed some relevant detail :) Cheers, Fabio

Vincent Massol

12 Sep

8:52 a.m.

On Sep 11, 2007, at 10:43 PM, Fabio Mancinelli wrote:

...

On Sep 11, 2007, at 1:46 PM, Vincent Massol wrote:

I agree with all you said.

...

The best option, probably, is to keep it simple by having and supporting only a single syntax for editing pages, with some "import" plugins that can translate a previously existing content to that syntax when a new page is created.

I think I prefer the option suggested by JV, i.e. have a configuration option at the level of the wiki to decide what syntax a wiki is using. Imports are required too but since WikiModel supports several syntaxes why not offer this feature to users. That would help a lot for people who come from an existing wiki and want to switch to xwiki. Of course, in my initial email there's an open question about WikiModel's syntax of macros across syntaxes that we need to resolve. Thanks -Vincent

...

Sorry if this was an already discussed topic or if I missed some relevant detail :) Cheers, Fabio

Jean-Vincent Drean

11:04 a.m.

...

The wiki pages importer will need to know which syntax it is importing from, so we still have 2 choices:

1) Have the syntax set at the wiki level and exported at the XAR level (package.xml)
2) Have a default syntax set at the wiki level and a mandatory meta in the documents (set at creation, not assignable from the interface):
   - we could write a new edit panel which would appear when currentDocument is not using the default syntax; this panel would allow switching the document to the current syntax.
   - it allows changing the default syntax (wiki level) without an automatic wiki-wide syntax transformation.

WDYT?
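
To make choice 2 concrete, here is a minimal Java sketch of how a wiki-level default could combine with a per-document "syntax" meta. Every name in it (Syntax, WikiDocument, SyntaxResolver, shouldOfferConversion) is hypothetical and chosen only for illustration; documents without a meta are assumed to fall back to the wiki default.

```java
import java.util.Optional;

/** Syntaxes mentioned in the thread; the enum itself is a hypothetical type. */
enum Syntax { XWIKI_LEGACY, CREOLE, MEDIAWIKI, CONFLUENCE, JSPWIKI }

/** Hypothetical document carrying an optional "syntax" meta set at creation time. */
final class WikiDocument {
    private final Syntax syntaxMeta; // null when the document has no meta

    WikiDocument(Syntax syntaxMeta) {
        this.syntaxMeta = syntaxMeta;
    }

    Optional<Syntax> getSyntax() {
        return Optional.ofNullable(syntaxMeta);
    }
}

/** Resolves the syntax to use for a document, given the wiki-level default. */
final class SyntaxResolver {
    private final Syntax wikiDefault;

    SyntaxResolver(Syntax wikiDefault) {
        this.wikiDefault = wikiDefault;
    }

    /** The document meta wins; otherwise fall back to the wiki default. */
    Syntax resolve(WikiDocument document) {
        return document.getSyntax().orElse(wikiDefault);
    }

    /** True when the edit panel offering a switch to the default syntax would appear. */
    boolean shouldOfferConversion(WikiDocument document) {
        return resolve(document) != wikiDefault;
    }
}
```

With this shape, changing the wiki default does not touch existing documents: each one keeps rendering with its own meta, and the conversion panel simply starts appearing for documents whose meta no longer matches.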

Stéphane Laurière

13 Sep

6:37 p.m.


Hi Vincent, hi everyone, We discussed the WikiModel integration with Mikhail this afternoon. Our input is below. Vincent Massol wrote:

...

On the topic of scripting we would like to propose a distinction between scripts that act on text and scripts that act on the DOM. Typically, the text rendering flow would be the following, for say "text1":

text1 =TextProcessor=> text2 =Parser=> dom1 =DomProcessor=> dom2 => ...

- the scripts contained in text1 are processed in the context of user1; this results in a new text: text2
- the parser parses text2 and converts it to a DOM tree, dom1
- dom1 is processed by scripts that work directly on the DOM (example: table of contents generator); this results in dom2
- dom2 is made available as such or is converted to XML, HTML, PDF etc. depending on the user request

TextProcessor and DomProcessor would have the following interfaces:

TextProcessor
- String execute(String content)

DomProcessor
- DOM execute(DOM content)

That means we should have a syntax to distinguish between scripts that generate text content and scripts that manipulate the DOM.
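
The flow above can be sketched in Java as follows. TextProcessor, DomProcessor and their execute() methods follow the message; the Dom class is a deliberately tiny stand-in (a list of block strings) because the real DOM representation was still pending, and Parser.parse(), Pipeline and the three lambdas are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the pivot DOM: just an ordered list of block strings. */
final class Dom {
    final List<String> blocks = new ArrayList<>();
}

/** Takes text and produces text (for example by running embedded scripts). */
interface TextProcessor {
    String execute(String content);
}

/** Takes a DOM and produces a DOM (for example by adding a table of contents). */
interface DomProcessor {
    Dom execute(Dom content);
}

/** Takes text and produces a DOM; the method name parse() is an assumption. */
interface Parser {
    Dom parse(String text);
}

final class Pipeline {
    /** text1 =TextProcessor=> text2 =Parser=> dom1 =DomProcessor=> dom2 */
    static Dom run(String text1, TextProcessor textProcessor, Parser parser, DomProcessor domProcessor) {
        String text2 = textProcessor.execute(text1);
        Dom dom1 = parser.parse(text2);
        return domProcessor.execute(dom1);
    }

    static Dom demo() {
        // Toy text processor: substitutes one hard-coded variable.
        TextProcessor scripts = text -> text.replace("$name", "XWiki");
        // Toy parser: every line becomes one block.
        Parser lineParser = text -> {
            Dom dom = new Dom();
            for (String line : text.split("\n")) {
                dom.blocks.add(line);
            }
            return dom;
        };
        // Toy DOM processor: prepends a summary block, leaving the input DOM untouched.
        DomProcessor toc = dom -> {
            Dom out = new Dom();
            out.blocks.add("TOC: " + dom.blocks.size() + " blocks");
            out.blocks.addAll(dom.blocks);
            return out;
        };
        return run("1 Title\nHello $name", scripts, lineParser, toc);
    }
}
```

The point of the split is visible in the types: a text-level script can only ever see and return a string, while something like a table of contents generator works on the parsed structure and never has to re-parse text.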

...

* A Rendering component to manage rendering wiki syntax into HTML and other formats (PDF, RTF, etc)
* A Wiki Parser component to offer a typed interface to XWiki content so that it can be manipulated
* An HTML Parser component (for the WYSIWYG editor)

Different Syntaxes
===============

Two possible solutions:

1. Have a WikiSyntax Object (a simple class with one property: a combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki, Confluence, JSPWiki, etc) that users can attach to pages to tell the Renderers what syntax is used. If no such object is attached then it'll default to XWiki's default syntax (XWiki Legacy or Creole for example).
2. Have some special syntax, independent of the wiki syntaxes, to tell the Renderer that a given block of content should be rendered with a given syntax. Again there would be a default.

Here's our view regarding the syntax used in wiki edit mode: documents requested for editing are available from the database in a serialized format, for instance XHTML. When entering the edit action, the user indicates his preferred syntax. If the text of the requested document contains some blocks that are not handled by the chosen syntax, the user gets a warning (example: the document contains a table as a list item, and the user tries to edit the document using JSPWiki syntax). If not, WikiModel converts the serialized format into a DOM, the user edits the DOM and the WikiModel serializer serializes it back when the user saves it. Note that the DOM representation of wiki documents in the latest version of WikiModel is still pending.

...

XWiki Interfaces ============= * ScriptingEngineManager: Manages the different Scripting Engines, calling them in turn. * ScriptingEngine o Method: evaluate(String content) o Implementation: VelocityScriptingEngine o Implementation: GroovyScriptingEngine * RenderingEngineManager: Manages the different Rendering Engines, calling them in turn. * RenderingEngine o Method: render(String content) o Implementation: XWikiLegacyRenderingEngine (current rendering engine) o Implementation: WikiModelRenderingEngine * Parser: content parsing o HTMLParser: parses HTML syntax o WikiParser: parses wiki syntax o Implementation: WikiModelHTMLParser o Implementation: WikiModelWikiParser Open Questions: * Does WikiModel support a generic syntax for macros?

WikiModel generates events for blocks that are not to be parsed (typically because they contain scripts).

For example, in the WikiModel syntax currently called "CommonSyntax", this looks like the following:

==============
{{{macro:mymacro (String parameters)
dothis
dothat

}}}

$mymacro(parameters)
==============

For each syntax, macro blocks are identified as far as possible (we still have to check that this is indeed the case for all types of macro blocks).
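
As a rough illustration of what "identifying macro blocks" means for the example above, the following Java sketch pulls the name, parameters and body out of a {{{macro:...}}} block with a regular expression. This is not WikiModel's API or its event model; MacroBlock, MacroBlockScanner and the pattern are assumptions written only against the single example shown in the message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value object for one unparsed macro definition block. */
final class MacroBlock {
    final String name;
    final String parameters;
    final String body;

    MacroBlock(String name, String parameters, String body) {
        this.name = name;
        this.parameters = parameters;
        this.body = body;
    }
}

final class MacroBlockScanner {
    // {{{macro:NAME (PARAMS) newline BODY }}} ; DOTALL lets the body span several lines.
    private static final Pattern MACRO = Pattern.compile(
            "\\{\\{\\{macro:(\\w+)\\s*\\(([^)]*)\\)[ \\t]*\\n(.*?)\\}\\}\\}",
            Pattern.DOTALL);

    /** Returns every macro block found in the text, in document order. */
    static List<MacroBlock> scan(String text) {
        List<MacroBlock> blocks = new ArrayList<>();
        Matcher matcher = MACRO.matcher(text);
        while (matcher.find()) {
            blocks.add(new MacroBlock(
                    matcher.group(1),
                    matcher.group(2).trim(),
                    matcher.group(3).trim()));
        }
        return blocks;
    }
}
```

A real parser would report such a block as an opaque unit so that its body is never interpreted as wiki markup; the regex only shows which parts of the block carry the information.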

...

* Is the Rendering also in charge of generating PDF, RTF, XML, etc?
  o I think so, need to modify interfaces above to reflect this.
* The WikiParser needs to recognize scripts since this is needed for the WYSIWYG editor.

the WikiModel parser recognizes scripts indeed. Mikhail and Stéphane

...

Use cases
========

* View page
  o ViewAction -- template -> ScriptingEngineManager.evaluate() -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML, PDF, RTF, etc
* Edit page in WYSIWYG editor
  o Uses the WikiParser to create a "DOM" of the page content and to render it accordingly. NOTE: This is required since rendering in the WYSIWYG editor is different from the final rendering. For example, macros need to be shown in a special way to make them visible, etc.
  o Changes done by the user are entered in HTML. Note: it would be better to capture them so that they are entered in the "DOM". Is that possible? If not, then the HTMLParser is used to convert from HTML to wiki syntax, but there is likely to be some loss in the conversion. The advantage is the ability to take any HTML content and generate wiki syntax from it.

This is my very early thinking but I wanted to make it visible to give everyone the chance to 1) know what's happening and 2) suggest ideas.

I'll refine this in the coming days and post again on this thread.

Thanks
-Vincent

_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs

-- Stéphane Laurière slauriere(a)xwiki.com XWiki http://www.xwiki.com http://concerto.xwiki.com http://nepomuk.semanticdesktop.org

Vincent Massol

14 Sep

10:59 a.m.

+1 to all that. So let me summarize and rephrase to see if I have understood :)

1) We have 4 types of objects:
* TextProcessors: take text and generate text
* Parsers: take text and generate an internal DOM format (pivot format)
* DomProcessors: take DOM and generate DOM
* Renderers: take DOM and generate anything (text, PDF, RTF, HTML, XML, etc)

2) Document contents are stored in the database in textual format in the main xwiki syntax (whatever we decide it is - we could standardize on Creole for example)

3) Use case 1: Viewing a document
a) Get the doc from the DB --> text1 (xwiki text format)
b) Apply TextProcessors --> text2
c) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
d) Apply DomProcessors --> DOM2
e) Call the required Renderer --> PDF, XML, HTML, RTF, text, etc

4) Use case 2: Editing a document, assuming the user wants to use the MediaWiki syntax for editing
a) Get the doc from the DB --> text1 (xwiki text format)
b) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
c) Call MediaWikiRenderer --> text2 (text in MediaWiki format)
d) the user edits and hits save
e) MediaWikiParser --> DOM2 (transforms MediaWiki text syntax into the internal DOM)
f) Call XWikiRenderer --> text3 (transforms DOM into xwiki textual format)
g) Save text3 in the database

5) In practice this means the following classes:
* TextProcessorManager: to chain several text processors
* TextProcessor
  - VelocityTextProcessor
  - GroovyTextProcessor
* WikiParser: Takes wiki syntax and generates a DOM in an XWiki-specific format (independent of the different wiki syntaxes).
  - LegacyXWikiWikiParser
  - XWikiWikiParser (or simply use CreoleWikiParser if we want our internal format to be Creole)
  - ConfluenceWikiParser
  - MediaWikiWikiParser
  - JSPWikiWikiParser
  - CreoleWikiParser
  - HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML. So this HTMLParser is probably a parent of the other parsers in some regard. Anyway we need this one for the WYSIWYG editor which may need to transform HTML to wiki syntax (so we may need an XWikiDomProcessor too to transform into XWiki syntax). The alternative (much better) is to have the WYSIWYG editor only use the internal XWiki-specific DOM format for all its manipulations.
* DomProcessorManager: to chain several DOM processors
* DomProcessor
  - Don't know yet what we're going to use this for. TOCDomProcessor as you say above maybe.
* Renderer
  - XMLRenderer
  - HTMLRenderer
  - PDFRenderer
  - RTFRenderer
  - XWikiRenderer (or simply use CreoleRenderer if we want our internal format to be Creole)
  - ConfluenceRenderer
  - MediaWikiRenderer
  - JSPWikiRenderer
  - CreoleRenderer

WDYT? Do I have it right? :)

Thanks
-Vincent

On Sep 13, 2007, at 6:37 PM, Stéphane Laurière wrote:

...

Hi Vincent, hi everyone, We discussed the WikiModel integration with Mikhail this afternoon. Here is below our input. Vincent Massol wrote:

the WikiModel parser recognizes scripts indeed.

Mikhail and Stéphane

> Use cases
> ========
>
> * View page
>   o ViewAction -- template -> ScriptingEngineManager.evaluate() -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML, PDF, RTF, etc
> * Edit page in WYSIWYG editor
>   o Uses the WikiParser to create a "DOM" of the page content and to render it accordingly. NOTE: This is required since rendering in the WYSIWYG editor is different from the final rendering. For example, macros need to be shown in a special way to make them visible, etc.
>   o Changes done by the user are entered in HTML. Note: it would be better to capture them so that they are entered in the "DOM". Is that possible? If not, then the HTMLParser is used to convert from HTML to wiki syntax, but there is likely to be some loss in the conversion. The advantage is the ability to take any HTML content and generate wiki syntax from it.
>
> This is my very early thinking but I wanted to make it visible to give everyone the chance to 1) know what's happening and 2) suggest ideas.
>
> I'll refine this in the coming days and post again on this thread.
>
> Thanks
> -Vincent
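
The editing round trip from Vincent's summary (use case 2, steps a to g) can be illustrated with a toy Java sketch. It assumes a pivot DOM reduced to a list of plain or bold text runs and two "syntaxes" that differ only in their bold delimiter (*bold* for the stored syntax, '''bold''' for the MediaWiki-style editing syntax). Run, ToySyntax and EditRoundTrip are hypothetical names; real parsers and renderers would of course cover far more than bold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical pivot node: a run of text that is either plain or bold. */
final class Run {
    final String text;
    final boolean bold;

    Run(String text, boolean bold) {
        this.text = text;
        this.bold = bold;
    }
}

/** Toy syntax that knows a single construct: bold text wrapped in a delimiter. */
final class ToySyntax {
    private final String boldDelimiter;

    ToySyntax(String boldDelimiter) {
        this.boldDelimiter = boldDelimiter;
    }

    /** Parser role: text in this syntax -> pivot DOM. */
    List<Run> parse(String text) {
        List<Run> runs = new ArrayList<>();
        String[] parts = text.split(Pattern.quote(boldDelimiter), -1);
        for (int i = 0; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                // Odd-numbered segments sit between two delimiters, so they are bold.
                runs.add(new Run(parts[i], i % 2 == 1));
            }
        }
        return runs;
    }

    /** Renderer role: pivot DOM -> text in this syntax. */
    String render(List<Run> dom) {
        StringBuilder out = new StringBuilder();
        for (Run run : dom) {
            out.append(run.bold ? boldDelimiter + run.text + boldDelimiter : run.text);
        }
        return out.toString();
    }
}

final class EditRoundTrip {
    static final ToySyntax XWIKI = new ToySyntax("*");
    static final ToySyntax MEDIAWIKI = new ToySyntax("'''");

    /** Steps b) and c): stored text -> DOM -> text in the editing syntax. */
    static String toEditingSyntax(String stored) {
        return MEDIAWIKI.render(XWIKI.parse(stored));
    }

    /** Steps e) and f): edited text -> DOM -> text in the storage syntax. */
    static String toStorageSyntax(String edited) {
        return XWIKI.render(MEDIAWIKI.parse(edited));
    }
}
```

Because every syntax only talks to the pivot DOM, adding a new editing syntax means writing one parser and one renderer rather than one converter per pair of syntaxes.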

Erin Schnabel

4:52 p.m.

I should never ever reply to documents until I get to the end of the note chain. My apologies for being redundant.

This is getting complicated. My view (because of what I know our documents are loaded with):

A) Velocity script just intermixed with text has to be supported.
B) The wiki formatting we have now (loosely based on Radeox) has to work, with {code} and {pre} tags actually working (which is fixed in 1.1, I understand; I've yet to move up).

It's my view that having all this macro stuff (between groovy, velocity, the macro defs, etc) jammed into the XWikiPreferences doc is inappropriate. That doc in general is too big. I think, for example, that there should be a separate doc for "ad" configuration that you only use if ads are enabled, etc. That's a separate discussion, I think. ;)

But anyway, remember consumability with this: if we're mucking around with the syntax that's supported, the macros that are supported, etc., those should be specified in a separate prefs doc.

On 9/14/07, Vincent Massol <vincent(a)massol.net> wrote:

...

[snip....]

...

> WikiModel generates events for blocks that are not to be parsed
> (typically because they contain scripts).
>
> For example, in the WikiModel syntax currently called "CommonSyntax",
> this looks like the following:
> ==============
> {{{macro:mymacro (String parameters)
> dothis
> dothat
>
> }}}
>
>
> $mymacro(parameters)
> ==============
>
> For each syntax, macro blocks are identified as far as possible (we
> still have to check it's the case for all types of macro blocks
> indeed).

-- 'Waste of a good apple' -Samwise Gamgee

Sergiu Dumitriu

18 Sep

2:32 a.m.

Hi,

Sorry for the delay, I've been busy with the Iaşi office setup and other stuff. I read all the emails and gathered some observations that I'll just paste here out of context.

"Different Syntaxes"

As a general observation, we should have support for loose document metadata. We currently define all the properties inside the java class, like author, creationDate, contentDate, template, format... Most of these I don't even know what they are for, and if they work or not. This way of defining metadata is bad, as it is hard to add a new property, and once a property is added it will be there for all the people to see, even if it was added for a client project with some special requirements. We should provide a mechanism that allows adding new metadata on the fly. So, instead of adding a new XClass and adding XObjects to the document, we can define a loose property "wikiSyntax".

"Different Syntaxes => Two possible solutions:"

Why two exclusive options, and not allow both? You can set the default (wiki-wide) syntax, the document syntax, and the segment syntax. If we're talking about a farm, then there's also the farm-wide default syntax that is used when creating a new wiki. I don't think we really need this, but if it is not hard to implement, then we should do it.

"HTMLParser: parses HTML syntax"

For the WYSIWYG editor, we can use a small trick that will save us a lot of time and effort. What if the HTMLRenderer can work in a "debug" mode, in which it leaves some debug markings that can be used for reverse engineering the HTML? For example:

----------
1 Example
This is an *important* example of how a complicated thing can be done in an *easy* manner:
* instead of trying to parse some generated HTML back into wiki syntax, we ~~explicitly~~ tell it what the wiki source was?
** using some kind of debug markers
* this way, we'll have a simple means of obtaining the wiki code.
-----
Other examples:
[link>Main.WebHome12]
http://www.autodetected.com
<b>not detected</b>
{image:a.png}
----------

When rendering for viewing, this will turn into:

----------
<h1 id="HExample">Example</h1><span class="sectionEditMarker">[edit]</span>
This is an <strong>important</strong> example of how a complicated thing can be done in an <strong>easy</strong> manner:
<ul class="star"><li>instead of trying to parse some generated HTML back into wiki syntax, we <em>explicitly</em> tell it what the wiki source was?
<ul class="star"><li>using some kind of debug markers</li></ul></li>
<li>this way, we'll have a simple means of obtaining the wiki code.</li>
</ul>
<hr/>
Other examples:
<span class="createLink"><a href="/xwiki/bin/view/Main/WebHome12">link</a></span>
<a href="http://www.autodetected.com">http://www.autodetected.com</a>
<b>not detected</b>
<img src="..."/>
----------

When rendering for the WYSIWYG editor, this will turn into:

----------
<h1 id="HExample" xw:smarkup="1 ">Example</h1><span class="sectionEditMarker" xw:skip="true">[edit]</span>
This is an <strong xw:smarkup="*" xw:emarkup="*">important</strong> example of how a complicated thing can be done in an <strong xw:smarkup="*" xw:emarkup="*">easy</strong> manner:
<ul class="star" xw:ignore="true"><li xw:smarkup="* " xw:emarkup="\n">instead of trying to parse some generated HTML back into wiki syntax, we <em xw:smarkup="~~" xw:emarkup="~~">explicitly</em> tell it what the wiki source was?
<ul class="star" xw:ignore="true"><li xw:smarkup=" ** " xw:emarkup="\n\n">using some kind of debug markers</li></ul></li>
<li xw:smarkup="* " xw:emarkup="\n">this way, we'll have a simple means of obtaining the wiki code.</li>
</ul>
<hr xw:source="-----"/>
Other examples:
<span class="createLink" xw:ignore="true"><a href="/xwiki/bin/view/Main/WebHome12" xw:smarkup="[" xw:emarkup=">Main.WebHome12]">link</a></span>
<a href="http://www.autodetected.com" xw:smarkup="">http://www.autodetected.com</a>
<b>not detected</b>
<img src="..." xw:source="{image:a.png}"/>
----------

So, whenever we see an xw:ignore, xw:skip, xw:source, xw:smarkup or xw:emarkup attribute, we know there was a wiki markup in there. The HTMLParser would just look for these attributes, remove the xw:skip tags (including the content), remove the xw:ignore tags (but not the content), replace the start tag with the smarkup value, the end tag with the emarkup value, or the entire content with the xw:source attribute. We use the same rendering engine, and we have a simple back-parser that doesn't need any changes whenever we add or modify something in the syntax.

We have to make sure that:
- markup that generates more than one html element must add xw:skip or xw:ignore on the ones that are not needed for back-parsing
- we don't use xw:source on elements that can have other nested elements (like the {table} macro)
- we properly escape all the quotes and newlines in the attributes
- we write these attributes when entering new stuff in the editor, and update them as we change the document.

I know that these attribute names are not good; I'll think of better ones when I'm less sleepy.

"Note: it would be better to capture them so that they are entered in the "DOM". Is that possible?"

Yes, you can do almost anything in Javascript, just that the editor will be much slower, since you'll have to listen to a lot of events, and do a lot of processing. I think it would be better if we don't do this.

"Now you have a good point, I don't see many more use cases :) and thus I'd agree with you if we have another way of doing this migration."

How about this use case: a wiki dedicated to wiki engine comparison, like wikiconsultin, in which fans of all the wiki engines could meet and debate. Now, wouldn't it be nice to let the mediawiki fans write using the mediawiki syntax, and the xwiki fans write using the xwiki syntax? And all in the same wiki? I'm +1 for the different syntaxes in the same wiki.

"Does anyone know if WikiModel has syntax migration tools?"

We can make one, if it doesn't have one. I remember that someone (Stephane? Marius?) said something about wiki-to-wiki conversion, though.

"HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML"

I don't understand this. What does manually entered HTML have to do with wiki parsing?

"One common storage syntax, multiple editing syntaxes"

I like this idea, but is it worth the effort? Wiki editing is my favorite because it is fast. If we need to parse and render twice for an edit, knowing that these are among the most time-consuming steps, I think a high-load XWiki will be much slower.

"document requested for edition are available from the database in a serialized format, for instance XHTML"

I'd say that XHTML is not the best choice. Moreover, I'd say that we should store the document as it is, and not using a "standard" markup.

"VelocityTextProcessor"

We'd have only this component for velocity processing? Do we need to process velocity in another way? I can't think of any right now.

Sergiu

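The back-parsing rules Sergiu describes for the xw:skip, xw:ignore, xw:source, xw:smarkup and xw:emarkup attributes can be sketched in Java with the JDK's DOM parser. This is only an illustration under several assumptions: the annotated output is well-formed XHTML, a newline inside an attribute is escaped as the two characters \n, and elements without any xw: attribute are raw HTML to be kept (their attributes are dropped here for brevity). BackParser and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class BackParser {
    /** Converts annotated XHTML back to wiki text by following the xw: attributes. */
    static String toWiki(String annotatedXhtml) {
        // Wrap the fragment in a root element so that it parses as one document.
        String xml = "<root>" + annotatedXhtml + "</root>";
        try {
            // The default factory is not namespace-aware, so "xw:smarkup" is a plain attribute name.
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            StringBuilder out = new StringBuilder();
            appendChildren(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    private static void appendChildren(Node parent, StringBuilder out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            append(children.item(i), out);
        }
    }

    private static void append(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments and other node types are ignored in this sketch
        }
        Element element = (Element) node;
        if (element.hasAttribute("xw:skip")) {
            return; // drop the tag and its content
        }
        if (element.hasAttribute("xw:source")) {
            out.append(unescape(element.getAttribute("xw:source"))); // replace the whole element
            return;
        }
        if (element.hasAttribute("xw:ignore")) {
            appendChildren(element, out); // drop the tag, keep the content
            return;
        }
        if (element.hasAttribute("xw:smarkup") || element.hasAttribute("xw:emarkup")) {
            out.append(unescape(element.getAttribute("xw:smarkup")));
            appendChildren(element, out);
            out.append(unescape(element.getAttribute("xw:emarkup")));
            return;
        }
        // No xw: attribute: treat as HTML typed by the user and keep the tag itself.
        out.append('<').append(element.getTagName()).append('>');
        appendChildren(element, out);
        out.append("</").append(element.getTagName()).append('>');
    }

    /** Turns the escaped two-character sequence \n back into a real newline. */
    private static String unescape(String value) {
        return value.replace("\\n", "\n");
    }

    static String demo() {
        String html = "<h1 xw:smarkup=\"1 \" xw:emarkup=\"\\n\">Example</h1>"
                + "<span xw:skip=\"true\">[edit]</span>"
                + "<ul xw:ignore=\"true\"><li xw:smarkup=\"* \" xw:emarkup=\"\\n\">an "
                + "<strong xw:smarkup=\"*\" xw:emarkup=\"*\">easy</strong> item</li></ul>"
                + "<hr xw:source=\"-----\"/>";
        return toWiki(html);
    }
}
```

As the message points out, the appeal of this approach is that the back-parser never needs to know the wiki syntax: new markup only requires the renderer to emit the right attributes.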
Vincent Massol

19 Sep

1:19 p.m.

Hi Sergiu, On Sep 18, 2007, at 2:32 AM, Sergiu Dumitriu wrote: [snip]

...

"Different Syntaxes" As a general observation, we should have support for loose document metadata. We currently define all the properties inside the java class, like author, creationDate, contentDate, template, format... Most of these I don't even know what they are for, and if they work or not. This way of defining metadata is bad, as it is hard to add a new property, and once a property is added it will be there for all the people to see, even if it was added for a client project with some special requirements. We should provide a mechanism that allows adding new metadata on the fly. So, instead of adding a new XClass and adding XObjects to the document, we can define a loose property "wikiSyntax".

Looks fine to me.

...

"Different Syntaxes => Two possible solutions:" Why two exclusive options, and not allow both? You can set the default (wiki-wide) syntax, the document syntax, and the segment syntax. If we're talking about a farm, then there's also the farm-wide default syntax that is used when creating a new wiki. I don't think we really need this, but if it is not hard to implement, then we should do it.

Right. Note that this is not even an issue if we agree about presenting the doc content in the user's syntax of choice. See below.

...

"HTMLParser: parses HTML syntax" For the WYSIWYG editor, we can use a small trick that will save us a lot of time and effort. What if the HTMLRenderer can work in a "debug" mode, in which it leaves some debug markings that can be used for reverse engineering the HTML? For example:

[snip] That's a good idea. It would still be nice if the WYSIWYG editor was using our DOM tree as its source instead of HTML or alternatively to enter data typed into both the HTML DOM tree and XWiki's DOM tree. But I'm unsure how feasible this is.

...

"Note: it would be better to capture them so that they are entered in the "DOM". Is that possible?" Yes, you can do almost anything in Javascript, just that the editor will be much slower, since you'll have to listen to a lot of events, and do a lot of processing. I think it would be better if we don't do this.

Maybe this would need to be verified in terms of speed.

...

"Now you have a good point, I don't see many more use cases :) and thus I'd agree with you if we have another way of doing this migration." How about this use case: A wiki dedicated to wiki engine comparison, like wikiconsultin, in which fans of all the wiki engines could meet and debate. Now, wouldn't it be nice to let the mediawiki fans write using the mediawiki syntax, and the xwiki fans write using the xwiki syntax? And all in the same wiki?

This is a very special use case (same as is the migration one).

...

I'm +1 for the different syntaxes in the same wiki.

Again this is moot if we agree about presenting the content in the user's preferred choice. I'm just worried about the complexity inherent to that solution but if WikiModel solves most of it, why not.

...

Does anyone know if WikiModel has syntax migration tools? We can make one, if it doesn't have one. I remember that someone (Stephane? Marius?) said something about wiki-to-wiki conversion, though.

Mikhail, Stephane, any input on this?

...

"HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML" I don't understand this. What does manually entered HTML have to do with wiki parsing?

Because users can introduce HTML in wiki content.

...

"One common storage syntax, multiple editing syntaxes" I like this idea, but is it worth the effort? Wiki editing is my favorite because it is fast. If we need to parse and render twice for an edit, knowing that these are among the most time-consuming steps, I think a high-load XWiki will be much slower.

Yes maybe. It would still be nice to get some performance figures before deciding. Mikhail/Stephane, could you tell us the current performance of WikiModel for:
* wiki syntax to DOM
* DOM to wiki syntax (say for an XWiki renderer for example)

In the following:

...

a) Get the doc from the DB --> text1 (xwiki text format)
b) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
c) Call MediaWikiRenderer --> text2 (text in MediaWiki format)
d) the user edits and hits save
e) MediaWikiParser --> DOM2 (transforms MediaWiki text syntax into the internal DOM)
f) Call XWikiRenderer --> text3 (transforms DOM into xwiki textual format)
g) Save text3 in the database

I have the feeling that b), c), e) and f) are going to be insignificant compared to a) and g) but I may be wrong. Some figures would be nice.
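The edit cycle in steps a) to g) can be sketched as below. This is only an illustration under stated assumptions: `Block`, `WikiParser`, `WikiRenderer`, `LineSyntax` and `EditCycle` are hypothetical names, not XWiki or WikiModel types, and the "syntaxes" are toy ones that only know level-1 headings and paragraphs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, minimal "DOM": a flat list of blocks.
final class Block {
    final boolean heading; // true for a level-1 heading, false for a paragraph
    final String text;

    Block(boolean heading, String text) {
        this.heading = heading;
        this.text = text;
    }
}

// Hypothetical wrapper interfaces, one pair per wiki syntax.
interface WikiParser {
    List<Block> parse(String source);
}

interface WikiRenderer {
    String render(List<Block> dom);
}

// Toy syntax: a line wrapped in prefix/suffix is a heading,
// every other non-empty line is a paragraph.
final class LineSyntax implements WikiParser, WikiRenderer {
    private final String prefix;
    private final String suffix;

    LineSyntax(String prefix, String suffix) {
        this.prefix = prefix;
        this.suffix = suffix;
    }

    public List<Block> parse(String source) {
        List<Block> dom = new ArrayList<>();
        for (String line : source.split("\n")) {
            String t = line.trim();
            if (t.isEmpty()) {
                continue;
            }
            boolean isHeading = t.startsWith(prefix) && t.endsWith(suffix)
                    && t.length() >= prefix.length() + suffix.length();
            if (isHeading) {
                String title = t.substring(prefix.length(), t.length() - suffix.length());
                dom.add(new Block(true, title.trim()));
            } else {
                dom.add(new Block(false, t));
            }
        }
        return dom;
    }

    public String render(List<Block> dom) {
        StringBuilder out = new StringBuilder();
        for (Block b : dom) {
            out.append(b.heading ? prefix + b.text + suffix : b.text).append('\n');
        }
        return out.toString();
    }
}

final class EditCycle {
    // Steps b) and c): stored text -> DOM1 -> text in the editing syntax.
    static String toEditingSyntax(String stored, WikiParser storage, WikiRenderer editing) {
        return editing.render(storage.parse(stored));
    }

    // Steps e) and f): edited text -> DOM2 -> text in the storage syntax.
    static String toStorageSyntax(String edited, WikiParser editing, WikiRenderer storage) {
        return storage.render(editing.parse(edited));
    }
}
```

With `new LineSyntax("1 ", "")` standing in for the storage syntax and `new LineSyntax("= ", " =")` for the editing syntax, a document can be converted for editing and back. Timing those two calls against the database read and write would give the kind of figures asked for above.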

...

"document requested for edition are available from the database in a serialized format, for instance XHTML" I'd say that XHTML is not the best choice. Moreover, I'd say that we should store the document as it is, and not using a "standard" markup.

Agreed (see my previous email too)

...

"VelocityTextProcessor" We'd have only this component for velocity processing? Do we need to process velocity in another way? I can't think of any right now.

Let's see if we need more when we implement it.

To summarize we need to decide on the topic of whether we want to display wiki content in the user's preferred syntax or not. To be honest I also like that a lot since this is not something you find in other wikis but I'm worried about 2 things:

A) Feasibility. Aren't there always going to be lots of incompatibilities? Macros can be generic and work for all syntaxes so that's not an issue but what about links for example? XWiki's link syntax is richer than most other wikis' link syntax. For example if there's a reference to another xwiki db in the link (otherwiki:SomeDocument) then what's going to happen when viewed with, say, a mediawiki syntax?

B) Complexity. Every user is going to be using his favorite syntax and thus when users talk together, copy/paste snippets on xwiki.org for example, they're all going to be in different syntaxes.

Maybe these 2 points aren't going to be an issue but I'd like to make sure they're not since this is an important decision and what we gain from implementing it is not so high in my opinion when compared with the option of deciding the syntax at the level of the page or the whole wiki.

Actually if there are performance issues it might even be possible to combine both:
* the page is edited in the default syntax (the one configured at the page level or wiki level)
* there's an export option to export in a different syntax, same as exporting in PDF, RTF, etc.

Thanks
-Vincent

Stéphane Laurière

20 Sep

12:40 p.m.

New subject: [xwiki-devs] [Discussion] Designing the new Rendering/Parsing component/API

Vincent Massol wrote:

...

Hi Sergiu, On Sep 18, 2007, at 2:32 AM, Sergiu Dumitriu wrote: [snip]

Looks fine to me.

Right. Note that this is not even an issue if we agree about presenting the doc content in the user's syntax of choice. See below.

Maybe this would need to be verified in terms of speed.

This is a very special use case (as is the migration one).

I'm +1 for the different syntaxes in the same wiki.

Does anyone know if WikiModel has syntax migration tools? We can make one, if it doesn't have one. I remember that someone (Stephane? Marius?) said something about wiki-to-wiki conversion, though.

Mikhail, Stephane, any input on this?

Currently, WikiModel provides a set of parsers for MediaWiki, JSPWiki, Creole, XWiki, GWiki and XHTML. For wiki-to-wiki conversion, we now need a set of serializers (previously also called renderers). I'm writing one for XWiki. Then the idea would be to add other serializers in order to show how to convert one syntax to another.

...

"HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML" I don't understand this. What does manually entered HTML have to do with wiki parsing?

Because users can introduce HTML in wiki content.

ok, we'll compute some parsing and rendering metrics and see how WikiModel compares with Radeox. I'll let you know asap about the results.

...

In the following:

I have the feeling that b), c), e) and f) are going to be insignificant compared to a) and g) but I may be wrong. Some figures would be nice.

...

Agreed (see my previous email too)

"VelocityTextProcessor" We'd have only this component for velocity processing? Do we need to process velocity in another way? I can't think of any right now.

I see two options:
- we tell the user he will lose some information by using the mediawiki syntax in that case
- we complement the supported syntaxes so that they cover exactly the same semantic capabilities.

Mikhail, would this make sense?

...

B) Complexity. Every user is going to be using his favorite syntax and thus when users talk together, copy/paste snippets on xwiki.org for example, they're all going to be in different syntaxes.

Right, but on xwiki.org we can choose to display snippets in a uniform syntax whatever syntax was used for entering them, using when possible the one set in the user preferences, or using the richest one. When sharing snippets that are not rendered by XWiki, it's true users may use various syntaxes. We may however consider recommending one preferred syntax, the richest one, i.e. WikiModel CommonSyntax.

...

Maybe these 2 points aren't going to be an issue but I'd like to make sure they're not since this is an important decision and what we gain from implementing it is not so high in my opinion when compared with the option of deciding the syntax at the level of the page or the whole wiki. Actually if there are performance issues it might even be possible to combine both: * the page is edited in the default syntax (the one configured at the page level or wiki level) * there's an export option to export in a different syntax, same as exporting in PDF, RTF, etc. Thanks -Vincent

Cheers Stéphane

Sergiu Dumitriu

5:14 p.m.

Sorry for the late reply, I didn't see your mail.

...

"HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML" I don't understand this. What does manually entered HTML have to do with wiki parsing?

Because users can introduce HTML in wiki content.

Yes, but HTML should be left as-is. I don't see why it should be parsed. Do you have a use-case for parsing HTML together with the wiki syntax into a DOM?

...

To summarize we need to decide on the topic of whether we want to display wiki content in the user's preferred syntax or not. To be honest I also like that a lot since this is not something you find in other wikis but I'm worried about 2 things:

A) Feasibility. Aren't there always going to be lots of incompatibilities? Macros can be generic and work for all syntaxes so that's not an issue but what about links for example? XWiki's link syntax is richer than most other wikis' link syntax. For example if there's a reference to another xwiki db in the link (otherwiki:SomeDocument) then what's going to happen when viewed with, say, a mediawiki syntax?

We either have to extend all the syntaxes, or to restrict the users only to use the common subset among all syntaxes. Or something in the middle, extend where possible, but trim all the things that don't have an equivalent in one of the syntaxes. The best thing would be to try to extend all syntaxes.

...

B) Complexity. Every user is going to be using his favorite syntax and thus when users talk together, copy/paste snippets on xwiki.org for example, they're all going to be in different syntaxes.

C) Included documents. What syntax will they use? #include will copy the content, and let radeox process it later, along with the includer document. We can put the included document inside a {syntax:$idoc.syntax} block.

D) A common practice was to use velocity to generate wiki syntax that Radeox would process, or to generate (radeox) macro parameters using velocity, like the {rss:${userobj.feed}}. What happens if we dynamically change the wiki syntax? Velocity can't know about that. And with such fragmented code, it will be very hard to dynamically change the syntax to the current user's preference.

...

What happens after the export? The user edits that version in an offline tool (XEclipse, for example), then he can reimport the changed document, which will be converted back to the original syntax. I'd rather have a "convert" button, which will try to convert the document to another syntax. If there are things that can't be converted, then warn about this, and offer a Yes/No choice to the user, allowing him to force the converted syntax, or abandon the conversion. Hm, talking about XEclipse, maybe we can leave the "edit using XYZ syntax" as an XEclipse feature, not present in XWiki platform. This way we'll remove the stress from the server, as the conversion could be performed on the client.

...

Thanks -Vincent

And a question on WikiModel: what if there's a feature we need that doesn't have an equivalent in the WikiDOM?

Sergiu
-- http://purl.org/net/sergiu

Mikhail Kotelnikov

11:05 p.m.

Hello! I tried to respond to your questions below from the point of view of a WikiModel developer :-)

On 9/20/07, Sergiu Dumitriu <sergiu.dumitriu(a)gmail.com> wrote:

...

Sorry for the late reply, I didn't see your mail.

"HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML" I don't understand this. What does manually entered HTML have to do with wiki parsing?

Because users can introduce HTML in wiki content.

Yes, but HTML should be left as-is. I don't see why it should be parsed. Do you have a use-case for parsing HTML together with the wiki syntax into a DOM?

If you put your HTML in a wiki page it *can* be left as is without any modifications if it is in a "verbatim" block. In this case it is up to the client code to interpret or not the content of such verbatim blocks. It can be something like:

{{{html: <h1>Hello, world</h1>}}}

In this case the content of such a block can be directly inserted in the resulting generated HTML page. In general verbatim blocks can be used to insert in pages anything you want to interpret yourself. It can be groovy scripts:

{{{groovy: print "Hello, world!"}}}

We need to parse HTML to transform any random web page into editable wiki content. BTW this code already exists and it works.

...

The main goal of multiple parsers for multiple wiki syntaxes is to make it possible to import any external wiki pages without losing information or formatting, and to transform them to the WikiModel. If you write the same sequence of titles, paragraphs, tables and lists formatted using the XWiki, JSPWiki or MediaWiki syntaxes then they will give exactly the same structure in the WikiModel. If pages are simple (they don't contain any "advanced" stuff like hierarchical embedded documents or properties) then they *can* be serialized and edited using any particular simple wiki syntax (XWiki, JSPWiki, Creole, ...). WikiModel guarantees that any modifications introduced using these particular syntaxes will not be lost. If you lose something then it should be considered a bug.

...

B) Complexity. Every user is going to be using his favorite syntax and thus when users talk together, copy/paste snippets on xwiki.org for example, they're all going to be in different syntaxes.

It does not matter which syntax you use for editing your document. All these syntaxes will produce the *same* structure. Using this structure it is possible to serialize documents in the CommonSyntax. The cycle of editing is:
- Create and submit a new text using, for example, the JSPWiki syntax
- Parse the content using the JspWikiParser. This operation will produce a well-formed sequence of events for the listener (like: beginParagraph(..)/endParagraph(...); beginTable(...)/endTable(...)...). This step cleans up all user errors like non-closed syntactic elements and so on.
- Using the CommonSyntaxSerializer a new wiki document will be generated with exactly the same structure as the original JSPWiki document
- This resulting document should be stored in the DB.

In WikiModel a document written using a particular syntax is just a reflection of the internal structure of this document. Each particular wiki syntax (JSPWiki, XWiki, Creole, ...) reflects only part of the possible structural elements of the WikiModel. The CommonSyntax is the "native" syntax and it contains *all* possible elements available in the WikiModel (embedded documents, properties, extensions, ...). And it was designed taking into account how available and easy to type the formatting symbols are with various keyboard layouts.

Why is it important? Why do we need the CommonSyntax? Just some examples:

Ex1: Using the CommonSyntax it is possible to put 2 paragraphs, a list and a table into another table. It can be done because there is a notion of "embedded document". AFAIK no other syntax gives this possibility. Not even MediaWiki, which has the most advanced (and most complicated) syntax.

Ex2: In a page containing the information about a person it is possible to define properties like "firstName", "lastName", "birthDate", "address" and so on. So the document itself contains well structured semantic information as well as normal text. In the future it can (and I think it should) replace the notion of XWiki "objects" attached to documents.

Ex3: The symbol "|" does not exist in the Russian keyboard layout. To enter this symbol you have to switch from Russian to English. Imagine now that you want to create a table with 5 columns and 5 lines. How many times do you have to switch? :-) So I use the sequence "::" as table cell delimiters (but "|" is recognized as well). Table cell delimiters are just one example. The same goes for many other structural elements.

C) Included documents. What syntax will they use? #include will copy

...

the content, and let radeox process it later, along with the includer document. We can put the included document inside a {syntax:$idoc.syntax} block.

As I said above WikiModel has a notion of "embedded documents". In WikiModel each wiki document is constructed from a sequence of block elements (headers, tables, lists, paragraphs,...). And block elements can have "embedded documents" which have exactly the same structural elements as the topmost one. An example of a page with an embedded document (CommonSyntax):

----------------------------------------------
= Example1 =

The table below contains an embedded document. Using such embedded documents you can insert table in a list or a list in a table. And embedded documents can contain their own embedded documents!!!

!! Header 1.1 !! Header 1.2
:: Cell 2.1 :: Cell 2.2 with an embedded document:
(((
== This is an embedded document! ==
* list item one
* list item two
  * sub-item A
  * sub-item B
* list item three
)))
:: Cell 3.1 :: Cell 3.2

This is a paragraph after the table...
----------------------------------------------

Please note that these "embedded documents" have nothing to do with external document inclusions. WikiModel works only with the content of one page. Its goal is just to recognize and manipulate individual structural elements on wiki pages. If you want to make inclusions you can use "extensions" and interpret them in your code as you wish. It can be something like this:

----------------------------------------------
= Example2 =

The text below will be recognized by the WikiModel as an "extension" and it can be interpreted in the user's code for example to include an external page in this place.

$include(http://www.google.com)

The next paragraph...
----------------------------------------------

D) A common practice was to use velocity to generate wiki syntax that

...

Radeox would process, or to generate (radeox) macro parameters using velocity, like the {rss:${userobj.feed}}. What happens if we dynamically change the wiki syntax? Velocity can't know about that. And with such fragmented code, it will be very hard to dynamically change the syntax to the current user's preference.

If I understand well then the decision was to use the WikiModel instead of Radeox. WikiModel does exactly the same as Radeox does, with some differences:
- WikiModel guarantees that documents are well-formed. It is based on real grammars for JavaCC and not on regular expressions, like Radeox.
- WikiModel contains parsers for multiple syntaxes (CommonSyntax, XWiki, JspWiki, Mediawiki, Creole, ...)
- WikiModel does not generate HTML; it just notifies listeners about individual structural elements found in a document, and it is up to the implementors of these listeners to do something. For example, there is a listener which generates HTML. Another can generate a wiki page with another wiki syntax. And so on...

So if you want to include an external document you can extend the HTML Listener, override the method onExtension(String extensionContent) and perform the inclusion there. In CommonSyntax extensions are defined as follows:

----------------------------------------------
= Example3 =

This is an {{{rss: ${userobj.feed} }}}
----------------------------------------------

About Velocity... Personally I think that it is better NOT to use Velocity at all and to use Groovy templates instead.
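The listener approach described above could look roughly like the following sketch. Only the method name `onExtension(String extensionContent)` and the begin/end paragraph events come from this thread; the interface name, the `onText` event, the `include:` prefix convention and the page loader are assumptions made for illustration and are not the actual WikiModel API.

```java
import java.util.function.Function;

// Hypothetical listener contract: the parser reports structural events
// and the listener decides what to produce from them.
interface WikiEventListener {
    void beginParagraph();
    void endParagraph();
    void onText(String text);
    void onExtension(String extensionContent);
}

// A listener that turns events into HTML and ignores extensions.
class HtmlListener implements WikiEventListener {
    protected final StringBuilder html = new StringBuilder();

    public void beginParagraph() { html.append("<p>"); }
    public void endParagraph() { html.append("</p>"); }
    public void onText(String text) { html.append(escape(text)); }
    public void onExtension(String extensionContent) { /* ignored by default */ }

    public String result() { return html.toString(); }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

// Subclass that interprets "include: <page>" extensions by asking a loader
// for the already-rendered HTML of the referenced page.
class IncludingHtmlListener extends HtmlListener {
    private static final String PREFIX = "include:"; // assumed convention
    private final Function<String, String> pageLoader;

    IncludingHtmlListener(Function<String, String> pageLoader) {
        this.pageLoader = pageLoader;
    }

    @Override
    public void onExtension(String extensionContent) {
        String trimmed = extensionContent.trim();
        if (trimmed.startsWith(PREFIX)) {
            String pageName = trimmed.substring(PREFIX.length()).trim();
            html.append(pageLoader.apply(pageName));
        }
        // Other extensions are left to other listeners or subclasses.
    }
}
```

A parser would drive such a listener by calling the event methods in document order; the application supplies the loader, so the parsing layer itself never needs to know how pages are stored.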

...

Maybe these 2 points aren't going to be an issue but I'd like to make sure they're not since this is an important decision and what we gain from implementing it is not so high in my opinion when compared with the option of deciding the syntax at the level of the page or the whole wiki. Actually if there are performance issues it might even be possible to combine both:
* the page is edited in the default syntax (the one configured at the page level or wiki level)
* there's an export option to export in a different syntax, same as exporting in PDF, RTF, etc.

Hm, talking about XEclipse, maybe we can leave the "edit using XYZ

...

syntax" as an XEclipse feature, not present in XWiki platform. This way we'll remove the stress from the server, as the conversion could be performed on the client.

You can lose some information only when you transform WikiModel-specific structural elements (like embedded documents or properties) into an external format (XWiki, JSPWiki, ...). When you import from another format into WikiModel you should lose nothing. Otherwise it is considered a bug in the implementation of WikiModel's parsers. So if you exported a wiki page to a particular syntax without warnings you can be sure that all your modifications will be seamlessly integrated back.

...

Thanks

-Vincent

And a question on WikiModel: what if there's a feature we need that doesn't have an equivalent in the WikiDOM?

Hmm...
- WikiModel works with a super-set of the structural elements available in existing wikis (in those wikis which I know :-)) and it contains additional features like embedded documents or properties. So if you find a structural element that exists in other wikis and is not present in WikiModel (and which cannot be *easily* simulated with existing elements) then you should consider it a bug. And such a structural element should be added to the WikiModel as soon as possible.
- If you need some additional features and they cannot be "externalized" in verbatim blocks then... write me. We will discuss :-)

I think that the WikiModel can give the common infrastructure which works with well-known elements. If you need something specific, just put it in a verbatim block and interpret it yourself in your code.

----------------------------------------------
= Example4 =

This is a verbatim block:
{{{
This is a verbatim block.
It can be used to insert in the final page a <strong>junk and <em>bad-formed</strong> html</em>!!!
}}}

And the next block can be interpreted in your code as a groovy script:
{{{groovy:
println "Hello, world!"
}}}
----------------------------------------------

WikiModel is written using JavaCC grammars. Modifying these grammars is not a very complicated task but it definitely requires more work than just changing configuration files.

And about the WikiModel DOM... As I wrote above, the last version of the WikiModel does not contain a DOM yet. Just the common infrastructure and a set of parsers for various wiki syntaxes generating well-formed events for structural elements.

Best regards,
Mikhail

Sergiu
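As an illustration of "put it in a verbatim block and interpret it yourself", client code might dispatch on the `type:` prefix of a verbatim block as in the sketch below. The class name `VerbatimDispatcher`, the handler registry and the escaped `<pre>` fallback are assumptions for this example; nothing here is part of WikiModel.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical client-side helper: receives the raw text found between
// {{{ and }}} and decides how to turn it into output.
final class VerbatimDispatcher {
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    // Register a handler for blocks that start with "<type>:".
    void register(String type, Function<String, String> handler) {
        handlers.put(type, handler);
    }

    String interpret(String content) {
        int colon = content.indexOf(':');
        if (colon > 0) {
            String type = content.substring(0, colon).trim();
            Function<String, String> handler = handlers.get(type);
            if (handler != null) {
                return handler.apply(content.substring(colon + 1).trim());
            }
        }
        // No known type: show the content literally, escaped.
        return "<pre>" + escape(content) + "</pre>";
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Registering an `html` handler that returns its input unchanged reproduces the "insert directly into the generated page" behaviour, while a `groovy` handler could hand the body to whatever script engine the application uses.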

...


Vincent Massol

21 Sep

9:17 a.m.

On Sep 20, 2007, at 5:14 PM, Sergiu Dumitriu wrote:

...

Sorry for the late reply, I didn't see your mail.

"HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML" I don't understand this. What does manually entered HTML have to do with wiki parsing?

Because users can introduce HTML in wiki content.

Yes, but HTML should be left as-is. I don't see why it should be parsed. Do you have a use-case for parsing HTML together with the wiki syntax into a DOM?

I wrote this before I understood that they would be inserted in a verbatim block. So yes you're right, there's no point in parsing them for the main view use case.

[snip]

Thanks
-Vincent

Mikhail Kotelnikov

19 Sep

6:54 p.m.

...

Yes. 2) Document contents are stored in the database in textual format in

...

the main xwiki syntax (whatever we decide it is - we could standardize on creole for example)

It can be the "Common Syntax" for the reasons mentioned above :-). Creole syntax is one of the most restrictive syntaxes. And I tried to use as much of the Creole markup as possible in the CommonSyntax.

Another possibility is to store directly in XML or in XHTML+microformat enhancements (for additional structural elements).

pro:
- it can be exported/imported directly and used by external applications which know nothing about wikis; just standard XML or XHTML
- this content can be transformed with XSLT processors directly without using the WikiModel
- it can be faster to parse XML than the CommonWiki syntax (I have no comparisons)

con:
- it is more difficult to work with diffs (but for diffs it is *better* to use WikiModel and to generate a specific wiki syntax; for example "Common syntax");
- it is not a "human readable" format; it is difficult to understand what you load from the DB

3) Use case 1: Viewing a document
a) Get the doc from the DB --> text1 (xwiki text format)

...

b) Apply TextProcessors --> text2
c) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
d) Apply DomProcessors --> DOM2
e) Call the required Renderer --> PDF, XML, HTML, RTF, text, etc

Yes. 4) Use case 2: Editing a document, assuming the user wants to use the

...

MediaWiki syntax for editing
a) Get the doc from the DB --> text1 (xwiki text format)
b) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
c) Call MediaWikiRenderer --> text2 (text in MediaWiki format)
d) the user edits and hits save
e) MediaWikiParser --> DOM2 (transforms MediaWiki text syntax into the internal DOM)
f) Call XWikiRenderer --> text3 (transforms DOM into xwiki textual format)
g) Save text3 in the database

Yes. (text1 and text3 can be XML, as I said above) 5) In practice this means the following classes:

...

* TextProcessorManager: to chain several text processors

Yes. But it can be just a composite processor implementing the same ProcessorManager interfaces. * TextProcessor

...

- VelocityTextProcessor - GroovyTextProcessor
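The suggestion that the manager can be just a composite processor implementing the same interface can be sketched as follows. The `String execute(String content)` signature is the one proposed in this thread; the name `CompositeTextProcessor` and the strictly sequential chaining are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Interface as proposed in the thread: text in, text out.
interface TextProcessor {
    String execute(String content);
}

// Hypothetical composite: it is itself a TextProcessor, so callers do not
// need a separate manager type, and composites can be nested.
final class CompositeTextProcessor implements TextProcessor {
    private final List<TextProcessor> delegates;

    CompositeTextProcessor(TextProcessor... delegates) {
        // Copy the array so later changes by the caller do not affect the chain.
        this.delegates = Arrays.asList(delegates.clone());
    }

    @Override
    public String execute(String content) {
        String current = content;
        for (TextProcessor processor : delegates) {
            // The output of each processor is the input of the next one.
            current = processor.execute(current);
        }
        return current;
    }
}
```

A Velocity processor followed by a Groovy processor would then be `new CompositeTextProcessor(velocity, groovy)`, where both arguments are whatever implementations the application provides.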

Yes.

* WikiParser: Takes wiki syntax and generates a DOM in an XWiki-specific format (independent of the different wiki syntaxes).
- LegacyXWikiWikiParser
- XWikiWikiParser (or simply use CreoleWikiParser if we want our internal format to be Creole)
- ConfluenceWikiParser
- MediaWikiWikiParser
- JSPWikiWikiParser
- CreoleWikiParser
- HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML. So this HTMLParser is probably a parent of the other parsers in some regard. Anyway we need this one for the WYSIWYG editor which may need to transform HTML to wiki syntax (so we may need a XWikiDomProcessor too to transform into XWiki syntax). The alternative (much better) is to have the WYSIWYG editor only use the internal XWiki-specific DOM format for all its manipulations.

If you want, you can put HTML as a non-interpreted block ("verbatim blocks") and interpret it in the client code. But internally the WikiModel does not support "embedded" (X)HTML. The main reason: in this case I lose control of the document structure. And this control is the main goal of the WikiModel.

* DomProcessorManager: to chain several DOM processors

...

* DomProcessor - Don't know yet what we're going to use this for. TOCDomProcessor as you say above maybe.

DOMProcessor can be used to transform the original DOM object representing the document in the DB into a new (user and query-specific) DOM object which can contain new elements, generated dynamically. Today all dynamic page elements are interpreted as simple Velocity or Groovy scripts and they generate text documents which must be parsed using Radeox and transformed to the final HTML document. Using the DOM representation it is possible to interpret some nodes of this graph as Groovy scripts. In WikiModel they will correspond to Verbatim blocks which are opaque for WikiModel but they can be interpreted as scripts by the DomProcessor(s). And these "Groovy"-nodes can be executed and they will add new DOM elements to the DOM2. For example this approach can be used to generate search results.

The advantages of this approach:
- You can put your parsed document DOM1 in the cache, which will save you from parsing the document for each query. Parsing is the slowest step in the page processing, even though the current version of WikiModel is faster than before and should be faster than the Radeox processor.
- Your Groovy scripts will manipulate normal java classes (DOM nodes) and will produce DOM nodes and not plain text. It seems especially interesting taking into account Groovy's Builders ( http://groovy.codehaus.org/Builders). It is enough to write a very simple builder (see http://groovy.codehaus.org/BuilderSupport) generating DOM nodes and ... voila! Your Groovy node from a wiki page generates search results as DOM nodes!

These manipulations of DOM objects should be MUCH faster than processing plain text for every request. And all the following steps are fast as well: to generate an HTML page it is enough to visit all nodes with an "XHTMLVisitor".

BTW: do you need Velocity at all? Using only Groovy is much cleaner. It can be used as THE language of XWiki. It can be used as a template *and* programming language at the same time. And if you *really* want, it is possible to integrate the Jasper (from Tomcat) engine to use it for pure templating. The code from Jetty (the org.mortbay.jetty.jspc.plugin package) can be used as an example of integration with Jasper (see http://jetty.mortbay.org/xref/index.html). In this case in templates it will be possible to use:
- JSP tag libraries (including standard ones)
- Multiple scripting languages (like javabeans, javascript, jpython, jruby, groovy,...)

* Renderer

...

- XMLRenderer
- HTMLRenderer
- PDFRenderer
- RTFRenderer
- XWikiRenderer (or simply use CreoleRenderer if we want our internal format to be Creole)
- ConfluenceRenderer
- MediaWikiRenderer
- JSPWikiRenderer
- CreoleRenderer

Yes. All these renderers should be written if you want to support all these syntaxes. I think that it should not be very difficult. WDYT? Do I have it right? :) Best regards, Mikhail Thanks

...

-Vincent

On Sep 13, 2007, at 6:37 PM, Stéphane Laurière wrote:

Hi Vincent, hi everyone, We discussed the WikiModel integration with Mikhail this afternoon. Here is below our input. Vincent Massol wrote:

the WikiModel parser recognizes scripts indeed.

Mikhail and Stéphane

> Use cases
> ========
>
> * View page
>   o ViewAction -- template -> ScriptingEngineManager.evaluate() -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML, PDF, RTF, etc
> * Edit page in WYSIWYG editor
>   o Uses the WikiParser to create a "DOM" of the page content and to render it accordingly. NOTE: This is required since rendering in the WYSIWYG editor is different from the final rendering. For example, macros need to be shown in a special way to make them visible, etc.
>   o Changes done by the user are entered in HTML. Note: it would be better to capture them so that they are entered in the "DOM". Is that possible? If not, then the HTMLParser is used to convert from HTML to Wiki Syntax but there's likely to be some loss in the conversion. The advantage is the ability to take any HTML content and generate wiki syntax from it.
>
> This is my very early thinking but I wanted to make it visible to give everyone the chance to 1) know what's happening and 2) suggest ideas.
>
> I'll refine this in the coming days and post again on this thread.
>
> Thanks
> -Vincent

Vincent Massol

21 Sep

9:30 a.m.

Hi Mikhail,

Thanks for sharing this info with us! This makes it clearer for me. From what I understand below you're recommending to eliminate the need for TextProcessor and instead do the following:
* Store the documents in the database in the DOM format (XML)
* Store scripts as a verbatim block in that DOM
* Only use DOMProcessor to make transformations to the DOM. For example have a VelocityDomProcessor and GroovyDomProcessor to modify the script DOM elements and evaluate them. Note: Velocity or Groovy scripts can generate wiki syntax content and thus these would need to generate new DOM elements. Not sure how easy that would be. This means the VelocityDomProcessor would need internally to use a Parser to parse the result of the evaluation and generate a sub-DOM.

Is that correct?

Thus the textual format would only be used when the user enters text or when we want to export the content. The main advantage would be performance since there'll be no need to go back and forth between textual format and DOM format. Makes sense to me.

Now you mention removing Velocity. This won't be possible since all current XWiki instances are using Velocity and we cannot tell our users that they have to rewrite all their pages if they want to move to XWiki v1.3. We'll need to continue supporting Velocity for some time. Personally I currently find that the Velocity syntax mixes much better with the wiki syntax than Groovy does. If you look at contributed code snippets you'll see that most are in Velocity, which is what most people use.

Now you mention other stuff about Jasper and Jetty but I'm not sure I have understood that part.

Thanks
-Vincent

See below.

On Sep 19, 2007, at 6:54 PM, Mikhail Kotelnikov wrote:
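The script-evaluating DomProcessor described in this message could be sketched like this. Everything here is hypothetical: the flat `Node` type, the `verbatim` node type, the `language:` prefix and the evaluator and parser functions are stand-ins chosen for the example, since the thread notes that WikiModel has no DOM yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, flat DOM node: a type plus its raw text.
final class Node {
    final String type; // e.g. "paragraph" or "verbatim"
    final String text;

    Node(String type, String text) {
        this.type = type;
        this.text = text;
    }
}

// DOM in, DOM out, mirroring "DOM execute(DOM content)" from the thread.
interface DomProcessor {
    List<Node> execute(List<Node> dom);
}

// Replaces verbatim nodes of one script language by the sub-DOM obtained
// from evaluating the script and parsing its textual output.
final class ScriptDomProcessor implements DomProcessor {
    private final String language;
    private final Function<String, String> evaluator;   // script -> wiki text
    private final Function<String, List<Node>> parser;  // wiki text -> sub-DOM

    ScriptDomProcessor(String language,
                       Function<String, String> evaluator,
                       Function<String, List<Node>> parser) {
        this.language = language;
        this.evaluator = evaluator;
        this.parser = parser;
    }

    @Override
    public List<Node> execute(List<Node> dom) {
        String prefix = language + ":";
        // Build a new list so a cached DOM1 is never modified.
        List<Node> result = new ArrayList<>();
        for (Node node : dom) {
            if (node.type.equals("verbatim") && node.text.startsWith(prefix)) {
                String script = node.text.substring(prefix.length()).trim();
                result.addAll(parser.apply(evaluator.apply(script)));
            } else {
                result.add(node);
            }
        }
        return result;
    }
}
```

Because `execute` returns a fresh list, the parsed document can stay in a cache and only the per-request script evaluation is repeated. A processor whose scripts build nodes directly, as suggested with Groovy builders, would skip the parser function entirely.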

...

Hi Vincent, hi everyone,

We discussed the WikiModel integration with Mikhail this afternoon. Here is below our input.

Vincent Massol wrote:
> Hi,
>
> I've started working on designing the new Rendering/Parsing components and API for XWiki. The implementation will be based on WikiModel but we need some XWiki wrapping interfaces around it. Note that this is a prerequisite for the new WYSIWYG editor based on GWT (see http://www.xwiki.org/xwiki/bin/view/Design/NewWysiwygEditorBasedOnGwt).
>
> I've updated http://www.xwiki.org/xwiki/bin/view/Design/WikiModelIntegration with the information below, which I'm pasting here so that we can have a discussion about it. I'll consolidate the results on that wiki page.
>
> Componentize the Parsing/Rendering APIs
> ==================================
>
> We need 4 main components:
>
> * A Scripting component to manage scripting inside XWiki documents and to evaluate them.

(example: table of content generator), this results in dom2
- dom2 is made available as such or is converted to XML, HTML, PDF etc. depending on the user request

TextProcessor and DomProcessor would have the following interfaces:

TextProcessor
- String execute(String content)

DomProcessor
- DOM execute(DOM content)

That means we should have a syntax to distinguish between scripts that generate text content, and scripts that manipulate the DOM.

> * A Rendering component to manage rendering Wiki syntax into HTML and other (PDF, RTF, etc)
> * A Wiki Parser component to offer a typed interface to XWiki content so that it can be manipulated
> * A HTML Parser component (for the WYSIWYG editor)
>
> Different Syntaxes
> ===============
>
> Two possible solutions:
>
> 1. Have a WikiSyntax Object (A simple class with one property: a combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki, Confluence, JSPWiki, etc) that users can attach to pages to tell the Renderers what syntax is used. If no such object is attached then it'll default to XWiki's default syntax (XWiki Legacy or Creole for example).
> 2. Have some special syntax, independent of the wiki syntaxes, to tell the Renderer that such block of content should be rendered with that given syntax. Again there would be a default.

Here's our view regarding the syntax used in wiki edit mode: documents requested for editing are available from the database in a serialized format, for instance XHTML. When entering the edit action, the user indicates his preferred syntax. If the text of the requested document contains some blocks that are not handled by the chosen syntax, the user gets a warning (example: the document contains a table as a list item, and the user tries to edit the document using JSPWiki syntax). If not, WikiModel converts the serialized format into a DOM, the user edits the DOM and the WikiModel serializer serializes it back when the user saves it.

Note that the DOM representation of wiki documents in the latest version of WikiModel is still pending.

WikiModel generates events for blocks that are not to be parsed (typically because they contain scripts). For example, in the WikiModel syntax currently called "CommonSyntax", this looks like the following:
==============
{{{macro:mymacro (String parameters)
dothis
dothat
}}}

$mymacro(parameters)
==============
For each syntax, macro blocks are identified as far as possible (we still have to check that this is indeed the case for all types of macro blocks).

The WikiModel parser recognizes scripts indeed.

Mikhail and Stéphane

> Use cases
> ========
>
> * View page
>   o ViewAction -- template -> ScriptingEngineManager.evaluate() -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML, PDF, RTF, etc
> * Edit page in WYSIWYG editor
>   o Uses the WikiParser to create a "DOM" of the page content and to render it accordingly. NOTE: This is required since rendering in the WYSIWYG editor is different from the final rendering. For example, macros need to be shown in a special way to make them visible, etc.
>   o Changes done by the user are entered in HTML. Note: it would be better to capture them so that they are entered in the "DOM". Is that possible? If not, then the HTMLParser is used to convert from HTML to Wiki Syntax but there is likely to be some loss in the conversion. The advantage is the ability to take any HTML content and generate wiki syntax from it.
>
> This is my very early thinking but I wanted to make it visible to give everyone the chance to 1) know what's happening and 2) suggest ideas.
>
> I'll refine this in the coming days and post again on this thread.

Mikhail Kotelnikov

12:03 p.m.

Hi! On 9/21/07, Vincent Massol <vincent(a)massol.net> wrote:

...

Exactly. It should be much faster and these scripts will work with typed objects (WikiDomNodes) and not with plain text.

> Note: Velocity or Groovy scripts can generate wiki syntax content and thus these would need to generate new DOM elements. Not sure how easy that would be.

I think that with Groovy it can be even simpler to work with nodes than with Velocity. An example:

-------------------------------------
// Velocity:
-------------------------------------
<div>
  <h1>Hello $customer.Name!</h1>
  <table>
  #foreach( $mud in $mudsOnSpecial )
    #if ( $customer.hasPurchased($mud) )
    <tr>
      <td>
        $flogger.getPromo( $mud )
      </td>
    </tr>
    #end
  #end
  </table>
</div>
-------------------------------------
// Groovy:
// (see http://groovy.codehaus.org/GroovyMarkup,
// http://groovy.codehaus.org/Builders)
-------------------------------------
import groovy.xml.MarkupBuilder

// With no arguments, MarkupBuilder writes the markup to standard output
def xml = new MarkupBuilder()
xml.div() {
  h1("Hello, ${customer.Name}!")
  table() {
    for (mud in mudsOnSpecial) {
      if (customer.hasPurchased(mud)) {
        tr() {
          td( flogger.getPromo(mud) )
        }
      }
    }
  }
}
-------------------------------------

...

From my point of view the second example is even simpler than the Velocity-based one. The advantages of the Groovy-based approach:
- It generates well-formed HTML (you have no choice, it is done automatically :-))
- It can be compiled directly to Java bytecode and cached
- It is much more powerful than Velocity
- You don't have to learn two things at the same time (Velocity and Groovy)
- It is possible to write WikiModel-specific builders which will be much more efficient than the generic MarkupBuilder from the example above. It will be possible to manipulate, say, tables in the following way: table[i][j] = "Hello!";

I think that in any case you have to be a geek to write a template :-). You have to understand at least some notions like "variables", "if" conditions, "for" loops and so on. And IMHO it is simpler to use these structures in a normal programming language. And if you are a "normal" user then even Velocity templates are completely unreadable for you.

Another aspect is that if you have errors in your Velocity template, it doesn't save you from exceptions. It fails in the same way as badly written Groovy code :-).

> This means the VelocityDomProcessor would need internally to use a Parser to parse the result of the evaluation and generate a sub-DOM. Is that correct?

Exactly.

> Thus the textual format would only be used when the user enters text or when we want to export the content.

Yes.

> The main advantage would be performance since there'll be no need to go back and forth between textual format and DOM format.

Not only. Other advantages:
- Results are well-formed (you manipulate objects and not plain text).
- The resulting graph can be serialized in *any* format (PDF, ps, XML, wap, ...). You don't have plain HTML, you control all aspects.

> Makes sense to me.

...

Fine.

> Now you mention removing Velocity. This won't be possible since all current XWiki instances are using Velocity and we cannot tell our users that they have to rewrite all their pages if they want to move to XWiki v1.3. We'll need to continue supporting Velocity for some time. Personally I currently find that the Velocity syntax mixes much better with the wiki syntax than Groovy. If you look at contributed code snippets you'll see that most are in Velocity, which is what most people use.

In any case if you change the syntax you will have to process your scripts as well. Maybe it would be possible to create a translator from Velocity to Groovy:
- Take a page containing a Velocity template
- Parse it using Velocity and produce an in-memory AST representation of this template (Velocity is JavaCC-based and uses abstract syntax trees...)
- Launch a visitor translating each individual structural element of this template to the corresponding Groovy code (see http://svn.apache.org/viewvc/velocity/engine/branches/Velocity_1.5_BRANCH/s… )

About the usage of Groovy and templating... I don't like at all the Groovy templates where syntax like <% if (...){%>Hello<%}%> is used. It is "inspired" by bad-styled JSP/ASP/PHP... But I really like Groovy builders, as I wrote above. And I have the impression that these builders can easily replace Velocity (or other templates).

> Now you mention other stuff about Jasper and Jetty but I'm not sure I have understood that part.

I thought that it would be possible to use the JSP syntax to create templates. So each wiki page can be considered as a JSP page (maybe, if it contains some specific markers in the content) and it can be parsed and compiled as a JSP. The advantages:
- And, especially, usage of standard tag libraries (like <c:forEach items="${addresses}" var="address">...</c:forEach>). The standard tag libraries give the same functionality as Velocity (and much more). The advantages: it is a standard, it can be compiled directly to Java, ...
- Usage of multiple languages (javascript, jpython, jruby, groovy...) if you use syntax like "<% if (...) {%>Hello!<%}%>" (I hate this coding style!)

Personally I would like to see Groovy templating much more than a JSP-based one. It was just a proposal...

Best regards,
M

> Thanks
> -Vincent

...

> See below.

...

> On Sep 19, 2007, at 6:54 PM, Mikhail Kotelnikov wrote:

...

> Hello!

...

> Just some words about what the wiki model is and what it is not.

...

> The main goal of the WikiModel is the creation of an API giving access and > control to the internal structure of individual wiki documents.

...

> Some features of the WikiModel: > - WikiModel itself does not depend on any particular wiki syntax > - The number of possible structural elements and their possible assembling > order is strictly fixed (which greatly simplifies the validation and > manipulation) but the final result is almost as expressive as XHTML (and > even more expressive, taking into account notions of properties and embedded > documents which can recursively contain their own embedded documents :-)). > - WikiModel manipulates with a super-set of structural elements available > in existing wikis. And it has some features not available in other wikis. > For example using embedded documents in WikiModel it is possible to put a > table in a list and this table can contains its own headers, paragraphs, and > lists... Or using embedded documents with the notion of properties it is > possible to define very complex structured objects directly on a wiki page. > - There is at least one wiki syntax ("Common" syntax) giving access to all > features of the Wiki Model. This syntax guaranties that all structural > elements of the WikiModel can be serialized/de-serialized without loose of > information and structure. Using any other syntaxes can lead to the > information lost (example: you can not put table in a table in XWiki or in > JSPWiki which is possible using the Common Syntax). > - One of the goals of the WikiModel is to give a mean to *import* > information from various wiki engines without information lost. The > structure of documents can be serialized in various wiki syntaxes as well, > but there is no guaranties that some information will not be lost. The > information can be lost in the case when a document contains some elements > which have no representation in a particular wiki syntax. Example: > properties; tables in lists; parameters of lists, paragraphs, and tables > and so on... 
> - All elements managed by the WikiModel can be serialized/deserialized > using XHTML with additional annotations (microformat-like annotations)

...

> Some features of the CommonSyntax: > - It is a native syntax for the WikiModel. It provides full access to all > features of the WikiModel. All structures in the WikiModel can be > serizlized/deszerialized in this syntax without any information lost > - It uses markup characters available in most (in ideal situation - in > all) keyboard layouts (including Russian :-)). So you don't have to switch > keyboard layouts to write text, tables, lists and headers. For example > tables can be defined using pipe symbols ("|" - which is not available in > many keyboard layouts) or the "::" sequence. > - If there is a choice then the most commonly used markups are used

...

> The current version of the WikiModel provides just an event-based interface to work with the structure of documents (like SAX for XML).
> In previous versions of WikiModel I had a Document Object Model in which each structural element had its own object representation. In the current version an Object Model is not implemented (yet). I thought to create just a set of utility classes manipulating the standard XML DOM. Example: the method WikiTable#setCellContent(int row, int column, String content) should create an XHTML table object, create the required number of cells and columns and put the given string content in this node. The same for all other structural elements (headers, lists, internal documents, properties, styles, macros...)
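The utility-class idea quoted above can be sketched in Java on top of the standard `org.w3c.dom` API. This is only an illustration: `WikiTable` and `setCellContent` are the names used in the message, but everything else (the constructor, the `childAt` helper, the use of plain `table`/`tr`/`td` elements) is an assumption and not WikiModel code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class WikiTableDemo {

    // Hypothetical utility class wrapping a <table> element of a standard XML DOM.
    static final class WikiTable {
        private final Document doc;
        private final Element table;

        WikiTable(Document doc) {
            this.doc = doc;
            this.table = doc.createElement("table");
            doc.appendChild(table);
        }

        // Creates any missing rows and cells, then stores the text in the target cell.
        void setCellContent(int row, int column, String content) {
            Element tr = childAt(table, "tr", row);
            Element td = childAt(tr, "td", column);
            td.setTextContent(content);
        }

        // Returns the index-th child of parent, appending empty elements as needed.
        // Only called on elements whose children are all <tr> or all <td>.
        private Element childAt(Element parent, String name, int index) {
            NodeList children = parent.getChildNodes(); // live view of the children
            while (children.getLength() <= index) {
                parent.appendChild(doc.createElement(name));
            }
            return (Element) children.item(index);
        }

        Element element() {
            return table;
        }
    }

    static Element build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            WikiTable wikiTable = new WikiTable(doc);
            wikiTable.setCellContent(1, 2, "Hello!");
            return wikiTable.element();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element table = build();
        System.out.println("rows: " + table.getChildNodes().getLength());
    }
}
```

In this sketch rows are only as wide as the cells that were set, so a real implementation would probably pad all rows to the same width.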

...

> On 9/14/07, Vincent Massol <vincent(a)massol.net > wrote: >

...

> > +1 to all that. So let me summarizes and rephrase to see if I have > > understood :) >

...

> > 1) We have 4 types of objects:
> > * TextProcessors: take text and generate text
> > * Parsers: take text and generate an internal DOM format (pivot format)
> > * DomProcessors: take DOM and generate DOM
> > * Renderers: take DOM and generate anything (text, PDF, RTF, HTML, XML, etc)

...

> Yes.

...

> 2) Document contents are stored in the database in textual format in > > the main xwiki syntax (whatever we decide it is - we could > > standardize on creole for example)

...

> It can be the "Common Syntax" for the reasons mentioned above :-). Creole > syntax is one of the most restrictive syntaxes. And I tried to uses in the > CommonSyntax as much markups of the Creole as possible.

...

> An another possibility is to store directly in XML or in XHTML+microformat > enhancements (for additional structural elements). > pro: > - it can be exported/imported directly and used by external applications > which knows nothing about wikis; just a standard XML or XHTML > - this content can be transformed with XSLT processors directly without > usage of the WikiModel > - it can be faster to parse XML than the CommonWiki syntax (I have no > comparisons) > con: > - it is more difficult to work with diffs (but for diffs it is *better* to > use WkiModel and to generate a specific wiki syntax; for example "Common > syntax"); > - it is not a "human readable" format; it is difficult to understand what > you loads from the DB

...

> 3) Use case 1: Viewing a document

...

> > a) Get the doc from the DB --> text1 (xwiki text format)
> > b) Apply TextProcessors --> text2
> > c) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
> > d) Apply DomProcessors --> DOM2
> > e) Call the required Renderer --> PDF, XML, HTML, RTF, text, etc

...

> Yes.
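The viewing flow in steps (a) to (e) can be sketched in Java roughly as follows. The four role names (TextProcessor, Parser, DomProcessor, Renderer) come from this thread; the `Node` class standing in for the pivot DOM, the `parse`/`render` method names and the toy lambda implementations are assumptions made for illustration and are not XWiki or WikiModel APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ViewPipelineDemo {

    // Hypothetical stand-in for the pivot DOM: a typed node with children.
    static final class Node {
        final String type;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // The four roles discussed in the thread; signatures are assumptions.
    interface TextProcessor { String execute(String content); }
    interface Parser { Node parse(String content); }
    interface DomProcessor { Node execute(Node dom); }
    interface Renderer { String render(Node dom); }

    static String view(String text1, List<TextProcessor> textProcessors, Parser parser,
                       List<DomProcessor> domProcessors, Renderer renderer) {
        String text2 = text1;
        for (TextProcessor p : textProcessors) {
            text2 = p.execute(text2);          // (b) text1 -> text2
        }
        Node dom = parser.parse(text2);        // (c) text2 -> DOM1
        for (DomProcessor p : domProcessors) {
            dom = p.execute(dom);              // (d) DOM1 -> DOM2
        }
        return renderer.render(dom);           // (e) DOM2 -> output format
    }

    static String run() {
        // Toy text processor: substitutes one variable, standing in for a script engine.
        TextProcessor script = s -> s.replace("$user", "Vincent");
        // Toy parser: one paragraph node per line.
        Parser parser = s -> {
            Node doc = new Node("document", "");
            for (String line : s.split("\n")) {
                doc.children.add(new Node("paragraph", line));
            }
            return doc;
        };
        // Toy DOM processor: prepends a summary node computed from the DOM.
        DomProcessor counter = dom -> {
            dom.children.add(0, new Node("paragraph", "Paragraphs: " + dom.children.size()));
            return dom;
        };
        // Toy renderer: HTML paragraphs, with no escaping.
        Renderer html = dom -> {
            StringBuilder sb = new StringBuilder();
            for (Node n : dom.children) {
                sb.append("<p>").append(n.text).append("</p>");
            }
            return sb.toString();
        };
        return view("Hello $user\nBye", Collections.singletonList(script), parser,
                Collections.singletonList(counter), html);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Swapping the last argument for another `Renderer` is all it takes to target a different output format, which is the point of keeping the DOM as the pivot.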

...

> 4) Use case 2: Editing a document, assuming the user wants to use the > > MediaWiki syntax for editing >

...

> > a) Get the doc from the DB --> text1 (xwiki text format)
> > b) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
> > c) Call MediaWikiRenderer --> text2 (text in MediaWiki format)
> > d) the user edits and hits save
> > e) MediaWikiParser --> DOM2 (transforms MediaWiki text syntax into the internal DOM)
> > f) Call XWikiRenderer --> text3 (transforms DOM into xwiki textual format)
> > g) Save text3 in the database

...

> Yes. (text1 and text3 can be XML, as I said above)
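The editing round trip in steps (a) to (g) can be sketched in Java as below. The two syntaxes here are deliberately tiny toys that only differ in their bold delimiter (`*` for the storage syntax, `'''` for the editing syntax); the `Syntax`, `Run` and `DelimitedSyntax` names are hypothetical and are not the real XWiki or MediaWiki parsers.

```java
import java.util.ArrayList;
import java.util.List;

final class EditRoundTripDemo {

    // Pivot DOM reduced to a flat list of text runs that are either bold or plain.
    static final class Run {
        final boolean bold;
        final String text;

        Run(boolean bold, String text) {
            this.bold = bold;
            this.text = text;
        }
    }

    // A syntax pairs a parser (text -> DOM) with a renderer (DOM -> text).
    interface Syntax {
        List<Run> parse(String text);
        String render(List<Run> dom);
    }

    // Toy syntax in which bold text is wrapped in a configurable delimiter.
    static final class DelimitedSyntax implements Syntax {
        private final String delim;

        DelimitedSyntax(String delim) {
            this.delim = delim;
        }

        @Override
        public List<Run> parse(String text) {
            List<Run> runs = new ArrayList<>();
            boolean bold = false;
            int pos = 0;
            while (true) {
                int next = text.indexOf(delim, pos);
                String chunk = next < 0 ? text.substring(pos) : text.substring(pos, next);
                if (!chunk.isEmpty()) {
                    runs.add(new Run(bold, chunk));
                }
                if (next < 0) {
                    break;
                }
                bold = !bold;                 // each delimiter toggles bold
                pos = next + delim.length();
            }
            return runs;
        }

        @Override
        public String render(List<Run> dom) {
            StringBuilder sb = new StringBuilder();
            for (Run r : dom) {
                sb.append(r.bold ? delim + r.text + delim : r.text);
            }
            return sb.toString();
        }
    }

    // Returns { text2 shown to the user, text3 saved back to the database }.
    static String[] roundTrip(String text1) {
        Syntax storage = new DelimitedSyntax("*");     // stand-in for the xwiki format
        Syntax editing = new DelimitedSyntax("'''");   // stand-in for the MediaWiki format
        List<Run> dom1 = storage.parse(text1);         // (b)
        String text2 = editing.render(dom1);           // (c)
        String edited = text2 + " and '''more'''";     // (d) simulated user edit
        List<Run> dom2 = editing.parse(edited);        // (e)
        String text3 = storage.render(dom2);           // (f)
        return new String[] { text2, text3 };
    }

    public static void main(String[] args) {
        String[] result = roundTrip("a *bold* word");
        System.out.println(result[0]);
        System.out.println(result[1]);
    }
}
```

The round trip is lossless here only because both toy syntaxes can express everything in the DOM; the information-loss cases mentioned earlier in the thread arise when the editing syntax cannot.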

...

> 5) In practice this means the following classes: >

...

> > * TextProcessorManager: to chain several text processors

...

> Yes. But it can be just a composite processor implementing the same > ProcessorManager interfaces.
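The composite idea can be sketched in Java as follows: the "manager" is just another TextProcessor that delegates to a list of processors in order, so composites can be nested. `CompositeTextProcessor` is a hypothetical name and the lambdas are toy stand-ins for the Velocity and Groovy processors.

```java
import java.util.Arrays;
import java.util.List;

final class CompositeProcessorDemo {

    interface TextProcessor {
        String execute(String content);
    }

    // Composite: implements the same interface as the processors it chains.
    static final class CompositeTextProcessor implements TextProcessor {
        private final List<TextProcessor> delegates;

        CompositeTextProcessor(TextProcessor... delegates) {
            this.delegates = Arrays.asList(delegates);
        }

        @Override
        public String execute(String content) {
            String result = content;
            for (TextProcessor p : delegates) {
                result = p.execute(result);   // output of one feeds the next
            }
            return result;
        }
    }

    static String run() {
        TextProcessor substitute = s -> s.replace("$name", "World"); // toy "script" step
        TextProcessor wrap = s -> "[" + s + "]";                     // second toy step
        TextProcessor inner = new CompositeTextProcessor(substitute, wrap);
        // A composite can itself be an element of another composite.
        TextProcessor outer = new CompositeTextProcessor(inner, s -> s + "!");
        return outer.execute("Hello $name");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With this shape, callers depend only on `TextProcessor` and never need to know whether they hold one processor or a chain.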

...

> * TextProcessor > > - VelocityTextProcessor > > - GroovyTextProcessor

...

> Yes.

...

> * WikiParser: Takes wiki syntax and generates a DOM in a XWiki- > > specific format (independent of the different wiki syntaxes). > > - LegacyXWikiWikiParser > > - XWikiWikiParser (or simply use CreoleWikiParser if we want our > > internal format to be Creole) > > - ConfluenceWikiParser > > - MediaWikiWikiParser > > - JSPWikiWikiParser > > - CreoleWikiParser > > - HTMLParser: I think all parsers above need to support HTML since > > the wiki syntaxes can be mixed with HTML. So this HTMLParser is > > probably a parent of the other parsers in some regard. Anyway we need > > this one for the WYSIWYG editor which may need to transform HTML to > > wiki syntax (so we may need a XWikiDomProcessor too to transform into > > XWiki syntax). The alternative (much better) is to have the WYSIWYG > > editor only use the internal XWiki-specific DOM format for all its > > manipulations.

...

> If you want, you can put HTML as a non-interpreted block ("verbatim > blocks") and interpret it in the client code. But internally the WikiModel > does not support "embedded" (X)HTML. The main reason: in this case I loose > control of the document structure. And this control is the main goal of the > WikiModel.

...

> * DomProcessorManager: to chain several DOM processors > > * DomProcessor > > - Don't know yet what we're going to use this for. TOCDomProcessor > > as you say above maybe.

...

> DOMProcessor can be used to transform the original DOM object representing > the document in the DB into a new (user and query-specific) DOM object which > can contain new elements, generated dynamically. Now all dynamic page > elements are interpreted as simple Velocity or Groovy scripts and they > generate text documents which should be parsed using Radeox and transformed > to the final HTML document. Using the DOM representation it is possible to > interpret some nodes of this graph as Groovy scripts. In WikiModel they will > correspond to Verbatim blocks which are opaque for WikiModel but they can be > interpreted as scripts by the DomProcessor(s). And these "Groovy"-nodes can > be executed and they will add new DOM elements to the DOM2. For example this > approach can be used to generate search results.
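A DomProcessor of the kind described above can be sketched in Java as follows: it walks the DOM, and each opaque script node is replaced by the nodes obtained from evaluating the script and parsing its output. The `Node` class, the `"script"` node type and the evaluator and parser functions are all hypothetical; a real evaluator would call Velocity or Groovy rather than the toy one used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class ScriptDomProcessorDemo {

    // Hypothetical DOM node; "script" nodes play the role of opaque verbatim blocks.
    static final class Node {
        final String type;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Builds a new tree (DOM2) and leaves the input tree (DOM1) untouched.
    static Node process(Node dom, Function<String, String> evaluator,
                        Function<String, List<Node>> parser) {
        Node copy = new Node(dom.type, dom.text);
        for (Node child : dom.children) {
            if (child.type.equals("script")) {
                // Evaluate the script, then parse its textual output into a sub-DOM.
                copy.children.addAll(parser.apply(evaluator.apply(child.text)));
            } else {
                copy.children.add(process(child, evaluator, parser));
            }
        }
        return copy;
    }

    static List<String> run() {
        Node doc = new Node("document", "");
        doc.children.add(new Node("paragraph", "Results:"));
        doc.children.add(new Node("script", "3"));

        // Toy evaluator: the "script" is a number n and produces n lines of text.
        Function<String, String> evaluator = src -> {
            StringBuilder sb = new StringBuilder();
            int n = Integer.parseInt(src);
            for (int i = 1; i <= n; i++) {
                sb.append("item ").append(i).append('\n');
            }
            return sb.toString();
        };
        // Toy parser: one paragraph node per non-empty line.
        Function<String, List<Node>> parser = text -> {
            List<Node> nodes = new ArrayList<>();
            for (String line : text.split("\n")) {
                if (!line.isEmpty()) {
                    nodes.add(new Node("paragraph", line));
                }
            }
            return nodes;
        };

        Node dom2 = process(doc, evaluator, parser);
        List<String> out = new ArrayList<>();
        for (Node n : dom2.children) {
            out.add(n.type + ":" + n.text);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A builder-based script, as suggested in the message, could return nodes directly and skip the parsing step; the text-then-parse path shown here is what a text-producing script such as a Velocity template would need.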

...

> The advantages of this approach:
> - You can put your parsed document DOM1 in the cache, which saves you from parsing the document for each query. Parsing is the slowest step in the page processing, even if the current version of WikiModel is faster than before and should be faster than the Radeox processor.
> - Your Groovy scripts will manipulate normal Java classes (DOM nodes) and will produce DOM nodes and not plain text. It seems especially interesting taking into account Groovy's Builders (http://groovy.codehaus.org/Builders). It is enough to write a very simple builder (see http://groovy.codehaus.org/BuilderSupport) generating DOM nodes and ... voila! Your Groovy node from a wiki page generates search results as DOM nodes! These manipulations of DOM objects should be MUCH faster than processing plain text for every request. And all following steps are fast as well: to generate an HTML page it is enough to visit all nodes with an "XHTMLVisitor".
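The caching point can be sketched in Java like this: the parsed DOM1 is stored per document name and version, so the parser only runs when a new version appears. `DomCache`, its key format and the parse counter are assumptions made for illustration; XWiki's actual cache is not shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class DomCacheDemo {

    // Caches the parsed DOM (type D) per "name@version" key.
    static final class DomCache<D> {
        private final Map<String, D> cache = new ConcurrentHashMap<>();
        private final Function<String, D> parser;
        final AtomicInteger parseCount = new AtomicInteger(); // only here to observe cache hits

        DomCache(Function<String, D> parser) {
            this.parser = parser;
        }

        D get(String docName, int version, String text) {
            return cache.computeIfAbsent(docName + "@" + version, key -> {
                parseCount.incrementAndGet();
                return parser.apply(text);
            });
        }
    }

    static int run() {
        // Toy parser: the "DOM" is just the array of lines.
        DomCache<String[]> cache = new DomCache<>(text -> text.split("\n"));
        cache.get("Main.WebHome", 1, "a\nb");
        cache.get("Main.WebHome", 1, "a\nb");      // same version: served from the cache
        cache.get("Main.WebHome", 2, "a\nb\nc");   // new version: parsed again
        return cache.parseCount.get();
    }

    public static void main(String[] args) {
        System.out.println("parses: " + run());
    }
}
```

Because the cached DOM1 is shared between requests, the DomProcessors would have to work on a copy or treat it as immutable when producing the user-specific DOM2.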

...

> BTW: do you need Velocity at all? Using only Groovy is much cleaner. It > can be used as THE language of XWiki. It can be used as a template *and* > programming language at the same time. And if you *really* want it is > possible to integrate Jasper (from Tomcat) engine to use it for pure > templating. The code from Jetty (th e org.mortbay.jetty.jspc.plugin package) > can be used as an example of integration with Jasper (see > http://jetty.mortbay.org/xref/index.html). > In this case in templates it will be possible to use: > - JSP tag libraries (including standard ones) > - Multiple scripting languages (like javabeans, javascript, jpython, > jruby, groovy,...)

...

> * Renderer > > - XMLRenderer > > - HTMLRenderer > > - PDFRenderer > > - RTFRenderer > > - XWikiRenderer (or simply use CreoleRenderer if we want our > > internal format to be Creole) > > - ConfluenceRenderer > > - MediaWikiRenderer > > - JSPWikiRenderer > > - CreoleRenderer

...

> Yes. All these renderers should be written if you want to support all > these syntaxes. I think that it should not be very difficult.

...

> WDYT? Do I have it right? :)

...

> Best regards, > Mikhail

...

> Thanks > > -Vincent >

...

> > On Sep 13, 2007, at 6:37 PM, Stéphane Laurière wrote: >

...

> > > Hi Vincent, hi everyone, > >

...

> > > We discussed the WikiModel integration with Mikhail this afternoon. > > > Here > > > is below our input. > >

...

> > > Vincent Massol wrote: > > >> Hi, > > >

...

> > >> I've started working on designing the new Rendering/Parsing > > >> components and API for XWiki. The implementation will be based on > > >> WikiModel but we need some XWiki wrapping interfaces around it. Note > > >> that this is a prerequisite for the new WYSIWYG editor based on GWT > > >> (see http://www.xwiki.org/xwiki/bin/view/Design/ > > >> NewWysiwygEditorBasedOnGwt). > > >

...

> > >> I've updated http://www.xwiki.org/xwiki/bin/view/Design/ > > >> WikiModelIntegration with the information below, which I'm pasting > > >> here so that we can have a discussion about it. I'll consolidate the > > >> results on that wiki page. > > >

...

> > >> Componentize the Parsing/Rendering APIs > > >> ================================== > > >

...

> > >> We need 4 main components: > > >

...

> > >> * A Scripting component to manage scripting inside XWiki documents > > >> and to evaluate them. > >

...

> > > On the topic of scripting we would like to propose a distinction > > > between > > > scripts that act on text and scripts that act on the DOM. > > > Typically, the > > > text rendering processing for flow would be the following, for say > > > "text1": > >

...

> > > text1 =TextProcessor=> text2 =Parser=> dom1 =DomProcessor=> dom2 > > > => ... > >

...

> > > - the scripts contained in text1 are processed in the context of > > > user1, > > > this results into a new text: text2 > > > - the parser parses text2 and converts text2 to a DOM tree, dom1 > > > - dom1 is processed by scripts that work directly on the DOM (example: > > > table of content generator), this results in dom2 > > > - dom2 is made to available as such or is converted to XML, HTML, PDF > > > etc. depending on the user request > >

...

> > > TextProcessor and DomProcessor would have the following interfaces: > >

...

> > > TextProcessor > > > - String execute(String content) > >

...

> > > DomProcessor > > > - DOM execute(DOM content) > >

...

> > > That means we should have a syntax to distinguish between scripts that > > > generate text content, and scripts that manipulate the DOM. > >

...

> > >> * A Rendering component to manage rendering Wiki syntax into > > >> HTML and other (PDF, RTF, etc) > > >> * A Wiki Parser component to offer a typed interface to XWiki > > >> content so that it can be manipulated > > >> * A HTML Parser component (for the WYSIWYG editor) > > >

...

> > >> Different Syntaxes =============== > > >

...

> > >> Two possible solutions: > > >

...

> > >> 1. Have a WikiSyntax Object (A simple class with one property: a > > >> combox box with different syntaxes: XWiki Legacy, Creole, MediaWiki, > > >> Confluence, JSPWiki, etc) that users can attach to pages to tell the > > >> Renderers what syntax is used. If no such object is attached then > > >> it'll default to XWiki's default syntax (XWiki Legacy or Creole for > > >> example). > > >> 2. Have some special syntax, independent of the wiki syntaxes to > > >> tell the Rendered that such block of content should be rendered with > > >> that given syntax. Again there would be a default. > > >

...

> >

...

> > > Here's our view regarding the syntax used in wiki edit mode: document > > > requested for edition are available from the database in a serialized > > > format, for instance XHTML. When entering into the edit action, the > > > user > > > indicates his preferred syntax. If the text of the requested document > > > contains some blocks that are not handled by the chosen syntax, the > > > user > > > gets a warning (example: the document contains a table as a list item, >

...

> > > and the user tries to edit the document using JSPWiki syntax). If not, > > > WikiModel converts the serialized format into a DOM, the user edits > > > the > > > DOM and the WikiModel serializer serializes it back when the user > > > saves it. > >

...

> > > Note that the DOM representation of wiki documents in the latest > > > version > > > of WikiModel is still pending. > >

...

> > >

...

> > >> XWiki Interfaces > > >> ============= > > >

...

> > >> * ScriptingEngineManager: Manages the different Scripting > > >> Engines, calling them in turn. > > >> * ScriptingEngine > > >> o Method: evaluate(String content) > > >> o Implementation: VelocityScriptingEngine > > >> o Implementation: GroovyScriptingEngine > > >> * RenderingEngineManager: Manages the different Rendering > > >> Engines, calling them in turn. > > >> * RenderingEngine > > >> o Method: render(String content) > > >> o Implementation: XWikiLegacyRenderingEngine (current > > >> rendering engine) > > >> o Implementation: WikiModelRenderingEngine > > >> * Parser: content parsing > > >> o HTMLParser: parses HTML syntax > > >> o WikiParser: parses wiki syntax > > >> o Implementation: WikiModelHTMLParser > > >> o Implementation: WikiModelWikiParser > > >

...

> > >> Open Questions: > > >

...

> > >> * Does WikiModel support a generic syntax for macros? > >

...

> > > WikiModel generates events for blocks that are not to be parsed > > > (typically because they contain scripts). > >

...

> > > For example, in the WikiModel syntax currently called "CommonSyntax", > > > this looks like the following: > > > ============== > > > {{{macro:mymacro (String parameters) > > > dothis > > > dothat > >

...

> > > }}} > >

...

> >

...

> > > $mymacro(parameters) > > > ============== > >

...

> > > For each syntax, macro blocks are identified as far as possible (we
> > > still have to check it's the case for all types of macro blocks
> > > indeed).

...

> >

...

> > >> * Is the Rendering also in charge of generating PDF, RTF, > > >> XML, etc? > > >> o I think so, need to modify interfaces above to reflect > > >> this. > > >> * The WikiParser needs to recognizes scripts since this is > > >> needed for the WYSIWYG editor. > >

...

> > > the WikiModel parser recognizes scripts indeed. > >

...

> >

...

> > > Mikhail and Stéphane > >

...

> > >

...

> > >> Use cases > > >> ======== > > >

...

> > >> * View page > > >> o ViewAction -- template -

...

> > >> ScriptingEngineManager.evaluate > > >> () -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML, > > >> PDF, RTF, etc > > >> * Edit page in WYSIWYG editor > > >> o Uses the WikiParser to create a "DOM" of the page > > >> content and to render it accordingly. NOTE: This is required since > > >> rendering in the WYSIWYG editor is different from the final > > >> rendering. For example, macros need to be shown in a special way to > > >> make them visible, etc. > > >> o Changes done by the user are entered in HTML. Note: it > > >> would be better to capture them so that they are entered in the > > >> "DOM". Is that possible? If not, then the HTMLParser is used to > > >> convert from HTML to Wiki Syntax but they're likely be some loss in > > >> the conversion. The advantage is the ability to take any HTML content > > >> and generate wiki syntax from it. > > >

...

> > >

...

> > >> This is my very earlier thinking but I wanted to make it visible to > > >> give everyone the change to 1) know what's happening and 2) suggest > > >> ideas. > > >

...

> > >> I'll refine this in the coming days and post again on this thread. > > >

...

> _______________________________________________ > devs mailing list > devs(a)xwiki.org > http://lists.xwiki.org/mailman/listinfo/devs

...

Vincent Massol

12:45 p.m.

On Sep 21, 2007, at 12:03 PM, Mikhail Kotelnikov wrote: [snip]

...

Note: Velocity or Groovy scripts can generate wiki syntax content and thus these would need to generate new DOM elements. Not sure how easy that would be. I think that with Groovy it can be even simper to work nodes than with Velocity. An example: ------------------------------------- // Velocity: ------------------------------------- Hello $customer.Name! #foreach( $mud in $mudsOnSpecial ) #if ( $customer.hasPurchased($mud) ) #end #end $flogger.getPromo( $mud ) ------------------------------------- // Groovy: // (see http://groovy.codehaus.org/GroovyMarkup , // http://groovy.codehaus.org/Builders) ------------------------------------- def xml = new MarkupBuilder() xml.div() { h1("Hello, ${$customer.Name}!") table() { for (mud in mudsOnSpecial) { if (customer.hasPurchased(mud)) { tr(){ td( flogger.getPromo(mud) ) } } } } } -------------------------------------

Sorry for the formatting loss... I couldn't find a way to keep it with my mail client... I still find the velocity version clearer in your example and I think it's way way simpler for non developers.

...

> From my point of view the second example is even simpler than the Velocity-based one. The advantages of the Groovy-based stuff:
> - It generates a well-formed HTML (you have no choice, it is done automatically :-))
> - It can be compiled directly to the Java bytecode and cached
> - It is much more powerful than Velocity
> - You don't have to learn 2 stuff at the same time - Velocity and Groovy

Currently 90% (a figure I made up ;-)) of xwiki users who are using some scripts only learn a single scripting language: velocity :) Groovy is for developers. It's definitely more powerful but it's for developers. I don't think we can have a single scripting language and I really don't think we should have one. I'd rather we support several: groovy, velocity, jython, beanshell, jruby, etc. In addition, with Velocity we control the API we offer to users, limiting the security issues, whereas with Groovy we can't do that.

...

> - It is possible to write WikiModel specific builders which will be much more efficient than the generic MarkupBuilder from the example above. It will be possible to manipulate with, say, tables in the following way: table[i][j] = "Hello!";
>
> I think that in any way you have to be a geek to write a template :-). You have to understand at least some notions like "variables", "if" conditions, "for" cycles and so on. And IMHO it is simpler to use these structures in a normal programming language. And if you are a "normal" user then for you even Velocity templates are completely unreadable.

I don't agree. There are different levels of users, and there's a level of users who don't know how to program in a full-fledged language like Java or Groovy but who know how to do simple things like: $xwiki.searchDocuments("...") The reason Velocity is successful is that it's simple and has always resisted the temptation to do complex stuff.

...

> Another aspect is that if you have errors in your Velocity template, it doesn't save you from exceptions. It fails in the same way as badly written Groovy code :-).

[snip]

...

Now you mention removing Velocity. This won't be possible since all current XWiki instances used are using Velocity and we cannot tell our users that they have to rewrite all their pages if they want to move to XWiki v1.3. We'll need to continue supporting Velocity for some time. Personally I currently find that the velocity syntaxes mixes much better with the wiki syntax than groovy. If you look at contributed code snippets you'll see that most are in Velocity which is what most people use. In any case if you change the syntax you will have to process your scripts as well.

We don't change the syntax! We allow other syntaxes. We definitely need to keep the current syntax for a long time to come I think. But I thought this was the main point of using WikiModel: the ability to support several syntaxes :)

...

About the usage of Groovy and templating... I don't like the Groovy templates at all where syntax like <% if (...){%>Hello<%}%> is used.

ok that makes 2 of us ;)

...

It is "inspired" by badly styled JSP/ASP/PHP... But I really like Groovy builders, as I wrote above. And I have the impression that these builders can easily replace Velocity (or other templates).

I wouldn't replace it. It can be used in addition to Velocity.

...

Now you mention other stuff about Jasper and Jetty but I'm not sure I have understood that part. I thought that it would be possible to use the JSP syntax to create templates. So each wiki page can be considered as a jsp page (maybe - if it contains some specific markers in the content) and it can be parsed and compiled as a JSP. The advantage: - And, especially, usage of standard tag libraries (like <c:forEach items="${addresses}" var="address">...</c:forEach>)

Well, I don't see that example as an advantage ;) #foreach ($address in $addresses) definitely sounds better to me ;) But I understand your point. Just not sure yet how beneficial this would be.

...

The standard tag libraries give the same functionality as Velocity (and much more). The advantages: it is a standard, it can be compiled directly to Java, ... - Usage of multiple languages (javascript, jpython, jruby, groovy...) if you use syntax like "<% if (...) {%>Hello!<%}%>" (I hate this coding style!) Personally I would much rather see Groovy templating than a JSP-based one. It was just a proposal...

I'm fine with groovy templates but not by removing Velocity. Rather, in addition to it. What do others think? Thanks -Vincent

...


Vincent Massol

22 Sep 2007

10:59 a.m.

Thinking more about storing documents into a DOM in the database, I have found 2 issues to discuss:

1) Using verbatim blocks is going to be a nightmare for users. Consider your example below using verbatim blocks:

-------------------------------------
// WikiModel:
-------------------------------------
<div>
<h1>Hello {{{$customer.Name}}}!</h1>
<table>
{{{#foreach( $mud in $mudsOnSpecial )
#if ( $customer.hasPurchased($mud) ) }}}
<tr>
<td>
{{{ $flogger.getPromo( $mud ) }}}
</td>
</tr>
{{{ #end
#end }}}
</table>
</div>

It's really ugly and even worse than the <% from Groovy.. :)

2) We need to consider current users who are using Velocity intermixed with wiki syntax and we need to continue supporting them, either by having TextProcessors and a VelocityTextProcessor (thus storing text in the DB) or by somehow converting the current way of writing Velocity to the "new" way, whatever this is.

But in any case we need to find something better than what is in 1) above. I'd hate for it to be worse for users. This leads me to believe we might need to keep TextProcessors and store the content in textual format in the DB.

WDYT? Is there any other way?

Thanks
-Vincent

On Sep 21, 2007, at 12:45 PM, Vincent Massol wrote:

...

On Sep 21, 2007, at 12:03 PM, Mikhail Kotelnikov wrote: [snip]

Sorry for the formatting loss... I couldn't find a way to keep it with my mail client... I still find the velocity version clearer in your example and I think it's way way simpler for non developers.


Mikhail Kotelnikov

24 Sep 2007

12:23 p.m.

Hi! On 9/22/07, Vincent Massol <vincent(a)massol.net> wrote:

...

I would say that your example should be:

----------------------------------------------------
= Header1 =
This is a normal wiki syntax...
* list item one
* list item two

And the verbatim block below contains Velocity markup:
{{{
<div>
<h1>Hello {{{$customer.Name}}}!</h1>
<table>
#foreach( $mud in $mudsOnSpecial )
#if ( $customer.hasPurchased($mud) )
<tr>
<td>
$flogger.getPromo( $mud )
</td>
</tr>
#end
#end
</table>
</div>
}}}
This is a normal wiki syntax again...
----------------------------------------------------

In this case the processing of your verbatim block will lead to the generation of (X)HTML. (It is up to the template implementor to generate well-formed XHTML.)

By the way: maybe I was not clear, but all "<" and ">" symbols in normal wiki blocks (like paragraphs and tables; not in verbatim blocks) are considered "special" symbols. For example, the "before<table>after" string in the middle of a wiki document will be reported to listeners as follows:
- onWord: "before"
- onSpecialSymbol: "<"
- onWord: "table"
- onSpecialSymbol: ">"
- onWord: "after"

2) We need to consider current users who are using velocity

...

intermixed with wiki syntax and we need to continue supporting them, either by having TextProcessors and a VelocityTextProcessor (thus storing text in the DB) or by somehow converting the current way of writing velocity to the "new" way, whatever this is. But in any case we need to find something better than what is in 1) above. I'd hate that it be worse for users. This leads me to believe we might need to keep TextProcessors and store the content in textual format in the DB.
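The listener events listed above for "before<table>after" can be reproduced with a very small scanner. This is only a sketch in Java: the InlineListener interface and the scan() method are hypothetical names made up for illustration, not WikiModel's actual API, and the rule "letters and digits form words, any other non-blank character is a special symbol" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy scanner reporting words and special symbols to a listener. */
final class InlineEventSketch {

    /** Hypothetical listener; only the two event names come from the thread. */
    interface InlineListener {
        void onWord(String word);
        void onSpecialSymbol(String symbol);
    }

    /** Walks the text once, flushing the pending word before each symbol. */
    static void scan(String text, InlineListener listener) {
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c);
                continue;
            }
            if (word.length() > 0) {
                listener.onWord(word.toString());
                word.setLength(0);
            }
            // Assumption: whitespace only separates words and is not reported.
            if (!Character.isWhitespace(c)) {
                listener.onSpecialSymbol(String.valueOf(c));
            }
        }
        if (word.length() > 0) {
            listener.onWord(word.toString());
        }
    }

    /** Collects the events as strings so they can be printed or compared. */
    static List<String> events(String text) {
        List<String> out = new ArrayList<>();
        scan(text, new InlineListener() {
            @Override
            public void onWord(String word) {
                out.add("onWord: \"" + word + "\"");
            }

            @Override
            public void onSpecialSymbol(String symbol) {
                out.add("onSpecialSymbol: \"" + symbol + "\"");
            }
        });
        return out;
    }

    public static void main(String[] args) {
        events("before<table>after").forEach(System.out::println);
    }
}
```

Running main prints the same five events as in the list above, which shows why "<table>" in a normal wiki block never reaches the output as an HTML tag.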

...


On Sep 19, 2007, at 6:54 PM, Mikhail Kotelnikov wrote:

Hello! Just some words about what the WikiModel is and what it is not.

The main goal of the WikiModel is the creation of an API giving access to and control over the internal structure of individual wiki documents.

Some features of the WikiModel:
- WikiModel itself does not depend on any particular wiki syntax.
- The number of possible structural elements and their possible assembling order is strictly fixed (which greatly simplifies validation and manipulation) but the final result is almost as expressive as XHTML (and even more expressive, taking into account the notions of properties and embedded documents, which can recursively contain their own embedded documents :-)).
- WikiModel manipulates a superset of the structural elements available in existing wikis. And it has some features not available in other wikis. For example, using embedded documents in WikiModel it is possible to put a table in a list, and this table can contain its own headers, paragraphs, and lists... Or, using embedded documents with the notion of properties, it is possible to define very complex structured objects directly on a wiki page.
- There is at least one wiki syntax (the "Common" syntax) giving access to all features of the WikiModel. This syntax guarantees that all structural elements of the WikiModel can be serialized/deserialized without loss of information and structure. Using any other syntax can lead to information loss (example: you cannot put a table in a table in XWiki or in JSPWiki, which is possible using the Common Syntax).
- One of the goals of the WikiModel is to provide a means to *import* information from various wiki engines without information loss. The structure of documents can be serialized in various wiki syntaxes as well, but there is no guarantee that some information will not be lost. Information can be lost when a document contains elements which have no representation in a particular wiki syntax.
Example: properties; tables in lists; parameters of lists, paragraphs, and tables and so on...
- All elements managed by the WikiModel can be serialized/deserialized using XHTML with additional annotations (microformat-like annotations).

Some features of the CommonSyntax:
- It is the native syntax of the WikiModel. It provides full access to all features of the WikiModel. All structures in the WikiModel can be serialized/deserialized in this syntax without any information loss.
- It uses markup characters available in most (in an ideal situation, in all) keyboard layouts (including Russian :-)). So you don't have to switch keyboard layouts to write text, tables, lists and headers. For example, tables can be defined using pipe symbols ("|", which is not available in many keyboard layouts) or the "::" sequence.
- If there is a choice then the most commonly used markups are used.

The current version of the WikiModel provides just an event-based interface to work with the structure of documents (like SAX for XML). In previous versions of WikiModel I had a Document Object Model in which each structural element had its own object representation. In the current version an Object Model is not implemented (yet). I thought to create just a set of utility classes manipulating the standard XML DOM. Example: the method WikiTable#setCellContent(int row, int column, String content) should create an XHTML table object, create the required number of cells and columns and put the given string content in this node. The same for all other structural elements (headers, lists, internal documents, properties, styles, macros...)

On 9/14/07, Vincent Massol <vincent(a)massol.net> wrote:

+1 to all that.
So let me summarize and rephrase to see if I have understood :)

1) We have 4 types of objects:
* TextProcessors: take text and generate text
* Parsers: take text and generate an internal DOM format (pivot format)
* DomProcessors: take DOM and generate DOM
* Renderers: take DOM and generate anything (text, PDF, RTF, HTML, XML, etc)

Yes.

2) Document contents are stored in the database in textual format in the main XWiki syntax (whatever we decide it is; we could standardize on Creole for example)

It can be the "Common Syntax" for the reasons mentioned above :-). Creole syntax is one of the most restrictive syntaxes. And I tried to use in the CommonSyntax as many Creole markups as possible.

Another possibility is to store directly in XML or in XHTML + microformat enhancements (for additional structural elements).

pro:
- it can be exported/imported directly and used by external applications which know nothing about wikis; just standard XML or XHTML
- this content can be transformed with XSLT processors directly without using the WikiModel
- it can be faster to parse XML than the CommonWiki syntax (I have no comparisons)

con:
- it is more difficult to work with diffs (but for diffs it is *better* to use WikiModel and to generate a specific wiki syntax; for example "Common syntax");
- it is not a "human readable" format; it is difficult to understand what you load from the DB

3) Use case 1: Viewing a document
a) Get the doc from the DB --> text1 (XWiki text format)
b) Apply TextProcessors --> text2
c) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
d) Apply DomProcessors --> DOM2
e) Call the required Renderer --> PDF, XML, HTML, RTF, text, etc

Yes.
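To make the four object types and the viewing flow above concrete, here is a minimal Java sketch. Only the names TextProcessor, DomProcessor and execute() come from this thread; the Node class, the parse() and render() method names and all the TOY_* implementations are assumptions made up for illustration, and nothing here touches WikiModel or a database.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the view flow: text1 -> text2 -> DOM1 -> DOM2 -> output. */
final class ViewPipelineSketch {

    /** Stand-in for the internal DOM: a typed node with text and children. */
    static final class Node {
        final String type;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // The four object types from the summary (parse/render names are assumed).
    interface TextProcessor { String execute(String content); }
    interface Parser { Node parse(String content); }
    interface DomProcessor { Node execute(Node dom); }
    interface Renderer { String render(Node dom); }

    /** Steps b) to e) of use case 1; step a) (loading text1) is left to the caller. */
    static String view(String text1, TextProcessor scripts, Parser parser,
                       DomProcessor domScripts, Renderer renderer) {
        String text2 = scripts.execute(text1);
        Node dom1 = parser.parse(text2);
        Node dom2 = domScripts.execute(dom1);
        return renderer.render(dom2);
    }

    // Toy text processor: substitutes one hard-coded variable.
    static final TextProcessor TOY_SCRIPTS = text -> text.replace("$user", "Vincent");

    // Toy parser: every non-blank line becomes a paragraph node.
    static final Parser TOY_PARSER = text -> {
        Node document = new Node("document", "");
        for (String line : text.split("\n")) {
            if (!line.trim().isEmpty()) {
                document.children.add(new Node("paragraph", line.trim()));
            }
        }
        return document;
    };

    // Toy DOM processor: prepends a generated node (think "table of contents").
    static final DomProcessor TOY_SUMMARY = dom -> {
        dom.children.add(0, new Node("paragraph", dom.children.size() + " paragraph(s)"));
        return dom;
    };

    // Toy renderer: emits one <p> per child, without any HTML escaping.
    static final Renderer TOY_HTML = dom -> {
        StringBuilder html = new StringBuilder();
        for (Node child : dom.children) {
            html.append("<p>").append(child.text).append("</p>");
        }
        return html.toString();
    };

    public static void main(String[] args) {
        System.out.println(view("Hello $user\nBye",
                TOY_SCRIPTS, TOY_PARSER, TOY_SUMMARY, TOY_HTML));
    }
}
```

The point of the split is that the text step and the DOM step can be swapped independently: a different Renderer gives PDF or XML from the same DOM2, and DOM1 could be cached between requests.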
4) Use case 2: Editing a document, assuming the user wants to use the MediaWiki syntax for editing
a) Get the doc from the DB --> text1 (XWiki text format)
b) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an internal DOM)
c) Call MediaWikiRenderer --> text2 (text in MediaWiki format)
d) The user edits and hits save
e) MediaWikiParser --> DOM2 (transforms MediaWiki text syntax into the internal DOM)
f) Call XWikiRenderer --> text3 (transforms DOM into XWiki textual format)
g) Save text3 in the database

Yes. (text1 and text3 can be XML, as I said above)

5) In practice this means the following classes:

* TextProcessorManager: to chain several text processors

Yes. But it can be just a composite processor implementing the same ProcessorManager interfaces.

* TextProcessor
- VelocityTextProcessor
- GroovyTextProcessor

Yes.

* WikiParser: Takes wiki syntax and generates a DOM in an XWiki-specific format (independent of the different wiki syntaxes).
- LegacyXWikiWikiParser
- XWikiWikiParser (or simply use CreoleWikiParser if we want our internal format to be Creole)
- ConfluenceWikiParser
- MediaWikiWikiParser
- JSPWikiWikiParser
- CreoleWikiParser
- HTMLParser: I think all parsers above need to support HTML since the wiki syntaxes can be mixed with HTML. So this HTMLParser is probably a parent of the other parsers in some regard. Anyway we need this one for the WYSIWYG editor which may need to transform HTML to wiki syntax (so we may need a XWikiDomProcessor too to transform into XWiki syntax). The alternative (much better) is to have the WYSIWYG editor only use the internal XWiki-specific DOM format for all its manipulations.

If you want, you can put HTML as a non-interpreted block ("verbatim blocks") and interpret it in the client code. But internally the WikiModel does not support "embedded" (X)HTML. The main reason: in this case I lose control of the document structure. And this control is the main goal of the WikiModel.
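Going back to the TextProcessorManager point above, one way to read the "composite processor" remark is a class that implements the same TextProcessor interface and simply runs its delegates in order. A minimal Java sketch under that reading follows; CompositeTextProcessor is a hypothetical name, and the two lambdas are fake stand-ins, not real Velocity or Groovy integrations.

```java
import java.util.Arrays;
import java.util.List;

final class CompositeTextProcessorSketch {

    interface TextProcessor { String execute(String content); }

    /** Chains delegates in order; each one receives the previous one's output. */
    static final class CompositeTextProcessor implements TextProcessor {
        private final List<TextProcessor> delegates;

        CompositeTextProcessor(TextProcessor... delegates) {
            this.delegates = Arrays.asList(delegates);
        }

        @Override
        public String execute(String content) {
            String result = content;
            for (TextProcessor delegate : delegates) {
                result = delegate.execute(result);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins; a real VelocityTextProcessor or
        // GroovyTextProcessor would call the respective engine here.
        TextProcessor fakeVelocity = text -> text.replace("$name", "world");
        TextProcessor fakeGroovy = text -> text.replace("<%= 1 + 1 %>", "2");

        TextProcessor chain = new CompositeTextProcessor(fakeVelocity, fakeGroovy);
        System.out.println(chain.execute("Hello $name, 1 + 1 = <%= 1 + 1 %>"));
    }
}
```

Because the composite is itself a TextProcessor, callers do not need to know whether they hold one processor or a chain, which is why a separate manager class may be unnecessary.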
* DomProcessorManager: to chain several DOM processors
* DomProcessor
- Don't know yet what we're going to use this for. TOCDomProcessor as you say above maybe.

A DomProcessor can be used to transform the original DOM object representing the document in the DB into a new (user- and query-specific) DOM object which can contain new elements, generated dynamically. Currently all dynamic page elements are interpreted as simple Velocity or Groovy scripts, and they generate text documents which must be parsed using Radeox and transformed to the final HTML document. Using the DOM representation it is possible to interpret some nodes of this graph as Groovy scripts. In WikiModel they will correspond to verbatim blocks, which are opaque to WikiModel but can be interpreted as scripts by the DomProcessor(s). These "Groovy" nodes can be executed, and they will add new DOM elements to DOM2. For example, this approach can be used to generate search results.

The advantages of this approach:
- You can put your parsed document DOM1 in the cache, which saves you from parsing the document for each query. Parsing is the slowest step in the page processing, even if the current version of WikiModel is faster than before and should be faster than the Radeox processor.
- Your Groovy scripts will manipulate normal Java classes (DOM nodes) and will produce DOM nodes and not plain text. It seems especially interesting taking into account Groovy's Builders (http://groovy.codehaus.org/Builders). It is enough to write a very simple builder (see http://groovy.codehaus.org/BuilderSupport) generating DOM nodes and ... voila! Your Groovy node from a wiki page generates search results as DOM nodes! These manipulations of DOM objects should be MUCH faster than processing plain text for every request. And all following steps are fast as well: to generate an HTML page it is enough to visit all nodes with an "XHTMLVisitor".

BTW: do you need Velocity at all? Using only Groovy is much cleaner.
It can be used as THE language of XWiki. It can be used as a template *and* programming language at the same time. And if you *really* want, it is possible to integrate the Jasper engine (from Tomcat) to use it for pure templating. The code from Jetty (the org.mortbay.jetty.jspc.plugin package) can be used as an example of integration with Jasper (see http://jetty.mortbay.org/xref/index.html). In this case in templates it will be possible to use:
- JSP tag libraries (including standard ones)
- Multiple scripting languages (like javabeans, javascript, jpython, jruby, groovy,...)

* Renderer
- XMLRenderer
- HTMLRenderer
- PDFRenderer
- RTFRenderer
- XWikiRenderer (or simply use CreoleRenderer if we want our internal format to be Creole)
- ConfluenceRenderer
- MediaWikiRenderer
- JSPWikiRenderer
- CreoleRenderer

Yes. All these renderers should be written if you want to support all these syntaxes. I think that it should not be very difficult.

WDYT? Do I have it right? :)

Best regards,
Mikhail

Thanks
-Vincent

On Sep 13, 2007, at 6:37 PM, Stéphane Laurière wrote:

> Hi Vincent, hi everyone,
>
> We discussed the WikiModel integration with Mikhail this afternoon. Here
> is below our input.
>
> Vincent Massol wrote:
>> Hi,
>>
>> I've started working on designing the new Rendering/Parsing
>> components and API for XWiki. The implementation will be based on
>> WikiModel but we need some XWiki wrapping interfaces around it. Note
>> that this is a prerequisite for the new WYSIWYG editor based on GWT
>> (see http://www.xwiki.org/xwiki/bin/view/Design/
>> NewWysiwygEditorBasedOnGwt).
>>
>> I've updated http://www.xwiki.org/xwiki/bin/view/Design/
>> WikiModelIntegration with the information below, which I'm pasting
>> here so that we can have a discussion about it. I'll consolidate the
>> results on that wiki page.
>>
>> Componentize the Parsing/Rendering APIs
>> ==================================
>>
>> We need 4 main components:
>>
>> * A Scripting component to manage scripting inside XWiki documents
>> and to evaluate them.
>
> On the topic of scripting we would like to propose a distinction
> between scripts that act on text and scripts that act on the DOM.
> Typically, the text rendering processing flow would be the
> following, for say "text1":
>
> text1 =TextProcessor=> text2 =Parser=> dom1 =DomProcessor=> dom2 => ...
>
> - the scripts contained in text1 are processed in the context of user1,
> this results in a new text: text2
> - the parser parses text2 and converts text2 to a DOM tree, dom1
> - dom1 is processed by scripts that work directly on the DOM (example:
> table of content generator), this results in dom2
> - dom2 is made available as such or is converted to XML, HTML, PDF
> etc. depending on the user request
>
> TextProcessor and DomProcessor would have the following interfaces:
>
> TextProcessor
> - String execute(String content)
>
> DomProcessor
> - DOM execute(DOM content)
>
> That means we should have a syntax to distinguish between scripts that
> generate text content, and scripts that manipulate the DOM.
>
>> * A Rendering component to manage rendering Wiki syntax into
>> HTML and other (PDF, RTF, etc)
>> * A Wiki Parser component to offer a typed interface to XWiki
>> content so that it can be manipulated
>> * A HTML Parser component (for the WYSIWYG editor)
>>
>> Different Syntaxes
>> ===============
>>
>> Two possible solutions:
>>
>> 1. Have a WikiSyntax Object (A simple class with one property: a
>> combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki,
>> Confluence, JSPWiki, etc) that users can attach to pages to tell the
>> Renderers what syntax is used. If no such object is attached then
>> it'll default to XWiki's default syntax (XWiki Legacy or Creole for
>> example).
>> 2. Have some special syntax, independent of the wiki syntaxes to
>> tell the Renderer that such block of content should be rendered with
>> that given syntax. Again there would be a default.
>>
>
> Here's our view regarding the syntax used in wiki edit mode: documents
> requested for editing are available from the database in a serialized
> format, for instance XHTML. When entering into the edit action, the user
> indicates his preferred syntax. If the text of the requested document
> contains some blocks that are not handled by the chosen syntax, the user
> gets a warning (example: the document contains a table as a list item,
> and the user tries to edit the document using JSPWiki syntax). If not,
> WikiModel converts the serialized format into a DOM, the user edits the
> DOM and the WikiModel serializer serializes it back when the user
> saves it.
>
> Note that the DOM representation of wiki documents in the latest version
> of WikiModel is still pending.
>
>>
>> XWiki Interfaces
>> =============
>>
>> * ScriptingEngineManager: Manages the different Scripting
>> Engines, calling them in turn.
>> * ScriptingEngine
>> o Method: evaluate(String content)
>> o Implementation: VelocityScriptingEngine
>> o Implementation: GroovyScriptingEngine
>> * RenderingEngineManager: Manages the different Rendering
>> Engines, calling them in turn.
>> * RenderingEngine
>> o Method: render(String content)
>> o Implementation: XWikiLegacyRenderingEngine (current
>> rendering engine)
>> o Implementation: WikiModelRenderingEngine
>> * Parser: content parsing
>> o HTMLParser: parses HTML syntax
>> o WikiParser: parses wiki syntax
>> o Implementation: WikiModelHTMLParser
>> o Implementation: WikiModelWikiParser
>>
>> Open Questions:
>>
>> * Does WikiModel support a generic syntax for macros?
>
> WikiModel generates events for blocks that are not to be parsed
> (typically because they contain scripts).
>
> For example, in the WikiModel syntax currently called "CommonSyntax",
> this looks like the following:
> ==============
> {{{macro:mymacro (String parameters)
> dothis
> dothat
>
> }}}
>
>
> $mymacro(parameters)
> ==============
>
> For each syntax, macro blocks are identified as far as possible (we
> still have to check it's the case for all types of macro blocks
> indeed).
>
>
>> * Is the Rendering also in charge of generating PDF, RTF,
>> XML, etc?
>> o I think so, need to modify interfaces above to reflect
>> this.
>> * The WikiParser needs to recognize scripts since this is
>> needed for the WYSIWYG editor.
>
> the WikiModel parser recognizes scripts indeed.
>
>
> Mikhail and Stéphane
>
>>
>> Use cases
>> ========
>>
>> * View page
>> o ViewAction -- template -> ScriptingEngineManager.evaluate()
>> -- wiki syntax -> RenderingEngineManager.render() ---> HTML, XML,
>> PDF, RTF, etc
>> * Edit page in WYSIWYG editor
>> o Uses the WikiParser to create a "DOM" of the page
>> content and to render it accordingly. NOTE: This is required since
>> rendering in the WYSIWYG editor is different from the final
>> rendering. For example, macros need to be shown in a special way to
>> make them visible, etc.
>> o Changes done by the user are entered in HTML. Note: it
>> would be better to capture them so that they are entered in the
>> "DOM". Is that possible? If not, then the HTMLParser is used to
>> convert from HTML to Wiki Syntax but there's likely to be some loss in
>> the conversion. The advantage is the ability to take any HTML content
>> and generate wiki syntax from it.
>>
>>
>> This is my very early thinking but I wanted to make it visible to
>> give everyone the chance to 1) know what's happening and 2) suggest
>> ideas.
>>
>> I'll refine this in the coming days and post again on this thread.
>>


Vincent Massol

1:28 p.m.

On Sep 24, 2007, at 12:23 PM, Mikhail Kotelnikov wrote:

...

Hi!

On 9/22/07, Vincent Massol <vincent(a)massol.net> wrote:

Thinking more about storing documents into a DOM in the database, I have found 2 issues to discuss:

1) Using verbatim blocks is going to be a nightmare for users. Consider your example below using verbatim blocks:

-------------------------------------
// WikiModel:
-------------------------------------
<div>
<h1>Hello {{{$customer.Name}}}!</h1>
<table>
{{{#foreach( $mud in $mudsOnSpecial )
#if ( $customer.hasPurchased($mud) ) }}}
<tr>
<td>
{{{ $flogger.getPromo( $mud ) }}}
</td>
</tr>
{{{ #end
#end }}}
</table>
</div>

It's really ugly and even worse than the <% from groovy.. :)

I would say that your example should be:

----------------------------------------------------
= Header1 =

This is a normal wiki syntax...
* list item one
* list item two

And the verbatim block below contains Velocity markup:
{{{
<div>
<h1>Hello {{{$customer.Name}}}!</h1>
<table>
#foreach( $mud in $mudsOnSpecial )
#if ( $customer.hasPurchased($mud) )
<tr>
<td>
$flogger.getPromo( $mud )
</td>
</tr>
#end
#end
</table>
</div>
}}}

This is a normal wiki syntax again...
----------------------------------------------------

In this case the processing of your verbatim block will lead to the generation of the (X)HTML. (It is up to the template implementor to generate well-formed XHTML.)

By the way: maybe I was not clear but all "<" and ">" symbols in normal wiki blocks (like paragraphs, tables; not in verbatim blocks) are considered as "special" symbols. For example the "before<table>after" string in the middle of a wiki document will be reported in listeners as follows:

- onWord: "before"
- onSpecialSymbol: "<"
- onWord: "table"
- onSpecialSymbol: ">"
- onWord: "after"
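The event sequence described above for "before<table>after" can be reproduced with a small scanner. This is only an illustrative sketch, not WikiModel code: the event names onWord and onSpecialSymbol come from the message, while the WikiEventListener interface, the SimpleScanner class and the rule "letters and digits form words, any other non-whitespace character is a special symbol" are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener; only the two event names come from the thread. */
interface WikiEventListener {
    void onWord(String word);
    void onSpecialSymbol(String symbol);
}

final class SimpleScanner {
    /**
     * Reports runs of letters/digits as words and every other
     * non-whitespace character as a special symbol (assumed rule).
     */
    static void scan(String text, WikiEventListener listener) {
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                    i++;
                }
                listener.onWord(text.substring(start, i));
            } else {
                if (!Character.isWhitespace(c)) {
                    listener.onSpecialSymbol(String.valueOf(c));
                }
                i++;
            }
        }
    }

    /** Collects the events as readable strings, mirroring the list in the message. */
    static List<String> trace(String text) {
        List<String> events = new ArrayList<>();
        scan(text, new WikiEventListener() {
            @Override
            public void onWord(String word) {
                events.add("onWord: " + word);
            }

            @Override
            public void onSpecialSymbol(String symbol) {
                events.add("onSpecialSymbol: " + symbol);
            }
        });
        return events;
    }
}
```

Calling SimpleScanner.trace("before<table>after") yields the five events listed in the message, in the same order.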

Right, I was wrong: the HTML is indeed inside the verbatim block. BTW, we'll need a way to parse verbatim blocks to know which parser is to be used (Groovy, Velocity, etc). My example was bad but I think the issue still stands, just not as badly as with HTML. The point I was making is that forcing velocity syntax into verbatim blocks makes it unfriendly, which isn't good, and doesn't provide a fluent integration of velocity with wiki syntax. I think this was one of the nice things in current xwiki.

...

2) We need to consider current users who are using velocity intermixed with wiki syntax and we need to continue supporting them, either by having TextProcessors and a VelocityTextProcessor (thus storing text in the DB) or by somehow converting the current way of writing velocity to the "new" way, whatever this is. But in any case we need to find something better than what is in 1) above. I'd hate that it be worse for users. This leads me to believe we might need to keep TextProcessors and store the content in textual format in the DB.

You can have the following scenarios:

Scenario A: Processing of the source for each request
1. You load your content from the DB.
2. You process the content using a template engine (Velocity/Freemarker/StringTemplate/...). The whole page is considered as a simple template for the corresponding template engine.
3. The result of this processing is parsed by the WikiModel parsers.
pro: Your template engine can be used to generate additional wiki elements. No need to use "embedded blocks" and stuff like that.
con: You have to repeat operations 2) and 3) for *each request*. These steps are the slowest steps in the page processing.

Scenario B: You process only some verbatim blocks containing template-based "inclusions".
1. You load your page from the DB as an XML document or as wiki syntax.
2. You parse it and transform the content into an in-memory structure (DOM). This object can be cached in memory.
3. You handle only some verbatim blocks in this DOM structure to transform their content using a template engine.
pro: For each query you repeat only step 3. The initial DOM structure is the same and it can be cached. In this case we avoid parsing the wiki document from wiki syntax or DOM.
con: You cannot generate wiki syntax. Your template blocks have to generate the resulting (X)HTML.

Personally I prefer Scenario B. In this case some verbatim blocks can be considered as HTML/TeX/... markup, some as Groovy/Javascript/JPython/JRuby/... script blocks and some as template blocks (Velocity/Freemarker/StringTemplate/...)
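To make the caching argument of Scenario B concrete, here is a rough Java sketch: the page is split once into plain and verbatim blocks, the block list is cached, and each request re-evaluates only the verbatim blocks. Everything here is hypothetical (the Block and ScenarioB classes, the parseCount counter, using a UnaryOperator as a stand-in for a template engine); a real implementation would cache a WikiModel DOM rather than a list of strings, and this naive splitter does not handle nested {{{ }}} markers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** One piece of a page: either plain wiki text or the body of a {{{ }}} block. */
final class Block {
    final boolean verbatim;
    final String text;

    Block(boolean verbatim, String text) {
        this.verbatim = verbatim;
        this.text = text;
    }
}

final class ScenarioB {
    private final Map<String, List<Block>> cache = new HashMap<>();
    int parseCount = 0; // exposed only to show that parsing happens once per page

    /** Step 2: split the source into plain text and verbatim blocks. */
    static List<Block> parse(String source) {
        List<Block> blocks = new ArrayList<>();
        int pos = 0;
        while (pos < source.length()) {
            int open = source.indexOf("{{{", pos);
            int close = open < 0 ? -1 : source.indexOf("}}}", open + 3);
            if (open < 0 || close < 0) {
                blocks.add(new Block(false, source.substring(pos)));
                break;
            }
            if (open > pos) {
                blocks.add(new Block(false, source.substring(pos, open)));
            }
            blocks.add(new Block(true, source.substring(open + 3, close)));
            pos = close + 3;
        }
        return blocks;
    }

    /** Step 3: per request, only the verbatim blocks go through the template engine. */
    String render(String pageName, String source, UnaryOperator<String> templateEngine) {
        List<Block> blocks = cache.get(pageName);
        if (blocks == null) {
            blocks = parse(source);
            parseCount++;
            cache.put(pageName, blocks);
        }
        StringBuilder out = new StringBuilder();
        for (Block block : blocks) {
            out.append(block.verbatim ? templateEngine.apply(block.text) : block.text);
        }
        return out.toString();
    }
}
```

Rendering the same page twice with different template contexts parses it only once, which is the "pro" claimed for Scenario B; the "con" is visible too, since the template output is appended as-is and never re-parsed as wiki syntax.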

Yes, this is exactly my point. I don't see a way to have the pros of both A and B. I'm leaning towards A right now, not for performance reasons of course (for this B is better) but for usability and backward compatibility. When you say you "prefer" solution B, I'm pretty sure you're preferring it as a developer but not as an end user. The following:

{{{
#foreach ($link in $doc.backlinks)
* [whatever>$link]
#end
}}}

Is always more complex for users than:

#foreach ($link in $doc.backlinks)
* [whatever>$link]
#end

I feel users will write stuff like this:

{{{
full content of the page containing wiki syntax and velocity markup
}}}

which means we're back to square one and to solution A, since we can have velocity markup include wiki syntax... Thus in terms of performance I don't think A and B will be very different. The only advantage of B is that we can define a notation to specify which templating engine to use. Something like:

{{{ velocity|freemarker:
content
}}}

This would be in addition to the macro definition presented by Stephane in an earlier message:

{{{macro:mymacro (String parameters)
dothis
dothat
}}}

BTW Stephane/Mikhail, how do you call macros? Do you call them using a verbatim block too?

Also is there a generic support in WikiModel for variables? For example:

A list:
* item $myitem
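The "engine name, colon, content" notation suggested above could be dispatched along these lines. This is a sketch under assumptions: the BlockDispatcher class and its register method are invented for illustration, engines are modelled as plain string functions, and a block whose prefix is not a registered engine name is simply returned as verbatim text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical dispatcher for verbatim block bodies of the form "engine: content". */
final class BlockDispatcher {
    private final Map<String, UnaryOperator<String>> engines = new HashMap<>();

    void register(String name, UnaryOperator<String> engine) {
        engines.put(name, engine);
    }

    /** body is the text found between the opening and closing verbatim markers. */
    String process(String body) {
        String trimmed = body.trim();
        int colon = trimmed.indexOf(':');
        if (colon > 0) {
            String name = trimmed.substring(0, colon).trim();
            UnaryOperator<String> engine = engines.get(name);
            if (engine != null) {
                // Known engine: hand over everything after the colon.
                return engine.apply(trimmed.substring(colon + 1).trim());
            }
        }
        // No recognised prefix: treat the block as plain verbatim text.
        return trimmed;
    }
}
```

Looking the prefix up in a registry, rather than trying every engine in turn, is what keeps the processing deterministic when several templating engines are installed.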

...

BTW: I created a very simple common template API which is available in the public Nepomuk repository. There are two implementations of this API: a Velocity-based one and a Freemarker-based one (see Freemarker here: http://freemarker.org/). I think it is easy to implement the same API for StringTemplate (http://www.stringtemplate.org/). It may be useful for future implementations of the XWiki templating...

Ok good to know in case we need it. [snip] Thanks -Vincent

Erin Schnabel

7:14 p.m.

I'm sure I missed something in the above, I was skimming pretty fast...

I have, for example, a pretty hefty macro to create a navigation menu (corporate standards for look/feel, plus some pretty fun prototype stuff to add in dynamic bits.. ). It's a pretty awesome piece of code.. but it is 100% in velocity. I've used groovy only a handful of times, and all of them were in those cases where I had to do something velocity just wouldn't allow me to do.

We have a few problems, as I see it: unless all of the skin templates have changed in 1.1, all of the templates currently use velocity (which just asks those of us writing our own skins based on the xwiki skins as a model to use velocity). Most of the pages (and I'm not kidding about that.. we have a lot of pages displaying objects from other pages, etc) use velocity.

Any kind of "translate velocity to groovy" incurs more overhead for the velocity pages. Even if the groovy code is cached, you still have the velocity-groovy conversion to worry about. How is that stored, anyway? I need to figure out more about what is and isn't cached and where.

The groovy vs. velocity discussion seems like one of those "religious" debates. Regardless of which macro language you prefer, the bottom line is that users have a crap-ton of velocity code in use-- you can't just drop the language, or incur a lot of overhead converting it into something else.

After looking @ all the samples, I'm really NOT happy with all the {{{ }}} crap. None of my pages have that in there now, and I think it looks ugly and unreadable and just ... UGH. If I have to go and update ALL of my pages to add those silly characters I'll totally lose my mind!

So what did I miss.. are those silly characters actually required with this new environment? if so, something is seriously broken. IMO, anyhow. ;)

On 9/24/07, Vincent Massol <vincent(a)massol.net> wrote:

...


-- 'Waste of a good apple' -Samwise Gamgee

Vincent Massol

25 Sep

9:10 a.m.

Hi Erin, On Sep 24, 2007, at 7:14 PM, Erin Schnabel wrote:

...

As I stated in past emails, my current view is that we need to keep Velocity, if only as a backward compatibility layer. I think it's more than that though since: 1) it's simple 2) it mingles well with xwiki syntax

...

Any kind of "translate velocity to groovy" incurs more overhead for the velocity pages. Even if the groovy code is cached, you still have the velocity-groovy conversion to worry about. How is that stored, anyway? I need to figure out more about what is and isn't cached and where.

If there were a conversion (which is definitely not agreed upon at this point) it would be done once.

...

The groovy vs. velocity discussion seems like one of those "religious" debates. Regardless of which macro language you prefer, the bottom line is that users have a crap-ton of velocity code in use-- you can't just drop the language, or incur a lot of overhead converting it into something else.

We all agree. Mikhail was mentioning that because he's thinking about the best architecture for XWiki starting with a clean slate. On my side, I do the opposite and start with current XWiki and see how we could go towards Mikhail's ideas without breaking too much stuff ;)

...

After looking @ all the samples, I'm really NOT happy with all the {{{ }}} crap. None of my pages have that in there now, and I think it looks ugly and unreadable and just ... UGH. If I have to go and update ALL of my pages to add those silly characters I'll totally lose my mind! So what did I miss.. are those silly characters actually required with this new environment? if so, something is seriously broken.

I think those {{{ verbatim blocks can serve a useful purpose: they tell XWiki what to do with the verbatim block. I can see the following uses:

* Definition of a macro: {{{ macro:
* Definition of a Velocity block: {{{ velocity:
* Definition of a Freemarker block (or whatever other template system): {{{ freemarker:
* Definition of a Groovy block: {{{ groovy:

However, as you mentioned we need to keep backward compatibility. We could have a xwiki.compatibility=1 config property to keep the current behavior. Internally, this would mean:

* Adding {{{ velocity: at the top of all existing pages before the rendering is done
* Transforming <% to {{{ groovy:

I'm personally not too fond of the {{{ marker either but:

1) It's the only way I can think of that allows us to save the document as a DOM tree in the DB and thus increase performance (it still needs to be proved that it increases performance - Mikhail/Stephane, it would be nice if you could send some stats). Note that performance will be improved ONLY for the part of the document not inside a {{{ }}} block, since those blocks are text that needs to be parsed, thus incurring the speed penalty of parsing.
2) It allows adding other templating/scripting engines in a generic manner. If we add other templating solutions it would be too costly and non-deterministic to apply them one after another.
3) It offers a common syntax for all extensions to the wiki syntax: macros, scripts, templates, etc.

Question for Mikhail: is it possible to change the characters representing the verbatim block in a given wiki syntax? For example say I'm using the wiki syntax. Ideally I would have preferred the following kind of syntax:

{velocity} {/velocity}
{groovy} {/groovy}
{jsp} {/jsp}
{macro} {/macro}

with the ability to pass parameters:

{macro:param1=value1|param2=value2...}

Is that possible with WikiModel?

Thanks
-Vincent


Vincent Massol

26 Sep

8:34 p.m.

New subject: [xwiki-devs] [SUMMARY] Re: [Discussion] Designing the new Rendering/Parsing component/API

Hi xwiki devs,

This is a summary of the decisions so far and the remaining questions. This is also the outcome of my discussion today with Mikhail on skype.

1) We'll be able to import all syntaxes.

2) An XWiki instance will use a single syntax at a time. Once the database is created using that syntax it won't be possible to change it (except by doing an export and reimport). We also need to decide what syntax we use by default OOB. I propose we use the current xwiki syntax for some time and then switch to the Wikimodel one (Common Syntax) later on.

3) All pages will be able to be exported to any syntax. Some elements have no equivalent in other syntaxes and when this happens a warning will be displayed and the elements in question ignored.

4) Mikhail has agreed to modify wikiModel to add macro block recognition. The syntax isn't fully defined yet but it'll be something like:

{xxx param1=value1 param2=value2}
...
{/xxx}

(param1='value value' and param1="value value" will also be supported)

This means we'll be able to have a common syntax for xwiki's macros and also for groovy code:

{groovy}
...
{/groovy}

And also for HTML blocks:

{html}
...
{/html}

5) We need to decide if we want:

A) No velocity block but document properties/metadata to tell xwiki to render the page using velocity. A user putting velocity code in a page will have to check a box somewhere to say that this is a velocity page.

Pros:
* Slightly easier to enter velocity code
Cons:
* Exception case compared to groovy, macros, etc
* User must not forget to check the checkbox saying the page contains velocity code

OR

B) Velocity blocks same as what exists for groovy/macros/html, namely:

{velocity}
... velocity code with wiki syntax allowed
{/velocity}

Note1: For B) we would allow putting wiki syntax inside the velocity block. Technically we'll apply the velocity rendering and then re-apply wikimodel on the result.
Note2: For backward compatibility we can have a config flag (xwiki.compatibility = 1) that automatically adds the {velocity}{/velocity} marker around the whole page. The only downside is that it'll be as slow as it currently is (actually it'll be faster since wikimodel is going to be faster than radeox).

Pros:
* Speed. Since we know the blocks that use velocity we can cache all the wiki syntax not inside the velocity/macros/groovy blocks, which will considerably speed up the rendering of pages
* Consistency with macros and groovy blocks.

My preference goes to B) and I'm proposing to use that.

6) Mikhail is going to add support for recognizing XML tags in the wikimodel parser (so that onOpenXmlTag()/onCloseXmlTag() events are called in listeners). This is needed for point 7 below.

7) We need to allow intermixing velocity/HTML and wiki syntax easily. For this our listener (the code that listens to {velocity} events) will evaluate the content using velocity and will call wikimodel again on the resulting code. Since wikimodel requires HTML to be in a block ({html} for us) we'll use a different wikimodel listener that intercepts onOpenXmlTag/onCloseXmlTag so that it outputs XML tags with no modifications (the standard HTML listener generates &lt; and &gt; for < and >). This will allow writing:

{velocity}
<div>
<h1>Hello $customer.Name!</h1>
<table>
#foreach( $mud in $mudsOnSpecial )
#if ( $customer.hasPurchased($mud) )
<tr>
<td>
* [link>$flogger.getPromo( $mud )]
</td>
</tr>
#end
#end
</table>
</div>
{/velocity}

8) Documents are stored in textual format in the DB (i.e. as the user sees them). Portions of them will be cached after they're rendered for the first time (see option B above for the best caching option).

Are you ok on these points and especially about using the 5B solution? Anything else I've forgotten?

If we agree, then my next steps are:

* Understand the wikimodel API in more detail
* Understand the doxia API too (it's a "competitor" to wikimodel). The reason is that I'd like to see two implementations to ensure that the XWiki Interfaces can be implemented using different implementations so that XWiki is independent of the underlying rendering/parsing framework used. Jason Van Zyl is also interested in implementing the doxia part for XWiki in the future.
* Propose a XWiki API
* Propose an integration path
* Implement it using WikiModel

Thanks
-Vincent
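Note1 and Note2 of option 5B above can be pictured with the following Java sketch. It is a simplified illustration, not the planned implementation: the TwoPassRenderer class is hypothetical, Velocity and WikiModel are replaced by plain string functions, and the block scanner does not handle nested or unbalanced {velocity} markers.

```java
import java.util.function.UnaryOperator;

/** Hypothetical sketch of option 5B: velocity first, then wiki rendering on the result. */
final class TwoPassRenderer {
    static final String OPEN = "{velocity}";
    static final String CLOSE = "{/velocity}";

    /** Note2: in compatibility mode the whole legacy page becomes one velocity block. */
    static String applyCompatibility(String page, boolean compatibility) {
        return compatibility ? OPEN + page + CLOSE : page;
    }

    /**
     * Note1: text outside velocity blocks only goes through the wiki renderer;
     * text inside is evaluated by velocity and the result is then wiki-rendered.
     */
    static String render(String page, UnaryOperator<String> velocity, UnaryOperator<String> wiki) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int open = page.indexOf(OPEN, pos);
            int close = open < 0 ? -1 : page.indexOf(CLOSE, open + OPEN.length());
            if (open < 0 || close < 0) {
                out.append(wiki.apply(page.substring(pos)));
                break;
            }
            out.append(wiki.apply(page.substring(pos, open)));
            String body = page.substring(open + OPEN.length(), close);
            out.append(wiki.apply(velocity.apply(body)));
            pos = close + CLOSE.length();
        }
        return out.toString();
    }
}
```

In this sketch the output of the wiki renderer for text outside the blocks depends only on the page source, which is the part that could be cached; with the compatibility flag on, the whole page sits inside one block and nothing is cacheable, matching the "as slow as it currently is" remark.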

Vincent Massol

8:41 p.m.


Does anyone have use cases that they would like to see covered? (to test if the design below works for them) Thanks -Vincent On Sep 26, 2007, at 8:34 PM, Vincent Massol wrote:

...


Erin Schnabel

27 Sep

10:15 p.m.


On 9/26/07, Vincent Massol <vincent(a)massol.net> wrote:

...

This is the only thing that makes sense to me.

...

We also need to decide what syntax we use by default OOB. I propose we use the current xwiki syntax for some time and then switch to the Wikimodel one (Common Syntax) later on. 3) All pages will be able to be exported to any syntax. Some elements have no equivalent in other syntaxes and when this happens a warning will be displayed and the elements in question ignored.

OUCH. It's the only possible answer, but OUCH. I can just see the complaints about data-loss piling up.. This will have to be well-documented, perhaps including some tips about a) how to back things up for real, and b) how to identify those areas later to assess the damage... Would you convert the syntax in the document history, too? Or does a switch in syntax imply that the history is wiped... the document history is very sensitive to random mucking around, I've found.

...

4) Mikhail has agreed to modify wikiModel to add macro block recognition. The syntax isn't fully defined yet but it'll be something like:

{xxx param1=value1 param2=value2}
...
{/xxx}

(param1='value value' and param1="value value" will also be supported)

This means we'll be able to have a common syntax for xwiki's macros and also for groovy code:

{groovy}
...
{/groovy}

And also for HTML blocks:

{html}
...
{/html}

Consistency is good, as is the lack of smashed-together-curly-braces. If we demarcate all scripts in nice little blocks like this, will the WYSIWYG editor be smart enough to leave it alone?
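Since the macro block syntax in point 4 was explicitly "not fully defined yet", the following Java sketch is only one possible reading of it: an opening tag made of a name followed by space-separated name=value pairs, where values may be bare, single-quoted or double-quoted. The MacroTag class and both regular expressions are assumptions for illustration; quoted values containing a closing brace are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for an opening macro tag such as {xxx param1=value1 param2='value value'}. */
final class MacroTag {
    // Whole tag: "{" name, optional parameter text, "}".
    private static final Pattern TAG = Pattern.compile("\\{(\\w+)((?:\\s+[^}]*)?)\\}");
    // One parameter: name=value, value being 'quoted', "quoted" or a bare token.
    private static final Pattern PARAM =
            Pattern.compile("(\\w+)=(?:'([^']*)'|\"([^\"]*)\"|(\\S+))");

    final String name;
    final Map<String, String> parameters = new LinkedHashMap<>();

    private MacroTag(String name) {
        this.name = name;
    }

    static MacroTag parse(String openTag) {
        Matcher tag = TAG.matcher(openTag.trim());
        if (!tag.matches()) {
            throw new IllegalArgumentException("Not a macro opening tag: " + openTag);
        }
        MacroTag result = new MacroTag(tag.group(1));
        Matcher param = PARAM.matcher(tag.group(2));
        while (param.find()) {
            // Exactly one of the three value groups is non-null for each match.
            String value = param.group(2) != null ? param.group(2)
                    : param.group(3) != null ? param.group(3)
                    : param.group(4);
            result.parameters.put(param.group(1), value);
        }
        return result;
    }
}
```

With a parser of this shape, {groovy}, {html} and {velocity} are just macro names without parameters, which is the consistency argument made in the thread.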

...

5) We need to decide if we want: A) No velocity block but document properties/metadata to tell xwiki to render the page using velocity. A user putting velocity code in a page will have to check a box somewhere to say that this is a velocity page.

...

OR B) Velocity blocks same as what exists for groovy/macros/html, namely: {velocity} ... velocity code with wiki syntax allowed {/velocity} Note1: For B) we would allow putting wiki syntax inside the velocity block. Technically we'll apply the velocity rendering and then re-apply wikimodel on the result.

...

My preferences goes to B) and I'm proposing to use that.

The compatibility flag in xwiki.cfg would help, but the problem is that I wouldn't ever be able to turn the flag off if I have a large legacy database... which means none of my new documents get whatever performance benefit might be gained. I would like to say: "I'm running in a compatibility mode, and have flagged all of the documents that pre-dated the new syntax processing model. Shame on me if we don't start using the scripting blocks for new documents." As far as adding {velocity} {/velocity} around content when in compatibility mode.. that would apply to all pages, which means all content would get passed through the velocity renderer, right? I would think that whether or not wikitext (or raw html for that matter) is allowed within a script block {script} {/script} would depend on the nature of the language being invoked, right? Velocity is easily integrated with wikitext and html without a lot of fuss, so, in my view, you would be obligated to allow html and wikitext within that block (generally agreed). An html block would specify that you wouldn't want any wiki parsing at all, right? (wait a minute, isn't that what the pre tags were supposed to do?).

...

6) Mikhail is going to add support for recognizing XML tags in the wikimodel parser (so that onOpenXmlTag()/onCloseXmlTag() events are called in listeners). This is needed for point 7 below.

7) We need to allow intermixing velocity/HTML and wiki syntax easily. For this our listener (the code that listens to {velocity} events) will evaluate the content using velocity and will call wikimodel again on the resulting code. Since wikimodel requires HTML to be in a block ({html} for us) we'll use a different wikimodel listener that intercepts onOpenXmlTag/onCloseXmlTag so that it outputs XML tags with no modifications (the standard HTML listener generates &lt; and &gt; for < and >). This will allow writing:

{velocity}
<div>
<h1>Hello $customer.Name!</h1>
<table>
#foreach( $mud in $mudsOnSpecial )
#if ( $customer.hasPurchased($mud) )
<tr>
<td>
* [link>$flogger.getPromo( $mud )]
</td>
</tr>
#end
#end
</table>
</div>
{/velocity}
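The difference between the standard listener and the pass-through listener described in point 7 can be shown with a tiny Java sketch. Only the event names onOpenXmlTag and onCloseXmlTag come from the thread (and they were still planned at the time); the XmlAwareListener interface, its onText and result methods, and the XmlEvents helper are assumptions made for the example.

```java
/** Hypothetical listener interface built around the two planned XML tag events. */
interface XmlAwareListener {
    void onText(String text);          // assumed event for ordinary text
    void onOpenXmlTag(String name);
    void onCloseXmlTag(String name);
    String result();
}

/** Standard behaviour described in the thread: tags are escaped and shown as text. */
final class EscapingListener implements XmlAwareListener {
    private final StringBuilder out = new StringBuilder();

    public void onText(String text) { out.append(text); }
    public void onOpenXmlTag(String name) { out.append("&lt;").append(name).append("&gt;"); }
    public void onCloseXmlTag(String name) { out.append("&lt;/").append(name).append("&gt;"); }
    public String result() { return out.toString(); }
}

/** Alternative listener for velocity output: tags are written with no modification. */
final class PassThroughListener implements XmlAwareListener {
    private final StringBuilder out = new StringBuilder();

    public void onText(String text) { out.append(text); }
    public void onOpenXmlTag(String name) { out.append('<').append(name).append('>'); }
    public void onCloseXmlTag(String name) { out.append("</").append(name).append('>'); }
    public String result() { return out.toString(); }
}

final class XmlEvents {
    /** Replays the events a parser might emit for the source text "<h1>Hello</h1>". */
    static String replay(XmlAwareListener listener) {
        listener.onOpenXmlTag("h1");
        listener.onText("Hello");
        listener.onCloseXmlTag("h1");
        return listener.result();
    }
}
```

The same event stream gives escaped text with the first listener and live markup with the second, so the choice of listener alone decides whether HTML produced by a velocity block is rendered or displayed literally.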

It seems there is this assumption that people either write wiki OR they write html. I write both. At the same time. I am a nightmare for anyone even thinking about stuff in neat little either/or packages. I write a bunch of wikitext, and then throw some div tags around bits of it, or to make a specific kind of heading that the silly syntax won't generate for me, or.. I don't know that much about wikimodel, but I have to assume it isn't going to complain if you intermix html in your wikitext. The only benefit I see of the {html}{/html} tag is if the WYSIWYG editor will then be smart enough not to bother messing with it. at all.

Does the templating model change? All of the templates/skins to date use velocity, and are rendered using velocity, which seems not to be an issue. Am I right in assuming (therefore) that this discussion applies only to "documents" in the DB, and not to templates or skins? What happens when portions of the templates are defined in a skin document in the DB?

It seems semi-tangential to ask questions about skin/template rendering, but I feel there is a significant amount of interdependency between the skin and the content, especially if you consider the #panel macros, for example. The #panel macros are a) velocity macros, b) defined in files in the filesystem, and c) included and used by every document defining a panel. Do changes to the rendering/parsing API require skin/template *.vm files to be updated with {velocity} {/velocity} tags? Or would the velocity engine continue to pre-read and cache those files as it does now...

A bunch of other random questions:

Is there something we have to keep in mind here RE: document content vs. object field content (which already has a flavor of content rendering attribute)?

Would you have to make script {script} {/script} blocks in textarea content of objects? (I'm thinking Panels, so I'd say yes).

Will the page parser panic if you include the content of a field containing a block within a different content block? e.g.

{velocity}
.. find an object of type x (with field y)
display object field y:
{groovy}
crazy scripting, man
{/groovy}
{/velocity}

As far as use cases go, mine is just this: A LOT of pages (some of which have an extensive history and a lot of attachments): the largest portion have a mix of html and wiki; the second largest is a mix of velocity, html, and wiki; last is pages with groovy (usually on pages also containing html, wiki, AND velocity).

That was rather stream-of-consciousness. So tell me if I was confusing.

-- 'Waste of a good apple' -Samwise Gamgee

Vincent Massol

28 Sep 2007

12:05 p.m.

New subject: [xwiki-devs] [SUMMARY] Re: [Discussion] Designing the new Rendering/Parsing component/API

Hi Erin, On Sep 27, 2007, at 10:15 PM, Erin Schnabel wrote:

...

On 9/26/07, Vincent Massol <vincent(a)massol.net> wrote:

This is the only thing that makes sense to me.

These exports would be only useful for people moving away from xwiki I think and they would be useful only to import into another wiki. The way to import history in each wiki is different I guess so saving the history in a format specific to each wiki will be too much work for us so I'd say not to exporting the history. However exporting any version of a page is doable easily.

...

Consistency is good, as is the lack of smashed-together-curly-braces. If we demarcate all scripts in nice little blocks like this, will the WYSIWYG editor be smart enough to leave it alone?

Yep that's the point too but it means anything inside will be left alone, including wiki syntax and HTML. But that's fine I think.

...

My preference goes to B) and I'm proposing to use that.

You'll get some benefits (since Wikimodel is faster than Radeox) but not all performance gains.

...

I would like to say: "I'm running in a compatibility mode, and have flagged all of the documents that pre-dated the new syntax processing model. Shame on me if we don't start using the scripting blocks for new documents."

Yes I guess that's possible but requires more work. We have so much work to do already that I think this would be left for later or left for the community to implement with patches.

...

As far as adding {velocity} {/velocity} around content when in compatibility mode.. that would apply to all pages, which means all content would get passed through the velocity renderer, right?

Right

...

I would think that whether or not wikitext (or raw html for that matter) is allowed within a script block {script} {/script} would depend on the nature of the language being invoked, right?

Right. It would be allowed for Velocity but not for Groovy, for example.

...

Velocity is easily integrated with wikitext and html without a lot of fuss, so, in my view, you would be obligated to allow html and wikitext within that block (generally agreed).

Yep, that's what I proposed.

...

An html block would specify that you wouldn't want any wiki parsing at all, right?

Yes

...

(wait a minute, isn't that what the pre tags were supposed to do?).

Yes. We'll deprecate the pre tag then since an html tag makes more sense and is better named IMO.

...

). This will allow writing:

It seems there is this assumption that people either write wiki OR they write html. I write both.

Ah I see... You mean when not using the {velocity} tag I guess. Good point.... hmm.... I guess we could allow the {html} macro to interpret wiki syntax too. We could have a parameter like this: {html renderWikiSyntax=true} ... {/html} WDYT?

...

At the same time. I am a nightmare for anyone even thinking about stuff in neat little either/or packages. I write a bunch of wikitext, and then throw some div tags around bits of it, or to make a specific kind of heading that the silly syntax won't generate for me, or.. I don't know that much about wikimodel, but I have to assume it isn't going to complain if you intermix html in your wikitext. The only benefit I see of the {html}{/html} tag is if the WYSIWYG editor will then be smart enough not to bother messing with it. at all.

Yes, it's also possible to have XHTML (not HTML) with the new onOpenXmlTag()/onCloseXmlTag() events that Mikhail is going to add. So we could have a listener that outputs the XHTML as is. It's a good question, I'm not sure which is best. Right now I'd be tempted to use the {html} notation proposed above (with the renderWikiSyntax parameter).

...

Does the templating model change? All of the templates/skins to date use velocity, and are rendered using velocity, which seems not to be an issue. Am I right in assuming (therefore) that this discussion applies only to "documents" in the DB, and not to templates or skins? What happens when portions of the templates are defined in a skin document in the DB?

Templates will be treated the same as documents (as is the case now I think). The templates would use the same {velocity} macro.

...

It seems semi-tangential to ask questions about skin/template rendering, but I feel there is a significant amount of interdependency between the skin and the content, especially if you consider the #panel macros, for example. The #panel macros are a) velocity macros, b) defined in files in the filesystem, and c) are included and used by every document defining a panel. Do changes to the rendering/parsing API require skin/template *.vm files to be updated with {velocity} {/velocity} tags? Or would the velocity engine continue to pre-read and cache those files as it does now...

hmm... Are they cached now? I don't see how they could be since it would mean the velocity doesn't get executed at each request. Yes, I would assume we would use {velocity} blocks.

...

A bunch of other random questions: Is there something we have to keep in mind here RE: document content vs. object field content (which already has a flavor of content rendering attribute)?

We need to bring more consistency to this area too but it's another topic I think.

...

Would you have to make script {script} {/script} blocks in textarea content of objects? (I'm thinking Panels, so I'd say yes).

I think yes.

...

Will the page parser panic if you either include the content of a field containing a block within a different content block? e.g. {velocity} .. find an object of type x (with field y) display object field y: {groovy} crazy scripting, man {/groovy} {/velocity}

That will work. The velocity macro will run velocity on the whole block and then pass the result back to wikimodel. Wikimodel will see the {groovy} macro and the groovy macro will be executed.
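To make that flow concrete, here is a minimal sketch in Python. It is not the WikiModel or XWiki API: the `render`, `run_velocity` and `run_groovy` names are hypothetical, Python's `string.Template` stands in for Velocity, and the "groovy" handler only labels its input instead of executing anything. It assumes macro blocks of the same name are not nested.

```python
import re
from string import Template

# Matches {name}...{/name}; assumes blocks of the same name are not nested.
MACRO = re.compile(r"\{(\w+)\}(.*?)\{/\1\}", re.DOTALL)


def run_velocity(body, context):
    # Stand-in for a Velocity evaluation: substitutes $name references.
    return Template(body).safe_substitute(context)


def run_groovy(body, context):
    # Stand-in for a Groovy evaluation: only marks the script as "executed".
    return "[groovy:" + body.strip() + "]"


def render(text, context):
    """Expand macro blocks; the output of {velocity} is parsed again."""
    def expand(match):
        name, body = match.group(1), match.group(2)
        if name == "velocity":
            # Run velocity on the whole block, then hand the result back to
            # the wiki parser so that macros it produced are seen.
            return render(run_velocity(body, context), context)
        if name == "groovy":
            return run_groovy(body, context)
        return match.group(0)  # unknown macro: leave untouched
    return MACRO.sub(expand, text)


# Field y of some object holds a groovy block, as in the question above.
page = "{velocity}Field y: $y{/velocity}"
context = {"y": "{groovy}crazy scripting, man{/groovy}"}
print(render(page, context))
```

The point of the sketch is the second pass: the {groovy} block only exists after velocity has run, so the parser has to be applied to the velocity output.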

...

As far as use cases go, mine is just this: A LOT of pages (some of which have an extensive history and a lot of attachments): largest portion have mix of html and wiki; second largest is mix of velocity, html, and wiki; last is pages with groovy (usually on pages also containing html, wiki, AND velocity). That was rather stream-of-consciousness. So tell me if I was confusing.

Good questions ;) Thanks -Vincent

Vincent Massol

6:31 p.m.

New subject: [xwiki-devs] [SUMMARY] Re: [Discussion] Designing the new Rendering/Parsing component/API

On Sep 28, 2007, at 12:05 PM, Vincent Massol wrote: [snip]

...

hmm... Are they cached now? I don't see how they could be since it would mean the velocity doesn't get executed at each request. Yes, I would assume we would use {velocity} blocks.

Actually we don't have to do that. The Velocity macro will be able to output wiki syntax from any text so we can still keep the templates in Velocity exactly as they are now. [snip] -Vincent

Mikhail Kotelnikov

1 Oct 2007

12:51 p.m.

New subject: [xwiki-devs] [SUMMARY] Re: [Discussion] Designing the new Rendering/Parsing component/API

Hi! Excuse me please for the late response... On 9/26/07, Vincent Massol <vincent(a)massol.net> wrote:

...

Yes 2) An XWiki instance will use a single syntax at a time. Once the database

...

is created using that syntax it won't be possible to change it (except by doing an export and reimport). We also need to decide what syntax we use by default OOB. I propose we use the current xwiki syntax for some time and then switch to the Wikimodel one (Common Syntax) later on.

I have no choice - I have to vote for the "CommonSyntax" :-) 3) All pages will be able to be exported to any syntax. Some elements have

...

no equivalent in other syntaxes and when this happens a warning will be displayed and the elements in question ignored.

Yes. You can lose data only when you *export* your data from the CommonSyntax, not when you *import* it. The WikiModel/CommonSyntax supports a superset of the elements defined in other wikis.

BTW: I think that it would be useful to involve Max Völkel (http://xam.de/) in the discussion about wiki imports/exports. Max is one of the authors of Semantic MediaWiki. He proposed a Wiki Interchange Format (WIF) <http://eyaloren.org/pubs/semwiki2006-wif.pdf> which covers the topic. At least it would be useful to know his opinion.

4) Mikhail has agreed to modify wikiModel to add macro block recognition. The syntax isn't fully defined yet but it'll be something like:

...

{xxx param1=value1 param2=value2}
...
{/xxx}

(param1='value value' and param1="value value" will also be supported)

This means we'll be able to have a common syntax for xwiki's macros and also for groovy code:

{groovy}
...
{/groovy}

And also for HTML blocks:

{html}
...
{/html}
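As an illustration of the proposed block syntax, the following Python sketch parses macro blocks with parameters into a small tree. It is an assumption-laden toy, not the WikiModel grammar: `parse_params` and `parse_blocks` are hypothetical names, parameter values may not contain braces, and every `{word ...}` is treated as a macro tag (so a Velocity reference such as `${var}` would confuse it).

```python
import re
import shlex

# An opening tag {name params} or a closing tag {/name}.
TAG = re.compile(r"\{(/?)(\w+)([^{}]*)\}")


def parse_params(raw):
    """Parse 'a=b c="d e" f='g h'' into a dict; shlex handles both quote styles."""
    params = {}
    for token in shlex.split(raw):
        key, _, value = token.partition("=")
        params[key] = value
    return params


def parse_blocks(text):
    """Return a list of strings and macro nodes (dicts), nested via 'children'."""
    root = {"name": None, "params": {}, "children": []}
    stack = [root]
    pos = 0
    for m in TAG.finditer(text):
        if m.start() > pos:
            stack[-1]["children"].append(text[pos:m.start()])
        closing, name, raw = m.group(1), m.group(2), m.group(3)
        if closing:
            if len(stack) == 1 or stack[-1]["name"] != name:
                raise ValueError(f"unexpected closing tag {{/{name}}}")
            stack.pop()
        else:
            node = {"name": name, "params": parse_params(raw), "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        pos = m.end()
    if pos < len(text):
        stack[-1]["children"].append(text[pos:])
    if len(stack) != 1:
        raise ValueError(f"unclosed block {{{stack[-1]['name']}}}")
    return root["children"]


tree = parse_blocks('{xxx param1=value1 param2="long value 2"} a {xxx} b {/xxx} c {/xxx}')
print(tree[0]["params"])
```

Because closing tags are matched against a stack, blocks may contain other blocks, including blocks with the same name.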

It is already done and committed. You can use "embedded" macro blocks like

{xxx param1=value1 param2="long value 2" param3='this is a parameter 3'}
... {yyy} ... {/yyy} ...
{/xxx}

Even embedded elements with the same names are possible:

{xxx}
...
{xxx}
... {yyy} ... {/yyy} ...
{/xxx}
...
{/xxx}

5) We need to decide if we want:

...

A) No velocity block but document properties/metadata to tell xwiki to render the page using velocity. A user putting velocity code in a page will have to check a box somewhere to say that this is a velocity page.

Pros:
* Slightly easier to enter velocity code

Cons:
* Exception case compared to groovy, macros, etc
* User must not forget to check the checkbox saying the page contains velocity code

OR

B) Velocity blocks same as what exists for groovy/macros/html, namely:

{velocity}
... velocity code with wiki syntax allowed
{/velocity}

Note1: For B) we would allow putting wiki syntax inside the velocity block. Technically we'll apply the velocity rendering and then re-apply wikimodel on the result.
Note2: For backward compatibility we can have a config flag (xwiki.compatibility = 1) that automatically adds the {velocity}{/velocity} marker around the whole page. The only downside is that it'll be as slow as it currently is (actually it'll be faster since wikimodel is going to be faster than radeox)

Pros:
* Speed. Since we know the blocks that use velocity we can cache all the wiki syntax not inside the velocity/macros/groovy blocks, which will considerably speed up the rendering of pages
* Consistency with macros and groovy blocks.

My preference goes to B) and I'm proposing to use that.
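The caching argument for B) can be sketched as follows, in Python and under loud assumptions: `render_static`, `render_page` and `with_compatibility` are hypothetical names, `string.Template` stands in for Velocity, wrapping text in `<p>` stands in for real wiki rendering, and script blocks are assumed not to be nested.

```python
import re
from functools import lru_cache
from string import Template

# Script blocks are the only dynamic parts; assumes they are not nested.
SCRIPT_BLOCK = re.compile(r"\{(velocity|groovy)\}(.*?)\{/\1\}", re.DOTALL)

render_calls = []  # every static fragment that actually had to be rendered


@lru_cache(maxsize=1024)
def render_static(wiki_text):
    # Stand-in for the wiki renderer; results are cached per fragment.
    render_calls.append(wiki_text)
    return "<p>" + wiki_text.strip() + "</p>"


def render_page(text, context):
    out, pos = [], 0
    for m in SCRIPT_BLOCK.finditer(text):
        static = text[pos:m.start()]
        if static.strip():
            out.append(render_static(static))  # cacheable
        # Dynamic part: evaluated on every request (stand-in for Velocity).
        out.append(Template(m.group(2)).safe_substitute(context))
        pos = m.end()
    tail = text[pos:]
    if tail.strip():
        out.append(render_static(tail))
    return "".join(out)


def with_compatibility(text):
    # Compatibility mode: the whole legacy page becomes one velocity block,
    # so nothing in it can be served from the static cache.
    return "{velocity}" + text + "{/velocity}"


page = "Welcome to the wiki. {velocity}Hello $user{/velocity}"
first = render_page(page, {"user": "guest"})
second = render_page(page, {"user": "admin"})
print(first, second, len(render_calls))
```

In this toy the static fragment is rendered once across both requests, while a page wrapped by `with_compatibility` has no static fragment left to cache, which is the trade-off described in Note2.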

My +1 for B) 6) Mikhail is going to add support for recognizing XML tags in the wikimodel

...

parser so that onOpenXmlTag()/onCloseXmlTag() events are called in listeners. This is needed for point 7 below.

Hmm... Yes, I said that I'll do it. Technically it is simple (simpler than adding "macro" blocks). But conceptually it breaks everything.

Explanation: imagine that the listener already has the onOpenXmlTag(String name, WikiParameters params)/onCloseXmlTag(String name) methods. Then the text:

----------------------
<table>...

This is a content of the table

...</table>
----------------------

will be reported as follows:

----------------------
- onBeginParagraph
- onOpenXmlTag: => "table"
- onEndParagraph

- onBeginParagraph
- onWord/onSpace => "This is a content of the table"
- onEndParagraph

- onBeginParagraph
- onCloseXmlTag: => "table"
- onEndParagraph
----------------------

It means that the onOpenXmlTag/onCloseXmlTag calls will cross the borders of multiple wiki paragraphs. And this is a BIG problem. It means that we have to choose one of the following:

(A) Report HTML tags "as is" inside of wiki structural elements. In the example above the opening and closing "table" tags are reported in two separate wiki paragraphs.
pro: It is the simplest solution.
con: If somebody wants to process these elements and create a well-formed document, then it is up to him/her to do it by hand and to ignore the inappropriate structural wiki elements.

(B) Ignore wiki formatting inside of XML tags. In the example above all wiki paragraphs will be skipped and only the HTML "table" tags will be taken into account; in this case WikiModel cannot guarantee that each opening element was really closed.
pro: It is doable.
con: a) It completely breaks the idea of the WikiModel, which is to give access to a well-formed structure of wiki documents; b) the grammar will be bigger; c) it is not so simple to implement

(C) Add some HTML tags as markup elements for the CommonSyntax. In this case each '<table>...</table>' tag pair will be interpreted in the same way as normal wiki tables. The same for "ol/ul/li/dd/dl/dt/p/span/div/..." elements.
pro: You can mix your wiki and HTML markup with the same meaning and all of them will be reported in the same way to the listeners.
con: a) the grammar will be bigger; b) I have to do many more additional validations of documents by hand to guarantee that the document is well-formed; c) the parsing will be much slower (as a consequence of a and b); d) it is much more difficult to implement; e) the set of possible HTML elements has to be fixed in advance
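The paragraph-crossing problem can be reproduced with a few lines of Python. This is only a sketch: the event names are taken from the trace above, but the tuple representation and the `find_crossing_tags` helper are hypothetical and are not part of WikiModel.

```python
def find_crossing_tags(events):
    """Return names of XML tags opened in one paragraph and closed in another."""
    paragraph = 0
    open_tags = []   # stack of (tag name, index of the paragraph it opened in)
    crossing = []
    for kind, value in events:
        if kind == "onBeginParagraph":
            paragraph += 1
        elif kind == "onOpenXmlTag":
            open_tags.append((value, paragraph))
        elif kind == "onCloseXmlTag":
            if not open_tags or open_tags[-1][0] != value:
                raise ValueError("unbalanced closing tag: " + value)
            name, opened_in = open_tags.pop()
            if opened_in != paragraph:
                crossing.append(name)
    return crossing


# The event sequence from the trace above.
events = [
    ("onBeginParagraph", None),
    ("onOpenXmlTag", "table"),
    ("onEndParagraph", None),
    ("onBeginParagraph", None),
    ("onWord", "This is a content of the table"),
    ("onEndParagraph", None),
    ("onBeginParagraph", None),
    ("onCloseXmlTag", "table"),
    ("onEndParagraph", None),
]
print(find_crossing_tags(events))
```

A listener that wants a well-formed output tree would have to do this kind of bookkeeping itself under option (A), which is the cost named in its "con".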

...

From my point of view neither option is good.

There is another approach to resolve this problem: see below, in the response to the next question. 7) We need to allow intermixing velocity/HTML and wiki syntax easily. For

...

this our listener (the code that listens to {velocity} events) will evaluate the content using velocity and will call wiki model again on the resulting code. Since wikimodel requires HTML to be in a block ({html} for us) we'll use a different wikimodel listener that intercepts the onOpenXmlTag/onCloseXmlTag so that it'll output XML tags with no modifications (the standard HTML listener generates &lt; and &gt; for < and >). This will allow writing:

I propose to interpret the content of such blocks in the following manner:
- The given text is interpreted as "relaxed XML" where all opening XML tags have to be closed (or a tag has to be empty)
- If an XML tag has text content (not only spaces) then this content is interpreted as wiki syntax. In this case the wiki content can be handled by a normal wiki parser.

In this case the usage scenario can be the following:
1. You parse your initial wiki document and create a backbone structure of the document containing macro blocks
2. All macro blocks corresponding to template blocks are handled by the corresponding engine (e.g. velocity)
3. The output from the template engines is used as input for such a "relaxed XML" parser
4. Non-empty text content of tags is interpreted as wiki content using separate wiki parser instances; the content formed by these wiki blocks should be inserted in the document from step 3.

pro:
* It seems that it is the simplest solution and it resolves the problems with intermixing wiki syntax/XML/HTML from the previous point (if such intermixing is available only in macro blocks).
* For such "relaxed XML" content existing XML parsers can be used. It is possible just to add the "<xml>...</xml>" pair around the content and use a normal XML parser directly. But there is a risk that this document is not well-formed XML. Alternatively, an HTML cleaner (JTidy/NekoHTML/...) can be used to get well-formed XHTML before the XML parsing.
* All steps give well-formed structures

con: More parsers in the page processing chain => the processing is slower (this can be partially compensated by additional caches in the future)

Personally I definitely prefer this approach. And from my point of view it resolves the problems with onOpenXmlTag/onCloseXmlTag mentioned above.

8) Documents are stored in textual format in the DB (i.e. as the user sees

...

them). Portions of them will be cached after they're rendered for the first time (see option B above for the best caching option). Are you ok on these points and especially about using the 5B solution? Anything else I've forgotten? If we agree, then my next steps are: * Understand the wikimodel API in more details * Understand the doxia API too (it's a "competitor" to wikimodel). The reason is that I'd like to see two implementations to ensure that the XWiki Interfaces can be implemented using different implementations so that XWiki is independent of the underlying rendering/parsing framework used. Jason Van Zyl is also interested in implementing the doxia part for XWiki in the future.

I think that in any case it is a good idea to have a specialized interface (API) for each type of functionality, to isolate the core of the system from external libraries/frameworks. BTW: Thank you for the pointer to Doxia! I will look at it in more detail. After a brief look I saw that WikiModel contains modules very similar to the "Sink" object of Doxia. I think that it is very simple to create a WikiModel/Doxia bridge. And it seems possible to add support for the APT format (http://maven.apache.org/doxia/references/apt-format.html) directly to the WikiModel. Best regards Mikhail * Propose a XWiki API

...

* Propose an integration path * Implement it using WikiModel Thanks -Vincent

Vincent Massol

10:05 p.m.

New subject: [xwiki-devs] [SUMMARY] Re: [Discussion] Designing the new Rendering/Parsing component/API

I've discussed this email below with Mikhail. We discussed 2 points: * How to support the following:

...

The solution we've come to for rendering the above is the following:
a) When parsed with wikimodel, the velocity macro is called
b) The velocity macro uses an XML parser to parse its content (by adding a top-level XML wrapping element, for example)
c) For each text content found during the XML parsing, call wikimodel on it so that the wiki syntax can be interpreted

Same for the {html} macro. We could also use parameters to decide if XML parsing should be done, if wiki syntax should be interpreted, etc. For example: {html xml=true|false wikisyntax=true|false ...}, {velocity xml=true|false wikisyntax=true|false ...}.

* How to speed up document rendering? We propose several caches:
- level 1: Parse documents using a wikisyntax DOM parser which produces a DOM tree which is cached. This tree is composed of nodes representing macros (unparsed, or XML-parsed into an XML DOM tree which is cached).
- level 2: Cache the non-macro blocks after they are rendered (XHTML) since they are static.
- level 3: Caching at the level of the rendered XHTML. This would be a timed cache (the cache is refreshed every N minutes). This is absolutely required for heavy sites (imagine Apache.org using XWiki). I think we would need to cache pages for users not logged in at least. Not sure how to cache pages when users are logged in.

Note1: level 2 and level 3 caches do not require a DOM tree.
Note2: We need the DOM tree to speed up all macros that act on content, like the TOC macro, as otherwise the content would need to be parsed again (which is hard and slow, whereas traversing the DOM tree is easy and fast). Since the DOM tree is cached, the rendering for these macros will be fast.
Note3: The TOC macro reminds me that we need somehow to support macros that should be rendered last since they operate on the full DOM tree. This is easy to do with a DOM tree but would be quite a bit harder if we were only using a stream.

Any comment? Do you agree? Any other idea?
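Steps b) and c) could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: `render_macro_content`, `render_element`, `render_text` and `render_wiki` are hypothetical names, the standard-library `xml.etree.ElementTree` parser stands in for whichever XML parser would be used, and the "wiki renderer" only understands `**bold**`.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def render_wiki(text):
    # Stand-in for calling the wiki parser on a text fragment:
    # escapes XML special characters and renders **bold** only.
    return re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", escape(text))


def render_text(text):
    # Whitespace-only or missing text is passed through untouched.
    if text is None or not text.strip():
        return text or ""
    return render_wiki(text)


def render_element(elem):
    # XML tags are written back unchanged; only text content is wiki-rendered.
    attrs = "".join(f" {k}={quoteattr(v)}" for k, v in elem.attrib.items())
    inner = render_text(elem.text)
    for child in elem:
        inner += render_element(child) + render_text(child.tail)
    return f"<{elem.tag}{attrs}>{inner}</{elem.tag}>"


def render_macro_content(content):
    """b) wrap and XML-parse the macro content, c) wiki-render each text node."""
    root = ET.fromstring("<xml>" + content + "</xml>")  # raises ET.ParseError
    out = render_text(root.text)
    for child in root:
        out += render_element(child) + render_text(child.tail)
    return out


print(render_macro_content('<div class="note">This is **important**</div>'))
```

Content that is not well-formed XML makes `ET.fromstring` raise `ET.ParseError`; that is the risk mentioned in the discussion, and the place where an HTML cleaner would have to run first.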
Thanks -Vincent

On Oct 1, 2007, at 12:51 PM, Mikhail Kotelnikov wrote: [snip]

xwiki-devs@xwiki.org

33 comments

participants (7)

Erin Schnabel
Fabio Mancinelli
Jean-Vincent Drean
Mikhail Kotelnikov
Sergiu Dumitriu
Stéphane Laurière
Vincent Massol