Re: [xwiki-devs] [Discussion] Designing the new Rendering/Parsing component/API

21 Sep 2007

On Sep 21, 2007, at 12:03 PM, Mikhail Kotelnikov wrote:

[snip]

...
  Note: Velocity or Groovy scripts can generate wiki syntax content
 and thus these would need to generate new DOM elements. Not sure
 how easy that would be.
 I think that with Groovy it can be even simpler to work with nodes than
 with Velocity.
 An example:
 -------------------------------------
 // Velocity:
 -------------------------------------

 Hello $customer.Name!

 <table>
 #foreach( $mud in $mudsOnSpecial )
   #if ( $customer.hasPurchased($mud) )
     <tr><td>$flogger.getPromo( $mud )</td></tr>
   #end
 #end
 </table>

 -------------------------------------
 // Groovy:
 // (see http://groovy.codehaus.org/GroovyMarkup ,
 // http://groovy.codehaus.org/Builders)
 -------------------------------------
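 import groovy.xml.MarkupBuilder

 // Builds the same greeting and promo table as the Velocity template above
 def xml = new MarkupBuilder()
 xml.div() {
     h1("Hello, ${customer.Name}!")
     table() {
         for (mud in mudsOnSpecial) {
             if (customer.hasPurchased(mud)) {
                 tr() { td(flogger.getPromo(mud)) }
             }
         }
     }
 }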
 ------------------------------------- 
Sorry for the formatting loss... I couldn't find a way to keep it  
with my mail client...

I still find the velocity version clearer in your example and I think  
it's way way simpler for non developers.

...
  From my point of view the second example is even simpler than the
 Velocity-based one.
 The advantages of the Groovy-based approach:
 - It generates well-formed HTML (you have no choice, it is done
 automatically :-))
 - It can be compiled directly to Java bytecode and cached
 - It is much more powerful than Velocity
 - You don't have to learn two things at the same time, Velocity and
 Groovy
Currently 90% (a figure I made up ;-)) of xwiki users who are using  
some scripts only learn a single scripting language: velocity :)

Groovy is for developers. It's more powerful definitely but it's for  
developers.

I don't think we can have a single scripting language and I really  
don't think we should have one. I'd rather we support several:  
groovy, velocity, jython, beanshell, jruby, etc.

In addition, with Velocity we control the API we offer to users,
limiting the security issues, whereas with Groovy we can't do that.

...
  - It is possible to write WikiModel-specific builders which will be
 much more efficient than the generic MarkupBuilder from the example
 above. It will be possible to manipulate, say, tables in the
 following way: table[i][j] = "Hello!";

 I think that in any case you have to be a geek to write a
 template :-). You have to understand at least some notions like
 "variables", "if" conditions, "for" loops and so on. And IMHO it
 is simpler to use these structures in a normal programming
 language. And if you are a "normal" user then even Velocity
 templates are completely unreadable for you.
I don't agree. There are different levels of users and there's a
level of users who don't know how to program in a full-fledged language
like Java or Groovy but who know how to do simple things like:

$xwiki.searchDocuments("...")

The reason Velocity is successful is because it's simple and has  
always resisted the temptation to do complex stuff.

...
  Another aspect is that if you have errors in your Velocity template,
 it doesn't save you from exceptions. It fails in the same
 way as badly written Groovy code :-).
[snip]

...
  Now you mention removing Velocity. This won't be possible since all
 XWiki instances currently in use rely on Velocity and we cannot tell
 our users that they have to rewrite all their pages if they want to
 move to XWiki v1.3. We'll need to continue supporting Velocity for
 some time. Personally I currently find that the Velocity syntax
 mixes much better with the wiki syntax than Groovy does. If you look at
 contributed code snippets you'll see that most are in Velocity,
 which is what most people use.

 In any case, if you change the syntax you will have to process your
 scripts as well.
We don't change the syntax! We allow other syntaxes. We definitely  
need to keep the current syntax for a long time to come I think.

But I thought this was the main point of using WikiModel: the ability  
to support several syntaxes :)

...
  About the usage of Groovy and templating... I don't like at all the
 Groovy templates where syntax like <% if (...){%>Hello<%}%> is
 used.
ok that makes 2 of us ;)

...
  It is "inspired" by badly styled JSP/ASP/PHP... But I really like
 Groovy builders, as I wrote above. And I have the impression that
 these builders can easily replace Velocity (or other templates).
I wouldn't replace it. It can be used in addition to Velocity.

...
  Now you mention other stuff about Jasper and Jetty but I'm not sure
 I have understood that part.

 I thought that it would be possible to use the JSP syntax to create  
 templates. So each wiki page can be considered as a jsp page (maybe  
 - if it contains some specific markers in the content) and it can  
 be parsed and compiled as a JSP. The advantage:
 - And, especially, usage of standard tag libraries (like <c:forEach  
 items="${addresses}" var="address">...</c:forEach>) 
Well I don't see that example as an advantage ;)

#foreach ($address in $addresses)

definitely sounds better to me ;)

But I understand your point. Just not sure yet how beneficial this  
would be.

...
  The standard tag libraries give the same functionality (and much
 more) as Velocity. The advantages: it is a standard, it can be
 compiled directly to Java, ...
 - Usage of multiple languages (javascript, jpython, jruby,
 groovy...) if you use syntax like "<% if (...) {%>Hello!<%}%>"
 (I hate this coding style!)

 Personally I would much rather see Groovy templating than a
 JSP-based one. It was just a proposal...
I'm fine with Groovy templates, but not with removing Velocity. Rather,
in addition to it.

What do others think?

Thanks
-Vincent

...
  On Sep 19, 2007, at 6:54 PM, Mikhail Kotelnikov wrote:

  Hello!

 Just some words about what the wiki model is and what it is not.

 The main goal of the WikiModel is the creation of an API giving  
 access and control to the internal structure of individual wiki  
 documents.

 Some features of the WikiModel:
 - WikiModel itself does not depend on any particular wiki syntax
 - The number of possible structural elements and the order in which
 they can be assembled are strictly fixed (which greatly simplifies
 validation and manipulation) but the final result is almost as
 expressive as XHTML (and even more expressive, taking into account
 the notions of properties and embedded documents, which can recursively
 contain their own embedded documents :-)).
 - WikiModel works with a superset of the structural elements
 available in existing wikis. And it has some features not
 available in other wikis. For example, using embedded documents in
 WikiModel it is possible to put a table in a list, and this table
 can contain its own headers, paragraphs, and lists... Or, using
 embedded documents with the notion of properties, it is possible to
 define very complex structured objects directly on a wiki page.
 - There is at least one wiki syntax (the "Common" syntax) giving
 access to all features of the WikiModel. This syntax guarantees
 that all structural elements of the WikiModel can be
 serialized/deserialized without loss of information or structure.
 Using any other syntax can lead to information loss (example: you
 cannot put a table in a table in XWiki or in JSPWiki, which is possible
 using the Common Syntax).
 - One of the goals of the WikiModel is to provide a means to *import*
 information from various wiki engines without information loss.
 The structure of documents can be serialized in various wiki
 syntaxes as well, but there is no guarantee that some information
 will not be lost. Information can be lost when a
 document contains elements which have no representation in a
 particular wiki syntax. Examples: properties; tables in lists;
 parameters of lists, paragraphs, and tables; and so on...
 - All elements managed by the WikiModel can be
 serialized/deserialized using XHTML with additional annotations
 (microformat-like annotations)

 Some features of the CommonSyntax:
 - It is the native syntax of the WikiModel. It provides full access
 to all features of the WikiModel. All structures in the WikiModel
 can be serialized/deserialized in this syntax without any
 information loss
 - It uses markup characters available in most (ideally, in all)
 keyboard layouts (including Russian :-)). So you don't
 have to switch keyboard layouts to write text, tables, lists and
 headers. For example tables can be defined using pipe symbols ("|",
 which is not available in many keyboard layouts) or the "::"
 sequence.
 - If there is a choice then the most commonly used markup is used

 The current version of the WikiModel provides just an event-based
 interface to work with the structure of documents (like SAX for
 XML).
 In previous versions of WikiModel I had a Document Object Model in
 which each structural element had its own object representation.
 In the current version an Object Model is not implemented (yet). I
 thought to create just a set of utility classes manipulating
 the standard XML DOM. Example: the method
 WikiTable#setCellContent(int row, int column, String content) should
 create an XHTML table object, create the required number of cells and
 columns and put the given string content in this node. The same for
 all other structural elements (headers, lists, internal documents,
 properties, styles, macros...)
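
A minimal sketch of the utility-class idea described above, assuming the standard org.w3c.dom API that ships with Java. Only the name WikiTable#setCellContent comes from this thread; create, getCellContent, rowCount and the decision to grow the table on demand are assumptions made for illustration, not WikiModel's actual API.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper wrapping an XHTML <table> element held in a standard XML DOM. */
class WikiTable {
    private final Document document;
    private final Element table;

    private WikiTable(Document document) {
        this.document = document;
        this.table = document.createElement("table");
        document.appendChild(table);
    }

    /** Creates an empty table in a fresh DOM document. */
    static WikiTable create() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            return new WikiTable(doc);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No XML parser available", e);
        }
    }

    /** Adds rows and cells as needed, then stores the text in the cell at (row, column). */
    void setCellContent(int row, int column, String content) {
        childAt(childAt(table, "tr", row), "td", column).setTextContent(content);
    }

    /** Reads a cell; note that this sketch also creates the cell if it is missing. */
    String getCellContent(int row, int column) {
        return childAt(childAt(table, "tr", row), "td", column).getTextContent();
    }

    int rowCount() {
        return table.getChildNodes().getLength();
    }

    /** Returns the index-th child of parent, appending empty elements until it exists. */
    private Element childAt(Element parent, String tag, int index) {
        NodeList children = parent.getChildNodes(); // live list: it grows as we append
        while (children.getLength() <= index) {
            parent.appendChild(document.createElement(tag));
        }
        return (Element) children.item(index);
    }
}

class WikiTableDemo {
    public static void main(String[] args) {
        WikiTable table = WikiTable.create();
        table.setCellContent(1, 2, "Hello!");
        System.out.println(table.rowCount() + " rows, cell(1,2) = " + table.getCellContent(1, 2));
    }
}
```

Because the helper only touches plain DOM elements, the same document could still be handed to any other DOM-based code without going through the helper.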

 On 9/14/07, Vincent Massol <vincent(a)massol.net> wrote:

 +1 to all that. So let me summarize and rephrase to see if I have
 understood :)

 1) We have 4 types of objects:
 * TextProcessors: take text and generate text
 * Parsers: take text and generate an internal DOM format (pivot  
 format)
 * DomProcessors: take DOM and generate DOM
 * Renderers: take DOM and generate anything (text, PDF, RTF, HTML,
 XML, etc)

 Yes.

 2) Document contents are stored in the database in textual format in
 the main xwiki syntax (whatever we decide it is - we could
 standardize on creole for example)

 It can be the "Common Syntax" for the reasons mentioned above :-).
 Creole syntax is one of the most restrictive syntaxes. And I tried
 to use as much Creole markup as possible in the CommonSyntax.

 Another possibility is to store directly in XML or in
 XHTML+microformat enhancements (for additional structural elements).
 pro:
 - it can be exported/imported directly and used by external
 applications which know nothing about wikis; just standard XML
 or XHTML
 - this content can be transformed with XSLT processors directly
 without using the WikiModel
 - it can be faster to parse XML than the CommonWiki syntax (I have
 no comparisons)
 con:
 - it is more difficult to work with diffs (but for diffs it is
 *better* to use WikiModel and to generate a specific wiki syntax,
 for example the "Common syntax");
 - it is not a "human readable" format; it is difficult to
 understand what you load from the DB

 3) Use case 1: Viewing a document

 a) Get the doc from the DB --> text1 (xwiki text format)
 b) Apply TextProcessors --> text2
 c) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an
 internal DOM)
 d) Apply DomProcessors --> DOM2
 e) Call the required Renderer --> PDF, XML, HTML, RTF, text, etc

 Yes.
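
The view flow in use case 1 can be sketched as below. The four role names (TextProcessor, Parser, DomProcessor, Renderer) come from this thread; everything else is an assumption for illustration. In particular the "DOM" is reduced to a list of paragraph strings, and the toy processors in the demo stand in for Velocity, the wiki parser, a DOM script and an HTML renderer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The "DOM" here is just a list of paragraph strings: a stand-in, not WikiModel's model.
interface TextProcessor { String execute(String content); }
interface Parser { List<String> parse(String text); }
interface DomProcessor { List<String> execute(List<String> dom); }
interface Renderer { String render(List<String> dom); }

/** Chains the four steps of the "viewing a document" use case. */
class ViewPipeline {
    private final TextProcessor textProcessor;
    private final Parser parser;
    private final DomProcessor domProcessor;
    private final Renderer renderer;

    ViewPipeline(TextProcessor textProcessor, Parser parser,
                 DomProcessor domProcessor, Renderer renderer) {
        this.textProcessor = textProcessor;
        this.parser = parser;
        this.domProcessor = domProcessor;
        this.renderer = renderer;
    }

    String view(String text1) {
        String text2 = textProcessor.execute(text1);     // b) scripts produce wiki text
        List<String> dom1 = parser.parse(text2);         // c) wiki text -> DOM
        List<String> dom2 = domProcessor.execute(dom1);  // d) DOM -> DOM
        return renderer.render(dom2);                    // e) DOM -> output format
    }
}

class ViewPipelineDemo {
    /** Builds a pipeline out of toy implementations (all hypothetical). */
    static ViewPipeline toyPipeline() {
        TextProcessor variables = text -> text.replace("$name", "World");
        Parser paragraphs = text -> Arrays.asList(text.split("\n\n"));
        DomProcessor blockCounter = dom -> {
            List<String> out = new ArrayList<>();
            out.add("Blocks: " + dom.size());
            out.addAll(dom);
            return out;
        };
        Renderer html = dom -> {
            StringBuilder sb = new StringBuilder();
            for (String paragraph : dom) {
                sb.append("<p>").append(paragraph).append("</p>");
            }
            return sb.toString();
        };
        return new ViewPipeline(variables, paragraphs, blockCounter, html);
    }

    public static void main(String[] args) {
        System.out.println(toyPipeline().view("Hello $name\n\nSecond paragraph"));
    }
}
```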

 4) Use case 2: Editing a document, assuming the user wants to use the
 MediaWiki syntax for editing

 a) Get the doc from the DB --> text1 (xwiki text format)
 b) Call XWikiParser --> DOM1 (transforms XWiki text syntax into an
 internal DOM)
 c) Call MediaWikiRenderer --> text2 (text in MediaWiki format)
 d) the user edits and hits save
 e) MediaWikiParser --> DOM2 (transforms MediaWiki text syntax into
 the internal DOM)
 f) Call XWikiRenderer --> text3 (transforms DOM into xwiki textual
 format)
 g) Save text3 in the database

 Yes. (text1 and text3 can be XML, as I said above)
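
The editing round trip in use case 2 amounts to pairing one parser and one renderer per syntax around a shared pivot format. A rough sketch follows; the class names (Block, LineSyntax, SyntaxConverter) are hypothetical, the pivot "DOM" is a flat list of blocks, and the two heading notations are toy simplifications only loosely modelled on XWiki and MediaWiki headings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for the pivot DOM: one block per line, either a heading or a paragraph. */
class Block {
    final boolean heading;
    final String text;
    Block(boolean heading, String text) { this.heading = heading; this.text = text; }
}

interface SyntaxParser { List<Block> parse(String text); }
interface SyntaxRenderer { String render(List<Block> dom); }

/** Toy syntax in which a heading is a line wrapped in a fixed prefix and suffix. */
class LineSyntax implements SyntaxParser, SyntaxRenderer {
    private final String open;
    private final String close;
    LineSyntax(String open, String close) { this.open = open; this.close = close; }

    @Override public List<Block> parse(String text) {
        List<Block> dom = new ArrayList<>();
        for (String line : text.split("\n")) {
            boolean heading = line.length() >= open.length() + close.length()
                    && line.startsWith(open) && line.endsWith(close);
            dom.add(heading
                    ? new Block(true, line.substring(open.length(), line.length() - close.length()))
                    : new Block(false, line));
        }
        return dom;
    }

    @Override public String render(List<Block> dom) {
        List<String> lines = new ArrayList<>();
        for (Block block : dom) {
            lines.add(block.heading ? open + block.text + close : block.text);
        }
        return String.join("\n", lines);
    }
}

/** Converts between syntaxes by always going through the pivot DOM. */
class SyntaxConverter {
    private final Map<String, SyntaxParser> parsers = new HashMap<>();
    private final Map<String, SyntaxRenderer> renderers = new HashMap<>();

    void register(String syntax, SyntaxParser parser, SyntaxRenderer renderer) {
        parsers.put(syntax, parser);
        renderers.put(syntax, renderer);
    }

    String convert(String text, String from, String to) {
        if (!parsers.containsKey(from) || !renderers.containsKey(to)) {
            throw new IllegalArgumentException("Unknown syntax: " + from + " or " + to);
        }
        return renderers.get(to).render(parsers.get(from).parse(text));
    }
}

class EditRoundTripDemo {
    static SyntaxConverter toyConverter() {
        SyntaxConverter converter = new SyntaxConverter();
        LineSyntax xwiki = new LineSyntax("1 ", "");        // "1 Title"
        LineSyntax mediawiki = new LineSyntax("= ", " =");  // "= Title ="
        converter.register("xwiki", xwiki, xwiki);
        converter.register("mediawiki", mediawiki, mediawiki);
        return converter;
    }

    public static void main(String[] args) {
        SyntaxConverter converter = toyConverter();
        String text2 = converter.convert("1 Title\nSome text", "xwiki", "mediawiki"); // steps b, c
        String edited = text2 + "\nMore";                                             // step d
        System.out.println(converter.convert(edited, "mediawiki", "xwiki"));          // steps e, f
    }
}
```

With N syntaxes this needs N parsers and N renderers rather than a converter for every pair of syntaxes, which is the practical benefit of the pivot format.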

 5) In practice this means the following classes:

 * TextProcessorManager: to chain several text processors

 Yes. But it can be just a composite processor implementing the  
 same ProcessorManager interfaces.
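
A sketch of the composite idea, assuming the String execute(String content) signature proposed further down in this thread. CompositeTextProcessor is a hypothetical name, and the two lambdas in the demo are stand-ins for real Velocity and Groovy processors.

```java
import java.util.Arrays;
import java.util.List;

interface TextProcessor { String execute(String content); }

/** A processor that is itself a chain of processors, applied in registration order. */
class CompositeTextProcessor implements TextProcessor {
    private final List<TextProcessor> delegates;

    CompositeTextProcessor(TextProcessor... delegates) {
        this.delegates = Arrays.asList(delegates);
    }

    @Override public String execute(String content) {
        String result = content;
        for (TextProcessor delegate : delegates) {
            result = delegate.execute(result); // output of one step feeds the next
        }
        return result;
    }
}

class CompositeTextProcessorDemo {
    public static void main(String[] args) {
        // Hypothetical stand-ins for two scripting passes
        TextProcessor first = text -> text.replace("$greeting", "Hello $user");
        TextProcessor second = text -> text.replace("$user", "Admin");
        TextProcessor chain = new CompositeTextProcessor(first, second);
        System.out.println(chain.execute("$greeting!"));
    }
}
```

Since the composite implements the same interface as its members, callers do not need to know whether they hold one processor or a chain, and chains can be nested.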

 * TextProcessor
    - VelocityTextProcessor
    - GroovyTextProcessor

 Yes.

 * WikiParser: Takes wiki syntax and generates a DOM in a XWiki-
 specific format (independent of the different wiki syntaxes).
    - LegacyXWikiWikiParser
    - XWikiWikiParser (or simply use CreoleWikiParser if we want our
 internal format to be Creole)
    - ConfluenceWikiParser
    - MediaWikiWikiParser
    - JSPWikiWikiParser
    - CreoleWikiParser
    - HTMLParser: I think all parsers above need to support HTML since
 the wiki syntaxes can be mixed with HTML. So this HTMLParser is
 probably a parent of the other parsers in some regard. Anyway we need
 this one for the WYSIWYG editor which may need to transform HTML to
 wiki syntax (so we may need a XWikiDomProcessor too to transform into
 XWiki syntax). The alternative (much better) is to have the WYSIWYG
 editor only use the internal XWiki-specific DOM format for all its
 manipulations.

 If you want, you can put HTML as a non-interpreted block
 ("verbatim blocks") and interpret it in the client code. But
 internally the WikiModel does not support "embedded" (X)HTML. The
 main reason: in this case I lose control of the document
 structure. And this control is the main goal of the WikiModel.

 * DomProcessorManager: to chain several DOM processors
 * DomProcessor
    - Don't know yet what we're going to use this for. TOCDomProcessor
 as you say above maybe.

 DOMProcessor can be used to transform the original DOM object
 representing the document in the DB into a new (user- and
 query-specific) DOM object which can contain new elements, generated
 dynamically. Currently all dynamic page elements are interpreted as
 simple Velocity or Groovy scripts and they generate text documents
 which should be parsed using Radeox and transformed to the final
 HTML document. Using the DOM representation it is possible to
 interpret some nodes of this graph as Groovy scripts. In WikiModel
 they will correspond to Verbatim blocks which are opaque for
 WikiModel but they can be interpreted as scripts by the
 DomProcessor(s). And these "Groovy" nodes can be executed and they
 will add new DOM elements to the DOM2. For example this approach
 can be used to generate search results.

 The advantages of this approach:
 - You can put your parsed document DOM1 in the cache, which
 saves you from parsing the document for each query. Parsing is the
 slowest step in the page processing, even if the current version of
 WikiModel is faster than before and should be faster than the
 Radeox processor.
 - Your Groovy scripts will manipulate normal Java classes
 (DOM nodes) and they will produce DOM nodes and not plain text. It
 seems especially interesting taking into account Groovy's Builders
 (http://groovy.codehaus.org/Builders). It is enough to write a
 very simple builder (see http://groovy.codehaus.org/BuilderSupport)
 generating DOM nodes and ... voila! Your Groovy
 node from a wiki page generates search results as DOM nodes!
 These manipulations with DOM objects should be MUCH faster than
 processing plain text for every request. And all following steps are
 fast as well: to generate an HTML page it is enough to visit all
 nodes with an "XHTMLVisitor".
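
The three steps above (cache DOM1, let script nodes produce DOM nodes, render with a visitor) can be sketched roughly as follows. All class names are hypothetical, WikiNode is a deliberately tiny stand-in for a real DOM, and the script runner is an ordinary Java function standing in for a Groovy script.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Minimal stand-in for a DOM node: an element name, optional text, and children. */
class WikiNode {
    final String name;
    final String text;
    final List<WikiNode> children = new ArrayList<>();

    WikiNode(String name, String text) { this.name = name; this.text = text; }

    WikiNode add(WikiNode child) { children.add(child); return this; }
}

/** Caches DOM1 per document so that parsing runs once, not once per request. */
class ParsedDocumentCache {
    private final Map<String, WikiNode> cache = new HashMap<>();
    private final Function<String, WikiNode> parser;
    int parseCount = 0;

    ParsedDocumentCache(Function<String, WikiNode> parser) { this.parser = parser; }

    WikiNode get(String documentName, String source) {
        return cache.computeIfAbsent(documentName, name -> {
            parseCount++;
            return parser.apply(source);
        });
    }
}

/** Builds DOM2 from DOM1, replacing each "script" node by the node the script returns. */
class ScriptDomProcessor {
    private final Function<String, WikiNode> scriptRunner;

    ScriptDomProcessor(Function<String, WikiNode> scriptRunner) { this.scriptRunner = scriptRunner; }

    WikiNode execute(WikiNode node) {
        if (node.name.equals("script")) {
            return scriptRunner.apply(node.text);
        }
        WikiNode copy = new WikiNode(node.name, node.text); // copy, so the cached DOM1 stays intact
        for (WikiNode child : node.children) {
            copy.add(execute(child));
        }
        return copy;
    }
}

/** Renders a node tree as XHTML by visiting every node once. */
class XhtmlVisitor {
    static String render(WikiNode node) {
        StringBuilder out = new StringBuilder("<").append(node.name).append(">");
        if (node.text != null) {
            out.append(node.text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;"));
        }
        for (WikiNode child : node.children) {
            out.append(render(child));
        }
        return out.append("</").append(node.name).append(">").toString();
    }
}

class DomProcessingDemo {
    /** Toy parser: ignores the source and returns a fixed tree containing one script node. */
    static WikiNode parse(String source) {
        return new WikiNode("div", null)
                .add(new WikiNode("p", "Results:"))
                .add(new WikiNode("script", "search"));
    }

    /** Toy script: returns search results directly as nodes, never as text. */
    static WikiNode runScript(String script) {
        return new WikiNode("ul", null)
                .add(new WikiNode("li", "PageA"))
                .add(new WikiNode("li", "PageB"));
    }

    public static void main(String[] args) {
        ParsedDocumentCache cache = new ParsedDocumentCache(DomProcessingDemo::parse);
        ScriptDomProcessor processor = new ScriptDomProcessor(DomProcessingDemo::runScript);
        WikiNode dom2 = processor.execute(cache.get("Main.Search", "..."));
        System.out.println(XhtmlVisitor.render(dom2));
    }
}
```

The processor copies nodes instead of editing them in place, on the assumption that DOM1 is shared between requests through the cache while DOM2 is specific to one user and query.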

 BTW: do you need Velocity at all? Using only Groovy is much
 cleaner. It can be used as THE language of XWiki. It can be used
 as a template *and* programming language at the same time. And if
 you *really* want, it is possible to integrate the Jasper engine (from
 Tomcat) to use it for pure templating. The code from Jetty (the
 org.mortbay.jetty.jspc.plugin package) can be used as an example
 of integration with Jasper (see http://jetty.mortbay.org/xref/index.html).
 In this case in templates it will be possible to use:
 - JSP tag libraries (including standard ones)
 - Multiple scripting languages (like javabeans, javascript,
 jpython, jruby, groovy,...)

 * Renderer
    - XMLRenderer
    - HTMLRenderer
    - PDFRenderer
    - RTFRenderer
    - XWikiRenderer (or simply use CreoleRenderer if we want our
 internal format to be Creole)
    - ConfluenceRenderer
    - MediaWikiRenderer
    - JSPWikiRenderer
    - CreoleRenderer

 Yes. All these renderers should be written if you want to support  
 all these syntaxes. I think that it should not be very difficult.

 WDYT? Do I have it right? :)

 Best regards,
 Mikhail

 Thanks
 -Vincent

 On Sep 13, 2007, at 6:37 PM, Stéphane Laurière wrote:

  Hi Vincent, hi everyone,

 We discussed the WikiModel integration with Mikhail this afternoon.
 Here is our input below.

 Vincent Massol wrote:
> Hi,
>
> I've started working on designing the new Rendering/Parsing
> components and API for XWiki. The implementation will be based on
> WikiModel but we need some XWiki wrapping interfaces around it. Note
> that this is a prerequisite for the new WYSIWYG editor based on GWT
> (see http://www.xwiki.org/xwiki/bin/view/Design/NewWysiwygEditorBasedOnGwt).
>
> I've updated http://www.xwiki.org/xwiki/bin/view/Design/WikiModelIntegration
> with the information below, which I'm pasting
> here so that we can have a discussion about it. I'll consolidate the
> results on that wiki page.

 Componentize the Parsing/Rendering APIs
 ==================================

 We need 4 main components:

 * A Scripting component to manage scripting inside XWiki documents
 and to evaluate them.

 On the topic of scripting we would like to propose a distinction
 between scripts that act on text and scripts that act on the DOM.
 Typically, the rendering flow would be the following, for say
 "text1":

 text1 =TextProcessor=> text2 =Parser=> dom1 =DomProcessor=> dom2 => ...

 - the scripts contained in text1 are processed in the context of
 user1; this results in a new text: text2
 - the parser parses text2 and converts it to a DOM tree, dom1
 - dom1 is processed by scripts that work directly on the DOM (example:
 table of contents generator); this results in dom2
 - dom2 is made available as such or is converted to XML, HTML, PDF,
 etc. depending on the user request

 TextProcessor and DomProcessor would have the following interfaces:

 TextProcessor
 - String execute(String content)

 DomProcessor
 - DOM execute(DOM content)

 That means we should have a syntax to distinguish between scripts that
 generate text content, and scripts that manipulate the DOM.

>      * A Rendering component to manage rendering Wiki syntax into
> HTML and other formats (PDF, RTF, etc)
>      * A Wiki Parser component to offer a typed interface to XWiki
> content so that it can be manipulated
>      * A HTML Parser component (for the WYSIWYG editor)
>
> Different Syntaxes
> ===============
>
> Two possible solutions:
>
>     1. Have a WikiSyntax Object (a simple class with one property: a
> combo box with different syntaxes: XWiki Legacy, Creole, MediaWiki,
> Confluence, JSPWiki, etc) that users can attach to pages to tell the
> Renderers what syntax is used. If no such object is attached then
> it'll default to XWiki's default syntax (XWiki Legacy or Creole for
> example).
>     2. Have some special syntax, independent of the wiki syntaxes, to
> tell the Renderer that such block of content should be rendered with
> that given syntax. Again there would be a default.

 Here's our view regarding the syntax used in wiki edit mode: documents
 requested for editing are available from the database in a serialized
 format, for instance XHTML. When entering the edit action, the
 user indicates his preferred syntax. If the text of the requested document
 contains some blocks that are not handled by the chosen syntax, the
 user gets a warning (example: the document contains a table as a list item,
 and the user tries to edit the document using JSPWiki syntax). If not,
 WikiModel converts the serialized format into a DOM, the user edits
 the DOM and the WikiModel serializer serializes it back when the user
 saves it.
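
One way to picture the warning step: each syntax declares which kinds of structural elements it can express, and the edit action compares that set with what the document actually contains. The sketch below is only an illustration; SyntaxCapabilities and the element names are hypothetical, and the capability lists are made up apart from the table-in-a-list case taken from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Records which structural elements each editing syntax is able to express. */
class SyntaxCapabilities {
    private final Map<String, Set<String>> supported = new HashMap<>();

    void declare(String syntax, String... elements) {
        supported.put(syntax, new HashSet<>(Arrays.asList(elements)));
    }

    /** Returns, in sorted order, the document's element kinds the syntax cannot express. */
    List<String> unsupported(String syntax, Collection<String> documentElements) {
        Set<String> known = supported.getOrDefault(syntax, Collections.<String>emptySet());
        Set<String> missing = new TreeSet<>(documentElements);
        missing.removeAll(known);
        return new ArrayList<>(missing);
    }
}

class EditWarningDemo {
    static SyntaxCapabilities toyCapabilities() {
        SyntaxCapabilities capabilities = new SyntaxCapabilities();
        // Illustrative lists only, not a statement of what each syntax really supports
        capabilities.declare("common", "paragraph", "list", "table", "table-in-list", "property");
        capabilities.declare("jspwiki", "paragraph", "list", "table");
        return capabilities;
    }

    public static void main(String[] args) {
        List<String> document = Arrays.asList("paragraph", "list", "table-in-list");
        List<String> problems = toyCapabilities().unsupported("jspwiki", document);
        if (!problems.isEmpty()) {
            System.out.println("Warning: cannot be edited without loss: " + problems);
        }
    }
}
```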

 Note that the DOM representation of wiki documents in the latest
 version of WikiModel is still pending.

 XWiki Interfaces
 =============

      * ScriptingEngineManager: Manages the different Scripting
 Engines, calling them in turn.
      * ScriptingEngine
            o Method: evaluate(String content)
            o Implementation: VelocityScriptingEngine
            o Implementation: GroovyScriptingEngine
      * RenderingEngineManager: Manages the different Rendering
 Engines, calling them in turn.
      * RenderingEngine
            o Method: render(String content)
            o Implementation: XWikiLegacyRenderingEngine (current
 rendering engine)
            o Implementation: WikiModelRenderingEngine
      * Parser: content parsing
            o HTMLParser: parses HTML syntax
            o WikiParser: parses wiki syntax
            o Implementation: WikiModelHTMLParser
            o Implementation: WikiModelWikiParser

 Open Questions:

      * Does WikiModel support a generic syntax for macros? 
 WikiModel generates events for blocks that are not to be parsed
 (typically because they contain scripts).

 For example, in the WikiModel syntax currently called "CommonSyntax",
 this looks like the following:
 ==============
 {{{macro:mymacro (String parameters)
 dothis
 dothat

 }}}

 $mymacro(parameters)
 ==============

 For each syntax, macro blocks are identified as far as possible (we
 still have to check it's the case for all types of macro blocks
 indeed).

>      * Is the Rendering also in charge of generating PDF, RTF,
> XML, etc?
>            o I think so, need to modify interfaces above to reflect
> this.
      * The WikiParser needs to recognize scripts since this is
 needed for the WYSIWYG editor.
 the WikiModel parser recognizes scripts indeed.
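
To illustrate what "generating events for blocks that are not to be parsed" could look like, here is a small event-based scanner for the {{{ ... }}} delimiters shown in the CommonSyntax example above. It is a sketch under assumptions: BlockListener, VerbatimScanner and RecordingListener are hypothetical names and not WikiModel's API, and real parsing involves far more than splitting on two delimiters.

```java
import java.util.ArrayList;
import java.util.List;

/** Receives one event per chunk of the source, in document order. */
interface BlockListener {
    void onText(String text);         // ordinary wiki text, to be parsed further
    void onVerbatim(String content);  // opaque block, e.g. a macro or script body
}

class VerbatimScanner {
    /** Splits the source on {{{ ... }}} blocks and reports each chunk as an event. */
    static void scan(String source, BlockListener listener) {
        int pos = 0;
        while (pos < source.length()) {
            int open = source.indexOf("{{{", pos);
            int close = open < 0 ? -1 : source.indexOf("}}}", open + 3);
            if (close < 0) {
                // No complete block left: the remainder is plain text
                listener.onText(source.substring(pos));
                return;
            }
            if (open > pos) {
                listener.onText(source.substring(pos, open));
            }
            listener.onVerbatim(source.substring(open + 3, close));
            pos = close + 3;
        }
    }
}

/** Listener that simply records the events it receives. */
class RecordingListener implements BlockListener {
    final List<String> events = new ArrayList<>();
    @Override public void onText(String text) { events.add("text:" + text); }
    @Override public void onVerbatim(String content) { events.add("verbatim:" + content); }
}

class VerbatimScannerDemo {
    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        VerbatimScanner.scan("intro {{{macro:mymacro\ndothis\n}}} $mymacro()", listener);
        for (String event : listener.events) {
            System.out.println(event);
        }
    }
}
```

A WYSIWYG editor could plug in a listener that shows verbatim blocks in a special way, while a scripting component could plug in one that executes them.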

 Mikhail and Stéphane

>
> Use cases
> ========
>
>      * View page
>            o ViewAction -- template ->
> ScriptingEngineManager.evaluate() -- wiki syntax ->
> RenderingEngineManager.render() ---> HTML, XML, PDF, RTF, etc
>      * Edit page in WYSIWYG editor
>            o Uses the WikiParser to create a "DOM" of the page
> content and to render it accordingly. NOTE: This is required since
> rendering in the WYSIWYG editor is different from the final
> rendering. For example, macros need to be shown in a special way to
> make them visible, etc.
>            o Changes done by the user are entered in HTML. Note: it
> would be better to capture them so that they are entered in the
> "DOM". Is that possible? If not, then the HTMLParser is used to
> convert from HTML to Wiki Syntax but there's likely to be some loss in
> the conversion. The advantage is the ability to take any HTML content
> and generate wiki syntax from it.
>
>
> This is my very early thinking but I wanted to make it visible to
> give everyone the chance to 1) know what's happening and 2) suggest
> ideas.
>
> I'll refine this in the coming days and post again on this thread.
>
 _______________________________________________
 devs mailing list
 devs(a)xwiki.org
 http://lists.xwiki.org/mailman/listinfo/devs

