Then the import tool should probably ask for the input source type
(MediaWiki Blob1, MediaWiki Whatever2, Confluence Zip, etc.) and use
the corresponding implementation to read the data and pass it to the
parsers.
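For example (a rough sketch; all the names here are invented, not an
existing XWiki API):

    import java.io.IOException;
    import java.io.InputStream;

    // One implementation per supported input source type. The import
    // tool picks an implementation based on what the user selects,
    // then hands the raw pages to the syntax parsers.
    interface InputSource {
        // Name shown in the import UI, e.g. "MediaWiki XML dump"
        // or "Confluence ZIP export".
        String getName();

        // Read the export and invoke the handler once per page found.
        void readPages(InputStream export, PageHandler handler)
            throws IOException;
    }

    interface PageHandler {
        void onPage(String pageName, String rawContent);
    }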
Now, I haven't thought much about this, so maybe you have some better
idea that I'm not seeing.
Thanks
-Vincent
If Import:
  Step 5: Perform import (1)
If Export:
  Step 5: Perform export (2)

Does this make sense? I will give more information about (1) and (2)
in the proposal.
Thanks,
Regards,
Keerthan
> Use Case 1:
> The user provides a URL to a wiki page in another wiki and presses
> the export button.
> The expected result would be:
> a) the source wiki provides an export tool, and then we can use this
> tool to export
> b) the source wiki does not provide an export tool, and in this case
> we have to find a way to export
>
> After exporting, we need to get the remote page and then use the
> WikiModel API in order to parse and convert it into a proper format
> for the destination wiki.
I don't think we need this use case of exporting a single wiki page.
It's easier to just copy/paste it into an XWiki page.
Sure, I understand.
> Use Case 2:
> The user provides an exported XML file from another wiki.
> In this case there is no problem with retrieving the page remotely.
I don't think it should be restricted to XML. Each wiki will have its
own export format.
I said XML because I was looking at MediaWiki and got some XML file,
but I understand the input format can be of any kind.
> Is there anything wrong with this approach? Many thanks
Thanks
-Vincent
PS: I haven't followed the past discussion so I may be repeating what
was already said... sorry if that's the case.
No prob.
>
>> Regards,
>> Keerthan
>>
>>
>>
>> 2009/3/31 Keerthan MUTHURASA <muthurasa.keerthan(a)gmail.com>
>>
>>> Hello Guillaume,
>>>
>>> Thanks a lot for your quick answer.
>>>
>>> 2009/3/31 Guillaume Lerouge <guillaume(a)xwiki.com>
>>>
>>> Hi Keerthan,
>>>>
>>>> On Tue, Mar 31, 2009 at 1:21 AM, Keerthan MUTHURASA <
>>>> muthurasa.keerthan(a)gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Many thanks for all these helpful details.
>>>>>
>>>>> 2009/3/30 Guillaume Lerouge <guillaume(a)xwiki.com>
>>>>>
>>>>>> Hi Keerthan,
>>>>>>
>>>>>> thanks for your interest in XWiki & the GSoC. I'll try
>>>>>> answering some of your questions below.
>>>>>
>>>>>
>>>>>> On Sat, Mar 28, 2009 at 9:30 PM, Keerthan MUTHURASA <
>>>>>> muthurasa.keerthan(a)gmail.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am Keerthan Muthurasa, an MSc Software Engineering student
>>>>>>> at Oxford Brookes University, and I am interested in doing a
>>>>>>> project for XWiki.
>>>>>>>
>>>>>>> I would like to discuss my ideas for the "Import Export from
>>>>>>> any other Wiki" project, and if you could give me your opinions
>>>>>>> in return, that would be really helpful for me.
>>>>>>> This is the project requirement:
>>>>>>>
>>>>>>>
>>>>>>> ____________________________________________________________
>>>>>>>
>>>>>>> Import Export from any other Wiki
>>>>>>> <http://dev.xwiki.org/xwiki/bin/view/GoogleSummerOfCode/ImportExportfromanyo…>
>>>>>>>
>>>>>>> Create an extensible framework to import/export data between
>>>>>>> wikis. This should handle converting the data in the pages,
>>>>>>> including links between pages and metadata, as well as direct
>>>>>>> access to the data through either a web service (preferred), a
>>>>>>> database, or the file system.
>>>>>>>
>>>>>>> The system should at least work for MediaWiki and Confluence
>>>>>>> in import mode.
>>>>>>>
>>>>>>> ____________________________________________________________
>>>>>>>
>>>>>>> I will begin with some questions:
>>>>>>>
>>>>>>> * What does it mean when talking about converting links between
>>>>>>> pages? (We are talking about converting internal links in the
>>>>>>> source wiki, aren't we? That means when importing or exporting
>>>>>>> data we should think about exporting or importing the linked
>>>>>>> data as well in order to keep integrity.)
>>>>>>
>>>>>>
>>>>>> Indeed. Most of the time, the use case will be to import a full
>>>>>> wiki rather than subparts, thus links would be preserved. If you
>>>>>> want to let users import/export only subparts of a wiki (such as
>>>>>> a space or a single page), you should provide them with a
>>>>>> warning that some links will be broken rather than trying to
>>>>>> import all pages that are linked to. Or you could make importing
>>>>>> linked-to pages an option. It could result in surprised users if
>>>>>> someone tries to export/import one page and ends up with the 76
>>>>>> pages that page linked / was linked to ;-)
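(To illustrate the warning approach with a rough sketch; nothing
below is an existing API, just the idea:)

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Set;

    class LinkChecker {
        // Warn about links pointing outside the set of pages being
        // imported, instead of silently pulling those pages in too.
        static List<String> findBrokenLinks(Set<String> importedPages,
                                            Set<String> linkedPages) {
            List<String> warnings = new ArrayList<String>();
            for (String target : linkedPages) {
                if (!importedPages.contains(target)) {
                    warnings.add("Link to '" + target + "' will be "
                        + "broken: the page is not part of the import.");
                }
            }
            return warnings;
        }
    }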
>>>>>
>>>>>
>>>>> I understand, I will keep these details in mind.
>>>>>
>>>>>
>>>>>>
>>>>>> Since the most common use case is to import a full wiki, it
>>>>>> shouldn't be much of an issue.
>>>>>>
>>>>>>> * What does it mean when talking about exporting metadata,
>>>>>>> direct access to data through either a web service or database
>>>>>>> or file system?
>>>>>>
>>>>>>
>>>>>> Some metadata can be preserved across systems. For instance,
>>>>>> the date when the page was created, its last edit date and its
>>>>>> previous versions might need to be preserved (if that's
>>>>>> technically feasible). Thus it basically means taking care of
>>>>>> all the information associated with the page other than its
>>>>>> content.
>>>>>>
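(To make the metadata point concrete, here is a rough sketch of the
kind of value object a converter could carry across; the exact field
list is only a guess:)

    import java.util.Date;
    import java.util.List;

    // Everything associated with a page other than its content, to
    // be preserved across systems when technically feasible.
    class PageMetadata {
        String author;          // last author, if the source exposes it
        Date creationDate;      // when the page was created
        Date lastEditDate;      // when it was last modified
        List<String> versions;  // previous revision ids, if available
    }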
>>>>>>> Here is my idea for the project; if I can have some feedback
>>>>>>> it would be helpful for me:
>>>>>>>
>>>>>>> When exporting or importing data from a given wiki to a
>>>>>>> destination one:
>>>>>>> Step 1: get rid of all the specific syntax proper to the source
>>>>>>> wiki and retrieve data, metadata, and other useful information.
>>>>>>> This can be achieved using a kind of parser whose job is to
>>>>>>> scan the source page, recognize the specific syntax and only
>>>>>>> retrieve the proper data. Concerning encountered links, we
>>>>>>> should convert these pages as well, but we have to be careful
>>>>>>> when they are cross-linked (for instance, we are converting
>>>>>>> page A and A links to B, but when converting B, B links back
>>>>>>> to A).
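(The cross-link case is a classic graph traversal: a visited set keeps
A -> B -> A from looping forever. A rough sketch, with an invented
WikiSource interface standing in for real source wiki access:)

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.Set;

    interface WikiSource {
        void convert(String pageName);
        Set<String> getLinkedPages(String pageName);
    }

    class LinkedPageConverter {
        // Convert 'start' and every page reachable from it, each page
        // exactly once, even when pages link back to each other.
        static void convertAll(String start, WikiSource source) {
            Set<String> visited = new HashSet<String>();
            Deque<String> queue = new ArrayDeque<String>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String page = queue.remove();
                if (!visited.add(page)) {
                    continue; // already converted, skip the cycle
                }
                source.convert(page);
                queue.addAll(source.getLinkedPages(page));
            }
        }
    }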
>>>>>>
>>>>>>
>>>>>> You could start by looking at the XWiki 2.0 syntax and see
>>>>>> everything it allows. I think that when trying to convert pages
>>>>>> from other wikis (specifically Confluence) you will run into the
>>>>>> following issue: some pages use macros that are defined
>>>>>> elsewhere on the system and won't work correctly when imported
>>>>>> into XWiki.
>>>>>
>>>>>
>>>>> I already had a look at the XWiki 2.0 syntax.
>>>>>
>>>>>
>>>>>> For pure content, you should be able to import it all into
>>>>>> XWiki without much of a problem. For content generated by a
>>>>>> script, you could try to identify it and then issue warnings in
>>>>>> your output such as "this is specific content that couldn't be
>>>>>> converted".
>>>>>> See my answer above about retrieving the content of linked-to
>>>>>> pages.
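(A rough sketch of that warning idea for macros; the pattern and the
placeholder text are made up, and a real converter would first check
the macro name against a list of supported mappings:)

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class MacroStripper {
        // Confluence-style {macro} calls, crudely matched.
        private static final Pattern MACRO =
            Pattern.compile("\\{([a-z-]+)[^}]*\\}");

        // Replace each macro call with a visible warning so the user
        // knows something was dropped during conversion.
        static String stripUnsupported(String content) {
            Matcher m = MACRO.matcher(content);
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                m.appendReplacement(out, Matcher.quoteReplacement(
                    "[WARNING: macro '" + m.group(1)
                    + "' couldn't be converted]"));
            }
            m.appendTail(out);
            return out.toString();
        }
    }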
>>>>>
>>>>>
>>>>> I had a look at some of the previous threads in the mailing list
>>>>> regarding the import/export feature.
>>>>>
>>>>>
>>>>>>
>>>>>>> Step 2: adopt a data-centric approach to properly store the
>>>>>>> data in such a way that it is easy to retrieve. We have to be
>>>>>>> careful when storing the data since it has to keep the original
>>>>>>> page structure.
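(Concretely, "keeping the original structure" could be as simple as
keying converted pages by their original location, so the space/page
hierarchy survives the round trip. Rough sketch, invented names:)

    import java.util.LinkedHashMap;
    import java.util.Map;

    class ImportedPage {
        String content;        // already converted to the target syntax
        PageMetadata metadata; // see the metadata sketch above
    }

    class ImportedWiki {
        // "Space.PageName" -> converted page, kept in original order
        // so the destination wiki can be populated with the same
        // structure the source wiki had.
        private final Map<String, ImportedPage> pages =
            new LinkedHashMap<String, ImportedPage>();

        void add(String space, String name, ImportedPage page) {
            pages.put(space + "." + name, page);
        }

        ImportedPage get(String space, String name) {
            return pages.get(space + "." + name);
        }
    }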
>>>>>>
>>>>>>
>>>>>> Have you already looked at the content of a MediaWiki,
>>>>>> Confluence and XWiki export file?
>>>>>
>>>>>
>>>>> Nope, I did not, but I had a little idea about the format since
>>>>> several parsers (XWikiParser, MediaWikiParser, ...) are dealing
>>>>> with a DOM.
>>>>> Where can I get these different exported files? What is the
>>>>> usual case when getting these files? Are we using any export
>>>>> utilities from the source wiki in order to get an XML format
>>>>> file? I will investigate that for MediaWiki and Confluence.
>>>>>
>>>> You can download Confluence from this page:
>>>> http://www.atlassian.com/software/confluence/ConfluenceDownloadCenter.jspa
>>>> to install it locally and play with it a bit. You could
>>>> specifically give a look to
>>>> http://confluence.atlassian.com/display/DOC/Confluence+to+XML .
>>>> Similar documentation is probably available from MediaWiki as
>>>> well but you'll have to look it up by yourself ;-)
>>>>
>>>
>>> Thanks a lot. I am having a look at the MediaWiki export format.
>>> I will set up Confluence as well.
>>>
>>>
>>>>
>>>>>> In XWiki's case, data is stored in an XML format. It might be
>>>>>> the same for Confluence & MediaWiki. If it is, you might be
>>>>>> able to use XSLT to convert one XML format to another.
>>>>>>
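(If the formats do line up, the standard javax.xml.transform API is
all you need; a quick sketch, where the stylesheet and file names are
of course invented:)

    import java.io.File;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class ExportConverter {
        public static void main(String[] args) throws Exception {
            // mediawiki-to-xwiki.xsl would hold the mapping rules
            // between the two XML export formats.
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(
                    new File("mediawiki-to-xwiki.xsl")));
            t.transform(
                new StreamSource(new File("mediawiki-export.xml")),
                new StreamResult(new File("xwiki-import.xml")));
        }
    }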
>>>>>
>>>>> I had a look at some of your previous discussions concerning the
>>>>> export/import features.
>>>>> If I understood properly:
>>>>>
>>>>> XWikiParser => Transform XWiki-format text into a DOM
>>>>> representation
>>>>> MediaWikiRenderer => Render MediaWiki-format text from a DOM
>>>>> representation
>>>>> MediaWikiParser => Transform MediaWiki-format text into a DOM
>>>>> representation
>>>>> XWikiRenderer => Render XWiki-format text from a DOM
>>>>> representation
>>>>>
>>>>> Using the same idea it's possible to do the same thing for any
>>>>> other wiki if we are aware of that wiki's syntax.
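(That parser/renderer split is what makes any-to-any conversion cheap:
pick the parser for the source syntax and the renderer for the target
syntax, and the DOM in the middle never changes. A rough sketch with
invented interfaces, since the real WikiModel types differ:)

    import java.io.Reader;
    import java.io.StringReader;

    class Document { /* the shared DOM representation */ }

    interface WikiParser {
        Document parse(Reader source) throws Exception;
    }

    interface WikiRenderer {
        String render(Document dom);
    }

    class Converter {
        // e.g. convert(text, new MediaWikiParser(), new XWikiRenderer())
        static String convert(String text, WikiParser parser,
                              WikiRenderer renderer) throws Exception {
            Document dom = parser.parse(new StringReader(text));
            return renderer.render(dom);
        }
    }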
>>>>>
>>>>> Within WikiModel I found some references to:
>>>>> org.wikimodel.wem.xwiki.XWikiParser
>>>>> org.wikimodel.wem.mediawiki.MediaWikiParser
>>>>> org.wikimodel.wem.jspwiki.JspWikiParser
>>>>> org.wikimodel.wem.creole.CreoleWikiParser
>>>>>
>>>>> Where can I find the source code for these elements?
>>>>
>>>>
>>>>
>>>> http://code.google.com/p/wikimodel/source/browse/#svn/trunk/org.wikimodel.wem/src/main/java/org/wikimodel/wem/xwiki
>>>>
>>>
>>> Great, I just had a look at the source code and I am delighted to
>>> see that it's exactly what I have been doing during these last 3
>>> months as part of a Compiler Construction course with professor
>>> *Hanspeter Mössenböck*. I wrote a compiler for a simplified
>>> Java-like language. The idea is more or less the same here;
>>> I will give more details about how I plan to do it in my proposal.
>>>
>>>
>>>>
>>>>
>>>>> There were some issues concerning incompatible syntax between
>>>>> wikis in the discussion. Especially issues concerning syntax
>>>>> that can exist in some wiki and does not exist in another (for
>>>>> example Confluence, which is quite restrictive, and the macro
>>>>> problem as referred to by Guillaume). Are there any solutions
>>>>> found for this kind of issue, or should you just warn that some
>>>>> information will be omitted?
>>>>
>>>>
>>>> I think that trying to convert everything is too idealistic. In
>>>> the case of the Office Importer, content that cannot be converted
>>>> properly is stripped and warnings are issued for unsupported
>>>> macros, for instance. I'll let Asiri tell you more about the
>>>> Office Importer conversion behavior if needed.
>>>>
>>>
>>> It could be helpful for me if I could have some feedback from
>>> Asiri.
>>>
>>>
>>>>
>>>>> As far as I can see there is work already done for the
>>>>> export/import feature, so what's wrong with the existing work?
>>>>> There are a lot of changes between XWiki 1.0 and 2.0 syntax. I
>>>>> guess XWikiParser and XWikiRenderer have been modified according
>>>>> to these changes?
>>>>
>>>>
>>>> XWiki 1.0 syntax was using the Radeox parser while XWiki 2.0
>>>> syntax is using WikiModel. You would definitely be working with
>>>> WikiModel a lot, improving WikiModel's Confluence & MediaWiki
>>>> syntax parsers so that they can eventually issue XDOM elements.
>>>
>>>
>>> Alright.
>>>
>>>
>>>>
>>>>
>>>>> To finish with my long list of questions (sorry about that, I am
>>>>> just trying to understand the existing work), can I have a use
>>>>> case for importing data from Confluence to XWiki? (From getting
>>>>> the input data file to the expected result in XWiki.)
>>>>
>>>> You can get an input file from the Confluence instance you will
>>>> have installed on your machine. You can also give a look to the
>>>> XE 1.8 default XAR (available to download from XWiki.org) to see
>>>> what the expected result looks like.
>>>
>>>
>>> Thank you Guillaume.
>>> Keerthan
>>>
>>>> Guillaume
>>>>
>>>>
>>>>> Many thanks again for your answer.
>>>>>
>>>>> Best regards,
>>>>> Keerthan Muthurasa
>>>>> Msc Software Engineering,
>>>>> School of Technology,
>>>>> Oxford Brookes University
>>>>>
>>>>>>
>>>>>>> Step 3: use the previously retrieved data to create the result
>>>>>>> page in the destination wiki, using the wiki-specific syntax
>>>>>>> of the destination wiki.
>>>>>>
>>>>>>
>>>>>> See my answer above.
>>>>>>
>>>>>>> I am having a look at WikiModel, which seems to contain a
>>>>>>> parser. I am also trying to understand Plexus.
>>>>>>>
>>>>>>> Many thanks for your advice.
>>>>>>
>>>>>> Hope this helps,
>>>>>> Guillaume
>>>>>>
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs