[xwiki-dev] [Proposal] New Importer Architecture

Vincent Massol vincent at massol.net
Fri Apr 6 16:53:50 CEST 2007


On Apr 6, 2007, at 3:25 PM, Sergiu Dumitriu wrote:

>
>
> On 4/6/07, Vincent Massol <vincent at massol.net> wrote:
> Hi,
>
> I had some time in the train yesterday so I thought about what a new
> Importer architecture would look like.
>
> First the reason for changing the current one (which is located in
> the Package plugin):
>
> * "bad" design (everything is mixed up in one big class, not modular)
> and too complex to maintain
> * cannot import HTML, plain text, etc
> * cannot convert from one wiki syntax to another
>
> Proposal
> =======
>
>      * An Importer interface to represent the different importers
>            o import(Converter converter, DocumentImportFactory  
> factory)
>            o setFilter(ImportFilter filter) : to decide what document
> to import
>      * A Converter interface to convert original content before it's
> imported into a page
>            o OutputStream convert(InputStream originalContent)
>
> Maybe something to configure the converter? Can we make something
> general enough, or do we let implementations provide custom
> methods? A general approach is to provide plain get/set methods, like
> for a hashmap.

Yes. Also required parameters will be passed in the constructor.  
Let's defer this till we start the implementation.
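
To make the idea concrete, here is a rough sketch of what "required parameters in the constructor, optional ones through generic get/set" could look like. Nothing here is decided: setParameter/getParameter, the "trim" option and the charset argument are names made up for illustration, and convert() simply follows the signature from the proposal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of the Converter interface with hashmap-style configuration.
interface Converter {
    OutputStream convert(InputStream originalContent);

    // Assumed generic accessors for optional settings (not part of the proposal).
    void setParameter(String name, Object value);

    Object getParameter(String name);
}

// Illustrative converter: the source charset is required, so it goes in the constructor.
class PlainTextConverter implements Converter {
    private final Charset sourceCharset;
    private final Map<String, Object> parameters = new HashMap<>();

    PlainTextConverter(Charset sourceCharset) {
        this.sourceCharset = sourceCharset;
    }

    @Override
    public void setParameter(String name, Object value) {
        parameters.put(name, value);
    }

    @Override
    public Object getParameter(String name) {
        return parameters.get(name);
    }

    @Override
    public OutputStream convert(InputStream originalContent) {
        try {
            String text = new String(originalContent.readAllBytes(), sourceCharset);
            // "trim" is a made-up optional parameter, off unless explicitly set to TRUE.
            if (Boolean.TRUE.equals(parameters.get("trim"))) {
                text = text.trim();
            }
            // Output is always re-encoded as UTF-8 in this sketch.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            out.write(text.getBytes(StandardCharsets.UTF_8));
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The map-based accessors keep the interface stable when a converter grows new options, at the cost of losing compile-time checking of option names and types.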

>
>      * A DocumentImportFactory interface for delegating how pages are
> created. This is important as there are different strategies for
> finding out the following data from the original content:
>            o Language
>            o Target Space
>            o Target Page name
>            o Objects to attach
>            o Attachments
>            o Versions
>            o Author
>            o API:
>                  + XWikiDocument createDocument(String
> originalFileName, InputStream contentAfterConversion)
>                  + setMode(REPLACE || APPEND): whether to create a
> new version or replace any existing doc
>
> I don't know if the name is good. DocumentImportFactory doesn't  
> sound like a factory that creates documents to me (maybe it's just  
> me), so I'd say that DocumentFactory is enough, if it resides in an  
> import package.

Yeah, I'm not too sure about that name. I'm fine with
DocumentFactory.
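
With that rename, the factory API from the proposal might look roughly like the sketch below. It is only an illustration: ImportedDocument is a stand-in for XWikiDocument so the snippet compiles on its own, ImportMode and InMemoryDocumentFactory are hypothetical names, and using the original file name directly as the document key is a simplification.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The two modes from the proposal: replace any existing doc, or add a new version.
enum ImportMode { REPLACE, APPEND }

// Stand-in for XWikiDocument (assumption, for a self-contained example).
final class ImportedDocument {
    final String name;
    final List<String> versions = new ArrayList<>();

    ImportedDocument(String name) {
        this.name = name;
    }
}

// Hypothetical interface after renaming DocumentImportFactory to DocumentFactory.
interface DocumentFactory {
    ImportedDocument createDocument(String originalFileName, InputStream contentAfterConversion);

    void setMode(ImportMode mode);
}

// Toy implementation that keeps documents in memory, keyed by original file name.
class InMemoryDocumentFactory implements DocumentFactory {
    private final Map<String, ImportedDocument> store = new HashMap<>();
    private ImportMode mode = ImportMode.REPLACE;

    @Override
    public void setMode(ImportMode mode) {
        this.mode = mode;
    }

    @Override
    public ImportedDocument createDocument(String originalFileName, InputStream contentAfterConversion) {
        String content;
        try {
            content = new String(contentAfterConversion.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        ImportedDocument doc = store.get(originalFileName);
        // REPLACE discards any existing document; APPEND keeps it and adds a version.
        if (doc == null || mode == ImportMode.REPLACE) {
            doc = new ImportedDocument(originalFileName);
            store.put(originalFileName, doc);
        }
        doc.versions.add(content);
        return doc;
    }
}
```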

>
> originalFileName reflects only the filename, or the complete path?

Yep, good question. I was hesitating here. We do need the complete
path for sure, as one strategy is to use the parent directory as the
target space name, for example. I'm still not 100% clear on what gets
passed exactly. For example, in the case of a zip file, do we pass the
relative path to the file inside the zip? Do we pass a full URL-like
path as in /some/path/my.zip!relative/path/some.file, or simply
path/some.file? The former could be useful, as the name of the zip
could be used somewhere to compute a value. The same applies to
Directory importers, etc. What's important is that a DocumentFactory
implementation must be able to work regardless of the importer used.

Hmmm.... Thinking more about it I think passing a URL would be the best.
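
As a rough illustration of why a URL works for every importer, here is a sketch of a "parent directory is the space, file name is the page" strategy that handles both a plain file URL and a zip entry. It assumes Java's jar: URL form (jar:file:/some/path/my.zip!/relative/path/some.file) for zip entries, which is my assumption and not something we have agreed on. PageLocation and the fallback space "Main" are hypothetical too.

```java
import java.net.URI;

// Hypothetical value object: where an imported file should end up in the wiki.
final class PageLocation {
    final String space;
    final String page;

    PageLocation(String space, String page) {
        this.space = space;
        this.page = page;
    }

    // Derives space and page name from the URL handed over by any importer.
    static PageLocation fromUrl(URI source) {
        String path;
        if ("jar".equals(source.getScheme())) {
            // jar:file:/a/b.zip!/x/y.txt -> keep only the entry path "/x/y.txt".
            String ssp = source.getSchemeSpecificPart();
            path = ssp.substring(ssp.indexOf("!/") + 1);
        } else {
            // file:/x/y.txt and similar hierarchical URLs.
            path = source.getPath();
        }
        String[] segments = path.split("/");
        String fileName = segments[segments.length - 1];

        // Page name is the file name without its extension.
        int dot = fileName.lastIndexOf('.');
        String page = dot > 0 ? fileName.substring(0, dot) : fileName;

        // Space is the parent directory; fall back to "Main" (assumed default)
        // when the file sits at the root.
        String space = "Main";
        if (segments.length >= 2 && !segments[segments.length - 2].isEmpty()) {
            space = segments[segments.length - 2];
        }
        return new PageLocation(space, page);
    }
}
```

The same factory code then serves a DirectoryImporter and a ZipImporter alike, and the zip's own name is still available in the URL if a strategy wants to use it.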


> Examples of implementations:
>
>      * For Importer: FileImporter, DirectoryImporter, ZipImporter,
> ZipURLImporter, JARImporter
>      * For Converter: PlainTextConverter, HTMLConverter,
> TWikiConverter, ConfluenceConverter, XWikiXMLConverter (for
> converting documents in XWiki XML format)
>      * For DocumentImportFactory: XARDocumentImportFactory,
> ExpandedXARDocumentImportFactory, DefaultDocumentImportFactory (uses
> the file name as page name and parent directory as space, etc)
>
> Examples of using it
> ================
>
>      * A XAR file
>            o new ZipImporter(new File(".../.xar"), new
> XWikiXMLConverter(), new XARDocumentImportFactory(new
> File(".../.xar")))
>      * A single HTML file
>            o new FileImporter(new File(".../.html"), new
> HTMLConverter(), new DefaultDocumentImportFactory())
>      * A zip file containing TWiki pages
>            o new ZipImporter(new File(".../.zip"), new
> TWikiConverter(), new DefaultDocumentImportFactory())
>      * An expanded directory of HTML files
>            o new DirectoryImporter(new File(".../somedir"), new
> HTMLConverter(), new DefaultDocumentImportFactory())
>
> I've put all this on
> http://www.xwiki.org/xwiki/bin/view/Idea/NewImporterArchitecture
> but I think it's better to discuss it here as email is better for
> discussions...
>
> Note: I'm not planning to implement this yet as our first priority is
> still the 1.0 release, but once it's released, I'm volunteering to
> implement it, using a component strategy (cf. the new V2 architecture).
>
> WDYT?
>
>
> Sounds great.

cool

Thanks
-Vincent
