Hi Asiri,
On Oct 26, 2009, at 6:28 AM, Asiri Rathnayake wrote:
Hello Devs,
After a few discussions I have revised the new officeimporter API to
take into account the use of DocumentName instead of plain strings for
representing document names. I'll repeat the details of the previous
proposal with the new changes applied:
Currently we have the following officeimporter API:
<code>
OfficeImporter::importStream(InputStream is, String documentFormat, String targetDocumentName, Map params):void
OfficeImporter::importAttachment(String documentName, String attachmentName, Map params):String
</code>
Problems with this API:
* Loosely typed (params, document names)
* Both of the above methods perform almost the same task.
* Customizing the import process is implemented in a hackish way (not
visible in the API).
The proposed new API looks like this:
<code>
OfficeImporter::officeToXHTML(byte[] officeFileData, DocumentName referenceDocument, boolean filterStyles):XHTMLOfficeDocument
OfficeImporter::xhtmlToXDOM(XHTMLOfficeDocument xhtmlOfficeDocument):XDOMOfficeDocument
OfficeImporter::officeToXDOM(byte[] officeFileData, DocumentName referenceDocument, boolean filterStyles):XDOMOfficeDocument
OfficeImporter::buildPresentation(byte[] officeFileData):XDOMOfficeDocument
OfficeImporter::splitImport(XDOMOfficeDocument xdomOfficeDocument, int[] headingLevelsToSplit, NamingCriterion namingCriterion, DocumentName baseDocumentName):Map<TargetPageDescriptor, XDOMOfficeDocument>
</code>
I don't like this API too much because it mixes several different
things.
All the "To" methods seem to me to belong to the conversion domain:
they are not related to having a connected OpenOffice server running,
nor to having documents. For me they should be in a Converter
interface.
This would allow them to be used in various contexts.
So I'd see 2 interfaces at the top level:
- OfficeConverter: no relation with a running OO server or with the
XWiki Model
- OfficeImporter: connects to the running OO, gets the data, uses the
OfficeConverter to perform the conversion, and knows about the XWiki
Model to save the result in wiki pages.
+ the notion of Transformation (or Split) to split an
XDOMOfficeDocument into several.
In OfficeImporter I'd see only 1 method:
import(Source (whatever object you use to represent the filename to
import), Target (whatever object you use to represent the target
location))
And in Target I'd add the possibility to pass a Transformation or
maybe simply have a SplittingTarget that extends Target and adds
splitting.
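
To make the separation above concrete, here is a minimal sketch of how the two interfaces and a SplittingTarget could fit together. Everything in it is an assumption for illustration: the method names (convert, read, pageName, pagesFor, importDocument), the use of plain strings in place of XDOMOfficeDocument, the blank-line splitting rule, and the in-memory map standing in for wiki storage are all hypothetical. Note that `import` is a reserved word in Java, so the single importer method needs another name.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class OfficeImportSketch {

    /** Conversion only: no OpenOffice server and no XWiki model involved. */
    interface OfficeConverter {
        String convert(byte[] officeFileData);
    }

    /** Hypothetical stand-in for whatever object represents the file to import. */
    interface Source {
        byte[] read();
    }

    /** Hypothetical stand-in for the target location; by default a single page. */
    interface Target {
        String pageName();

        default Map<String, String> pagesFor(String content) {
            Map<String, String> pages = new LinkedHashMap<>();
            pages.put(pageName(), content);
            return pages;
        }
    }

    /** A Target that adds splitting: one page per blank-line-separated section. */
    static class SplittingTarget implements Target {
        private final String baseName;

        SplittingTarget(String baseName) {
            this.baseName = baseName;
        }

        @Override
        public String pageName() {
            return baseName;
        }

        @Override
        public Map<String, String> pagesFor(String content) {
            Map<String, String> pages = new LinkedHashMap<>();
            String[] sections = content.split("\n\n");
            for (int i = 0; i < sections.length; i++) {
                pages.put(baseName + "." + (i + 1), sections[i]);
            }
            return pages;
        }
    }

    /** Orchestration: reads the source, delegates conversion, saves the pages. */
    static class OfficeImporter {
        private final OfficeConverter converter;
        // In-memory stand-in for saving into wiki pages.
        private final Map<String, String> wiki = new LinkedHashMap<>();

        OfficeImporter(OfficeConverter converter) {
            this.converter = converter;
        }

        // "import" is a Java keyword, hence the longer name.
        void importDocument(Source source, Target target) {
            wiki.putAll(target.pagesFor(converter.convert(source.read())));
        }

        Map<String, String> savedPages() {
            return wiki;
        }
    }

    /** Wires a trivial UTF-8 "converter" to an importer and runs one import. */
    static Map<String, String> demo(String text, Target target) {
        OfficeConverter converter = data -> new String(data, StandardCharsets.UTF_8);
        OfficeImporter importer = new OfficeImporter(converter);
        importer.importDocument(() -> text.getBytes(StandardCharsets.UTF_8), target);
        return importer.savedPages();
    }

    public static void main(String[] args) {
        System.out.println(demo("Intro\n\nDetails", () -> "Main.Doc").keySet());
        System.out.println(demo("Intro\n\nDetails", new SplittingTarget("Main.Doc")).keySet());
    }
}
```

With this shape the converter can be reused without a server or a wiki, and splitting becomes a property of the target rather than an extra method on the importer.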
WDYT?
Thanks
-Vincent
As you can see, these methods are more granular and the
responsibilities are well defined. Customizing the import process can
be done from the client code. For example:
1. Make the initial import from office to XHTMLOfficeDocument - OfficeImporter::officeToXHTML()
2. Perform customizations on the XHTMLOfficeDocument - w3c DOM manipulations.
3. Import the XHTMLOfficeDocument into XDOMOfficeDocument - OfficeImporter::xhtmlToXDOM()
4. Perform customizations on the XDOMOfficeDocument (XDOM) - XDOM manipulations.
5. Split the XDOMOfficeDocument into multiple XDOMOfficeDocument instances - OfficeImporter::splitImport()
6. Perform customizations on these child XDOMOfficeDocument instances - XDOM manipulations.
7. Render the XDOMOfficeDocument instances & save them into wiki pages - XWiki rendering operations.
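
The seven steps above can be sketched as client code. This is only an illustration of the flow, not the proposed implementation: a plain String stands in for XHTMLOfficeDocument, a List<String> of blocks stands in for XDOMOfficeDocument, and the heading marker "= ", the "DRAFT" filter, the "(imported)" footer and the page naming scheme are all made-up assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ImportPipelineSketch {

    // Stand-in for officeToXHTML(): here the bytes are simply decoded as text.
    static String officeToXhtml(byte[] officeFileData) {
        return new String(officeFileData, StandardCharsets.UTF_8);
    }

    // Stand-in for xhtmlToXDOM(): one block per non-empty line.
    static List<String> xhtmlToXdom(String xhtml) {
        List<String> blocks = new ArrayList<>();
        for (String line : xhtml.split("\n")) {
            if (!line.trim().isEmpty()) {
                blocks.add(line.trim());
            }
        }
        return blocks;
    }

    // Stand-in for splitImport(): start a new child at each "= " heading.
    static Map<String, List<String>> splitImport(List<String> xdom, String baseName) {
        Map<String, List<String>> children = new LinkedHashMap<>();
        List<String> current = null;
        for (String block : xdom) {
            if (current == null || block.startsWith("= ")) {
                current = new ArrayList<>();
                children.put(baseName + "." + (children.size() + 1), current);
            }
            current.add(block);
        }
        return children;
    }

    // Runs the whole flow and returns page name -> rendered content.
    static Map<String, String> run(byte[] officeFileData, String baseName) {
        String xhtml = officeToXhtml(officeFileData);              // step 1
        xhtml = xhtml.replace("<br/>", "\n");                      // step 2: customize XHTML
        List<String> xdom = xhtmlToXdom(xhtml);                    // step 3
        xdom.removeIf(block -> block.startsWith("DRAFT"));         // step 4: customize XDOM
        Map<String, List<String>> children = splitImport(xdom, baseName); // step 5
        Map<String, String> pages = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> child : children.entrySet()) {
            child.getValue().add("(imported)");                    // step 6: customize children
            pages.put(child.getKey(), String.join("\n", child.getValue())); // step 7: render
        }
        return pages;
    }

    public static void main(String[] args) {
        byte[] data = "= A<br/>text a<br/>DRAFT note<br/>= B<br/>text b"
                .getBytes(StandardCharsets.UTF_8);
        run(data, "Main.Doc").forEach((name, content) ->
                System.out.println(name + ":\n" + content + "\n"));
    }
}
```

The point of the sketch is that every customization (steps 2, 4 and 6) lives in the client between API calls, so nothing has to be passed through a loosely typed params map.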
I think this interface will make it easy to extend & maintain
officeimporter functionality in the future.
Along with this, I would also like to refactor the xwiki-refactoring
module a bit to get rid of string based document names from it.
This whole refactoring operation would take approximately one day to
complete. And since this operation is not adding any new features, I
think it can be committed on both trunk and 2.0 branch.
Here's my +1 to all of the above.
Thanks.
- Asiri