[xwiki-dev] Difference Engine Refactoring and Improvements
Vincent Massol
vincent at massol.net
Wed May 2 10:34:24 CEST 2007
On May 2, 2007, at 10:13 AM, Ludovic Dubost wrote:
[snip]
>>> Since there is no generic diff for objects, I'd like to write a
>>> Diff plugin that produces a nice diff of any two strings passed in.
>>
>> Are you talking about Object diffs or String diffs here? Or do you
>> mean XML diff?
>>
>> I think we have 2 options:
>> - Diff of Objects: Difference getDifferences(Object o1, Object
>> o2). I guess the diff could then be the difference of object
>> fields. This would need to be implemented for each XWiki Object.
>> - XML difference. This is the XML representation of XWiki Objects.
>> I don't think there are any good/simple XML diff frameworks so we
>> would also need to implement that.
>>
>> I think it's better to do an Object diff, as otherwise the XML diff
>> would need to be transformed before being presented to the user,
>> which requires extra parsing. Better to operate on Objects, as we
>> already have them in our Java code.
> I'm talking about Object diff. We already have a diff of objects,
> but it did not diff the text inside the fields. Now it
> does.
OK, we're in line then. I thought you were talking about some text
diff because the API below only takes strings.
>>
>>> At the same time I'd like to start a refactoring of the current
>>> diff in the same plugin.
>>
>>> Currently I see the following APIs:
>>>
>>> DiffPlugin
>>> // returns a list of org.suigeneris.jrcs.diff.Delta (which
>>> represent differences)
>>> getLineDiffAsList(String content1, String content2)
>>> // returns a list of org.suigeneris.jrcs.diff.Delta (which
>>> represent differences)
>>> getWordDiffAsList(String content1, String content2)
>>>
>>> // returns an HTML view of differences
>>> getLineDiffAsList(String content1, String content2)
>>> // returns an HTML view of differences
>>> getWordDiffAsList(String content1, String content2)
>>>
>>> // returns a Text view of differences
>>> getLineDiffAsList(String content1, String content2)
>>> // returns a Text view of differences
>>> getWordDiffAsList(String content1, String content2)
>>>
>>
>> I don't understand. I would have used something like:
>>
>> List<Difference> getDifferences(XWikiDocument, XWikiDocument)
>> List<Difference> getDifferences(XObject, XObject)
>> List<Difference> getDifferences(String, String)
> OK, I can look at changing these APIs. However, the return value of
> getDifferences(XWikiDocument, XWikiDocument) or
> getDifferences(XObject, XObject) can be quite complex in terms of
> Java structure.
Without thinking a lot about it, I would see something like this for
the Difference object:
- Context information (String). This would be the property for
example when comparing 2 objects. It could be "page" when comparing
the content of a page, etc.
- Old value (String).
- New value (String).
- Location: some information about where the change appears, possibly
also the text surrounding the difference, etc.
I think we need both StringDifference and ObjectDifference which both
implement Difference, so getDifferences(XWikiDocument, XWikiDocument)
would return a list of both, getDifferences(XObject, XObject) would
return a list of ObjectDifference and getDifferences(String, String)
a list of StringDifference.
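
A minimal Java sketch of the structure described above. Everything here is hypothetical and for illustration only: the accessor names, the AbstractDifference helper, the DifferenceDemo class and the sample values are assumptions, not existing XWiki code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface; the four accessors mirror the four items listed above. */
interface Difference {
    String getContext();   // e.g. a property name, or "page" for page content
    String getOldValue();
    String getNewValue();
    String getLocation();  // where the change appears
}

/** Assumed helper that holds the shared fields for both implementations. */
abstract class AbstractDifference implements Difference {
    private final String context;
    private final String oldValue;
    private final String newValue;
    private final String location;

    AbstractDifference(String context, String oldValue, String newValue, String location) {
        this.context = context;
        this.oldValue = oldValue;
        this.newValue = newValue;
        this.location = location;
    }

    public String getContext() { return context; }
    public String getOldValue() { return oldValue; }
    public String getNewValue() { return newValue; }
    public String getLocation() { return location; }
}

/** A change between two strings, such as page content. */
final class StringDifference extends AbstractDifference {
    StringDifference(String context, String oldValue, String newValue, String location) {
        super(context, oldValue, newValue, location);
    }
}

/** A change to one property of an object. */
final class ObjectDifference extends AbstractDifference {
    ObjectDifference(String context, String oldValue, String newValue, String location) {
        super(context, oldValue, newValue, location);
    }
}

/** Illustrative only: a document diff returns a list mixing both kinds. */
final class DifferenceDemo {
    static List<Difference> sampleDocumentDiff() {
        List<Difference> diffs = new ArrayList<>();
        diffs.add(new StringDifference("page", "Hello", "Hello world", "line 1"));
        diffs.add(new ObjectDifference("title", "Old title", "New title", "object 0"));
        return diffs;
    }
}
```

With this shape, callers that only need old and new values can work against Difference, while code that cares about the kind of change can check for StringDifference or ObjectDifference.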
> Currently we have similar functions in XWikiDocument
> (getObjectDiff, getMetaDataDiff, getContentDiff). I'm not
> completely sure we should move them to the DiffPlugin.
>
> I have a first prototype of the DiffPlugin (with only strings API)
> and with that I was able to do a complete diff page for a document
> (including Objects and MetaData)
>
> Check http://jira.xwiki.org/jira/browse/XWIKI-1162 for the
> DiffPlugin patch..
I will... I'm trying to focus on the WYSIWYG editor today and the RC4
release but I'll try to find some time.
>>
>> Also, I think it's critical that in an API we should only use our
>> own classes/interfaces and no external ones, so I think we should
>> have our own Difference class and not use JRCS'. It could possibly
>> wrap it if necessary.
>>
> The wrapping would require quite a lot of cloning work. The result
> of the JRCS engine is a Delta object which contains a list of
> Chunks which contain a list of Strings. These objects are quite
> plain.
> I don't see what else to do except completely clone them and write
> a copy function from JRCS to XWiki.
What about the Difference object as summarily described above? If
internally we have JRCS objects, we could always construct our own
Difference object, no? I think it's really much safer from an API
point of view, as I don't think we're certain to stay with JRCS.
Also (and even more importantly) the interface is meant so that
someone else can implement a different diff algorithm. I'm not sure
it would be good to force them to use JRCS objects, especially as we
don't control JRCS.
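
To illustrate the pluggability argument, here is a minimal sketch of an engine interface that exposes only our own types. The names DiffEngine, LineDifference and NaiveLineDiffEngine are hypothetical, and the naive index-by-index line comparison is only a stand-in to keep the example self-contained. It is not the JRCS algorithm and not a proper diff (it does not realign lines after an insertion).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical result type owned by us; no third-party classes leak out. */
final class LineDifference {
    final int line;         // 1-based line number
    final String oldValue;  // null when the line exists only in the new text
    final String newValue;  // null when the line exists only in the old text

    LineDifference(int line, String oldValue, String newValue) {
        this.line = line;
        this.oldValue = oldValue;
        this.newValue = newValue;
    }
}

/** Hypothetical contract: any diff algorithm can sit behind this interface. */
interface DiffEngine {
    List<LineDifference> getDifferences(String oldText, String newText);
}

/** Stand-in implementation: compares lines at the same index only. */
final class NaiveLineDiffEngine implements DiffEngine {
    public List<LineDifference> getDifferences(String oldText, String newText) {
        String[] oldLines = oldText.split("\n", -1);
        String[] newLines = newText.split("\n", -1);
        List<LineDifference> result = new ArrayList<>();
        int max = Math.max(oldLines.length, newLines.length);
        for (int i = 0; i < max; i++) {
            String o = i < oldLines.length ? oldLines[i] : null;
            String n = i < newLines.length ? newLines[i] : null;
            boolean changed = (o == null) ? (n != null) : !o.equals(n);
            if (changed) {
                result.add(new LineDifference(i + 1, o, n));
            }
        }
        return result;
    }
}
```

A JRCS-backed engine would implement the same interface and copy what it needs out of its Delta objects into LineDifference instances, so callers never see JRCS types.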
>>> Other APIs could be a function to get a complete diff of an
>>> XWikiDocument (includes objects, attachments), however the
>>> implementation itself should probably reside in a Velocity
>>> template.
>>
>> The implementation of a document difference should be in the plugin I
>> think. The plugin should only do backend stuff though and return a
>> list of named differences. I agree it would be up to the vm files
>> to do the presentation of it (be it for the wiki, for an email to
>> be sent, etc).
>>
> OK. That will deprecate a few functions in XWikiDocument. So the
> plugin functions should set a few objects in the context
> representing the diff. The template would then present these
> different diffs in a nice way.
I haven't looked at the current API. I guess we could keep an
XWikiDocument getDifferences(XWikiDocument) API in XWikiDocument. It
would use the Diff plugin.
>>> There is an interesting discussion to have about how the
>>> representation of the Text and HTML views should be.
>>
>> Yes, that's hard. I'd like to see a wiki markup diff in addition
>> to the current HTML diff we have as I find our current diff not
>> very good. I think we need both.
>>
> I don't think you understood exactly what I mean by Text and HTML
> view. Both are wiki markup diffs, rendered either as text or as HTML.
> An HTML diff is more complex, as there is a risk of failing to
> generate valid markup.
> I think we should stick to a wiki markup diff.
We're talking about the same thing. I also prefer a wiki markup diff.
I was under the impression that the current diff representation was a
rendered version of the textual diff.
-Vincent
>>> Any ideas ?
>>>
>>> Another question is whether it is a good idea to put this in a
>>> plugin. I think yes, since it could be used for other things than
>>> the wiki content.
>>
>> A plugin would be good I think as it means the implementation
>> becomes pluggable. In the future it would be transformed into a
>> component but that's the same idea.
>>
>> Thanks
>> -Vincent