Hi Edi,
On 10/30/2009 06:16 PM, Eduard Moraru wrote:
Hi Anca,
Here's my understanding of what you suggest:
Document text:
"word1 word2 word3 word4 word5 word6."
Annotation on "word4" results in:
- Selection: "word4" (unique within the context)
- Context: "word3 word4 word5" (unique within this document)
Then all you need to do to mark this annotation, is to locate the
document-unique context and then to mark the selection within the
document-area defined by the context.
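
As a rough illustration of the lookup described above, here is a minimal Python sketch. The names find_unique and locate_selection and the use of plain string search are assumptions made for this example, not the annotations module's actual code. It assumes both the context and the selection are unique, and returns None when they are not.

```python
from typing import Optional, Tuple


def find_unique(text: str, needle: str) -> Optional[int]:
    """Return the offset of needle in text if it occurs exactly once, else None."""
    first = text.find(needle)
    # A second hit (searched from first + 1, so overlaps count too) means "not unique".
    if first == -1 or text.find(needle, first + 1) != -1:
        return None
    return first


def locate_selection(document: str, context: str, selection: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the selection in the document, or None."""
    context_pos = find_unique(document, context)      # context must be document-unique
    if context_pos is None:
        return None
    selection_pos = find_unique(context, selection)   # selection must be context-unique
    if selection_pos is None:
        return None
    start = context_pos + selection_pos
    return start, start + len(selection)


doc = "word1 word2 word3 word4 word5 word6."
span = locate_selection(doc, "word3 word4 word5", "word4")
```

Here span covers the characters of "word4" in doc; a repeated context or a repeated selection makes the lookup give up instead of guessing.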
I see 2 major issues here:
1. The selection must be itself unique within the context, otherwise you
just have the same big problem, at a smaller scale. This directly
restricts the context's size.
Either use an offset for the position of the selection in the context, or use
contextLeft & contextRight (the whole context then becomes contextLeft +
selection + contextRight). Neither option affects the context size.
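
A minimal sketch of the contextLeft / contextRight variant, in Python. The Annotation class, its field names and the locate function are hypothetical; only the idea of splitting the context around the selection comes from the reply above. Because the selection's position is given by the length of the left context, the selection does not need to be unique within the context.

```python
from typing import NamedTuple, Optional, Tuple


class Annotation(NamedTuple):
    # Hypothetical storage format: the context is split around the selection.
    context_left: str
    selection: str
    context_right: str

    @property
    def context(self) -> str:
        return self.context_left + self.selection + self.context_right


def locate(document: str, ann: Annotation) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the selection, using the first match of the context."""
    pos = document.find(ann.context)
    if pos == -1:
        return None
    # The selection starts right after the left context, even if its text repeats.
    start = pos + len(ann.context_left)
    return start, start + len(ann.selection)


doc = "word3 word4 word4 word4 word5"
# Annotate the middle "word4": the selection text occurs three times in the context.
ann = Annotation(context_left="word4 ", selection="word4", context_right=" word4")
span = locate(doc, ann)
```

In this example the context is "word4 word4 word4" and the returned span points at the second "word4", which a plain search for the selection inside the context could not single out.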
2. How can the document-uniqueness of a context be ensured?
- Fixed size context is not that practical.
Why?
It should work well in most real-life cases. My take is that a frame of 300-500
characters doesn't repeat in a regular document created in a wiki (of
course you can build test cases where it fails).
- Computing the context size at creation time with the js client
becomes a must, but if it fails to identify such a unique context, you
risk having all the document as context. Example document text: "word4
word4 word4 word4 word4 word4"
If the whole document is not megabytes (in which case you should be able to find
a shorter context), I don't see why that is a problem.
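
One possible way to compute such a context at creation time is to grow a window around the selection until it becomes unique. The Python sketch below is only an assumption about how this could be done (the function names and the step parameter are made up for the example); it also shows the degenerate "word4 word4 ..." case ending up with the whole document as context.

```python
from typing import Tuple


def count_occurrences(text: str, needle: str) -> int:
    """Count occurrences of needle in text, including overlapping ones."""
    count, pos = 0, text.find(needle)
    while pos != -1:
        count += 1
        pos = text.find(needle, pos + 1)
    return count


def unique_context(document: str, start: int, end: int, step: int = 10) -> Tuple[int, int]:
    """Grow a window around document[start:end] until it occurs only once.

    Returns (context_start, context_end). The loop always terminates because,
    in the worst case, the window is the whole document, which occurs once.
    """
    if not 0 <= start < end <= len(document):
        raise ValueError("selection must be a non-empty range inside the document")
    left, right = start, end
    while count_occurrences(document, document[left:right]) != 1:
        left = max(0, left - step)
        right = min(len(document), right + step)
    return left, right


regular = "word1 word2 word3 word4 word5 word6."
degenerate = "word4 word4 word4 word4 word4 word4"
regular_context = unique_context(regular, 18, 23)               # selection is already unique
degenerate_context = unique_context(degenerate, 12, 17, step=6)  # grows to the full text
```

For the regular text the selection alone is enough; for the degenerate text the window keeps growing until it spans the entire document, which is small here and therefore still cheap to store.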
- The matching is done *after* the dynamic part of the document
finishes executing. That dynamic part could potentially generate a copy
of the context and confuse the matching algorithm.
Well, the whole point is to get the dynamic part to execute and store
annotations as text, so that annotations are defined by what the user sees, as a
general idea. This also ensures that an annotation could be displayed even if
its position moved from one execution to another. For example, a scripted
document may have a part which is only displayed to admins. If an annotation
falls in that part, it would match for admins but not for regular users, which
is ok: since the text doesn't appear, there's nothing the user expects to be
annotated. If an annotation falls outside the admin-only content, then matching
by text (and not position) would allow us to find it and display it even if its
position moves because of the text rendered only for admins. If it's half-half,
then it would be there for admins because the content is, and not for regular
users because the content isn't.
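
The admin-only scenario can be illustrated with a small Python sketch. The render function and its texts are purely hypothetical stand-ins for a scripted document; the point is only that a text-based match survives a shift in position, while a stored offset would not.

```python
ADMIN_PART = "Admin-only notice. "


def render(is_admin: bool) -> str:
    """Hypothetical scripted document: one part is rendered only for admins."""
    return (ADMIN_PART if is_admin else "") + "Public text with word3 word4 word5 inside."


public_context = "word3 word4 word5"   # annotation outside the admin-only part
admin_context = "Admin-only notice"    # annotation inside the admin-only part

# The public annotation is found in both renderings, at different positions.
admin_pos = render(True).find(public_context)
user_pos = render(False).find(public_context)

# The admin-only annotation matches for admins and simply has no match for users.
admin_only_for_admin = render(True).find(admin_context)
admin_only_for_user = render(False).find(admin_context)   # -1: nothing to annotate
```

The public annotation moves by the length of the admin-only text between the two renderings, yet it is still located in both.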
For the particular case of a dynamic part generating copies of context, I'm
reiterating the idea that a couple of hundred characters should work in normal
cases. If there is a dynamic part which duplicates the whole document, the only
problem would be that the annotation would be matched and displayed on the first
occurrence of its context, and not the second (is that a problem for what the
user sees and perceives? Or we could also display it twice). Also, this is a
particular case ("normal" documents shouldn't do that).
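
Both options for a duplicated context (mark only the first occurrence, or mark every occurrence) can be sketched as follows in Python. The function names, the offset/length parameters and the mark_all flag are assumptions made for this example.

```python
from typing import List, Tuple


def find_all(text: str, needle: str) -> List[int]:
    """Return the offsets of every occurrence of needle in text."""
    positions, pos = [], text.find(needle)
    while pos != -1:
        positions.append(pos)
        pos = text.find(needle, pos + 1)
    return positions


def spans_to_mark(rendered: str, context: str, offset: int, length: int,
                  mark_all: bool = False) -> List[Tuple[int, int]]:
    """Return the selection spans to highlight in the rendered text.

    offset and length locate the selection inside the context. By default only
    the first occurrence of the context is used; mark_all=True uses them all.
    An empty list means the annotation could not be matched.
    """
    positions = find_all(rendered, context)
    if not mark_all:
        positions = positions[:1]
    return [(p + offset, p + offset + length) for p in positions]


body = "see word3 word4 word5 here. "
rendered = body + body   # a dynamic part that duplicates the whole content
first_only = spans_to_mark(rendered, "word3 word4 word5", offset=6, length=5)
every_copy = spans_to_mark(rendered, "word3 word4 word5", offset=6, length=5, mark_all=True)
```

With the duplicated content, first_only holds one span and every_copy holds two, each covering a "word4"; which behaviour is preferable depends on what the user expects to see.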
Of course there would be annotations which will fail to be matched and
represented but, if few enough and in particular enough cases, I think it's a
good tradeoff.
Note that I don't know yet how this performs in practice, but I think it's a
good direction to try as a balance between practical performance and speed of
implementation, and it's the only one so far (well, there's also the current
implementation but that only covers a subset of what this implementation would
cover).
Maybe I misunderstood the proposal or missed some key detail; if so,
please let me know.
The idea is that this algorithm is changeable in its key points, like detecting
and matching context. This first take is supposed to perform well *in practice*,
even if theoretically there are cases where it fails. If these failures make it
unusable or we decide we want atomic precision, then we can improve the key
points (like trying to match context better and others).
Thanks,
Anca
P.S.: I like examples :)
Thanks,
Eduard
On 10/30/2009 05:21 PM, Anca Luca wrote:
Hi devs,
following a discussion with Fabio about the second desired feature for the
annotations, namely the ability to add annotations on any document, no matter
how its content is generated, we came up with the solution described at
http://dev.xwiki.org/xwiki/bin/view/Design/AnnotationFeature#HSolution1stor…
, the main idea being that annotations would be defined by their selected text
and a context (as opposed to offsets) and would be identified to be rendered in
a document on a serialization of the transformed XDOM of the document, this way
taking into account any macro rendering, document inclusion, etc.
WDYT about this solution?
Also, because the implementation of this, though relatively localized, comes
together with a refactoring and cleanup of the annotations module (update
everything so that annotations don't store and use offsets anymore, remove
classes & functions which are not needed in this simplified process), I propose
to include this improvement in version 1.0 of the annotations module (so that we
don't clean up and release what we know for sure we'll delete) and push the 1.0
version further to mid to end December.
here's my +1 for this,
WDYT?
Happy coding,
Anca
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs