On 02/01/2010 12:39 AM, Paul Libbrecht wrote:
Sergiu,
I've done a LuceneIndexProfile that's quite different from your
description:
http://svn.activemath.org/intergeo/Platform/i2gCurriki/plugins/lucene/src/m…
The idea of this interface is that it is given to the IndexUpdater and
data-objects so that they query how to best index, how to best ignore
documents or fields (important for memory usage limitation), and maybe
for custom indexing strategy.
I looked a bit at the code, and it is very focused on your needs.
Personally, I'd suggest that you go ahead and use it, since it will be a
while until we can implement something more generic, and it definitely
won't work with a 1.5 core (probably not even with a 2.2 core).
I am not sure I grasp your cascade proposal. Which steps would it
involve?
In your approach, there's only one implementation that handles the
decisions. In my approach, there can be several implementations, each
one handling its part. This means that you can write a custom
implementation for i2geo, but it won't be suited for a bigger
environment, where several different applications are installed and each
one would benefit from customizing the index. So, each application could
also come with its own indexing rules: the blog, the calendar, the user
management...
Cascading these filters allows several cross-cutting indexing rules to
apply to the same entity.
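
To make the cascade concrete, here is a minimal sketch, assuming the
filter signature from my earlier message (quoted below). Reference and
LuceneIndexProfile are reduced to hypothetical stand-ins, the two
profile flags and the space/class names in the sample rules are invented
for illustration, and none of this exists in the Lucene plugin yet.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an entity reference, e.g. "wiki:Space.Document^classname[index].property". */
final class Reference {
    final String coordinates;
    Reference(String coordinates) { this.coordinates = coordinates; }
}

/** Hypothetical profile; the real list of settings still has to be defined. */
final class LuceneIndexProfile {
    boolean indexed = true; // assumed default: the entity is indexed
    boolean stored = true;  // assumed default: the content is stored in the index
}

/** The filter contract from the thread: return true to stop the cascade. */
interface LuceneIndexFilter {
    boolean filter(Reference entity, LuceneIndexProfile profile);
}

/** Runs the filters in order until one of them declares the profile ready. */
final class FilterCascade {
    private final List<LuceneIndexFilter> filters = new ArrayList<>();

    FilterCascade add(LuceneIndexFilter filter) {
        filters.add(filter);
        return this;
    }

    LuceneIndexProfile buildProfile(Reference entity) {
        LuceneIndexProfile profile = new LuceneIndexProfile(); // start from the defaults
        for (LuceneIndexFilter filter : filters) {
            if (filter.filter(entity, profile)) {
                break; // this filter decided the profile is final
            }
        }
        return profile;
    }
}

final class CascadeExample {
    static FilterCascade sampleCascade() {
        return new FilterCascade()
            // Security rule (made-up space name): never index "Private", and stop.
            .add((entity, profile) -> {
                if (entity.coordinates.contains(":Private.")) {
                    profile.indexed = false;
                    return true;
                }
                return false;
            })
            // Memory rule (made-up class space): index blog objects but do not store them.
            .add((entity, profile) -> {
                if (entity.coordinates.contains("^Blog.")) {
                    profile.stored = false;
                }
                return false;
            });
    }
}
```

Each application (blog, calendar, user management) would contribute its
own filter to the list; the plugin only looks at the resulting profile.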
Sure, I could migrate this strategy to "abstract entities" as you
describe them, but I am making this for i2geo.net, which runs the
curriki 1.8 branch, which uses xwiki 1.5.4... so I need to stay
backwards compatible with fairly old versions.
thanks
paul
PS: when I said central-field I meant an XWikiPreferences field
similar to the Notification field: a reference to a groovy page
containing a class implementing the given interface. I would love to
use this for the lucene-index-profile custom class (for now I'd just
need to call the setter within the first-called-page).
On 31 Jan 2010, at 20:32, Sergiu Dumitriu wrote:
>> Probably the nicest way I see this would be the way the notifications
>> are done: a central field indicates the page of a groovy source which
>> should implement such an interface as "LuceneIndexProfile" which would
>> add such questions (maybe even including some more such as the Data
>> classes).
> I'm not sure I understood your approach, could you explain it in more
> detail? What do you mean by "central field"?
>
>
> The way I see it, each indexed field will have a reference, given by
> some coordinates (this is related to the thread about object and
> properties references), such as
> "wiki:Space.Document^classname[index].property". There should be a
> collection of filters (components implementing LuceneIndexFilter) which
> have the following method:
>
> boolean filter(Reference entity, LuceneIndexProfile profile);
>
> The meaning is the following:
> - entity is the entity to process (could be a document, an object
> property, an attachment)
> - profile is the indexing profile built by the filters, initialized with
> some default values in the Lucene Plugin, and modified by the filters
> as it passes through them
> - returning true means that the filtering process should stop, since the
> current filter decided that the profile is ready (for example, if a
> filter decides that the document should not be indexed due to security
> restrictions, then it's useless to run all the other filters); by
> default filters return false, letting the other filters adjust the
> profile
> - each filter looks at the reference and, based on some internal rules,
> decides whether it should alter the profile for this entity, and
> whether it considers that no more filtering is useful/needed
>
> After the filtering is done, the plugin indexes (or not) the entity
> according to the values in the profile.
>
> This means that we could have several components affecting the Lucene
> behavior, each one with particular goals in mind (security, performance,
> searchability), and each one with its own configuration.
>
>
> So, what needs to be done (besides writing the code) is to define the
> possible settings in the LuceneIndexProfile, define the filters needed,
> and decide how to configure them. XML files on the server are an option,
> but not a flexible enough one. Maybe objects inside the wiki will give
> more flexibility to application developers. So, another thing to do is
> decide the fields needed in such a class.
>
> Of course, if somebody needs a new filter, it's easy to add a new jar or
> write a new Groovy page in the wiki.
--
Sergiu Dumitriu
http://purl.org/net/sergiu/