Dear Savitha,
Dear XWiki community,
As far as I know, there are two major flaws in the current Lucene plugin:
- It stores and indexes everything, which makes it a big memory eater. This will be fixed
by Savitha using Solr's schema.xml and, hopefully, other admin-configured classes.
- Every list of search results has to be skimmed through so that the count only
covers documents the user has access to (this is done in SearchResults.java, in
getRelevantResults). The direct consequence is that a search matching all documents
basically goes through all documents, which is quite annoying.
In general, the practice of going through many documents, one could say the practice of
pre-processing the list of search results, is a catastrophe: users very often enter a
query that matches far too many documents.
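To make the cost concrete, here is a minimal sketch of the skimming approach. It is not the code of getRelevantResults; the Hit class and its allowed_users field are hypothetical stand-ins for a search hit and its rights check. It only shows that the number of rights checks grows with the number of matches, not with the number of documents the user can actually see.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    allowed_users: frozenset  # hypothetical: users allowed to view this document


def count_by_skimming(hits, user):
    """Count the visible hits by visiting every hit and checking access."""
    checks = 0
    visible = 0
    for hit in hits:
        checks += 1  # one rights check per matching document
        if user in hit.allowed_users:
            visible += 1
    return visible, checks


# 10,000 matching documents; "alice" may only see every 100th one.
hits = [
    Hit(f"Doc{i}", frozenset({"alice"}) if i % 100 == 0 else frozenset({"bob"}))
    for i in range(10_000)
]
visible, checks = count_by_skimming(hits, "alice")
print(visible, checks)
```

Even though only a small fraction of the hits is visible to the user, every single match is checked before the count can be reported.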
This also means that Savitha should avoid this skimming in her Solr module. That needs
some skills and probably some help:
- the skills to completely understand the rights model. As far as I know, it is based on
XWikiRights objects in each document and can refer to users (a list of users) and
groups, but this needs to be studied closely and asked about many times.
- the skills to map this model into something that is executable by Solr/Lucene queries.
In Curriki or i2geo, and in many other specific applications, this is much easier because
the rights model is simpler (the owner is defined and only three rights are possible). But
here it has to be done in a generic way, and it might include the requirement to reindex a
large part of the documents when a user joins or leaves a group. I think it can be
implemented as follows: include fields such as "prohibitedFor", "prohibitedForGroup",
"allowedFor" and "allowedForGroup", and use the current user's identity and groups when
querying. Note that it also matters which user requests the documents at indexing time
(it probably needs to be admin).
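As a rough illustration of the querying side, the sketch below builds a filter string in Lucene/Solr query syntax from the four field names suggested above. The function names and the example user and group names are hypothetical, and the sketch assumes that a prohibition always wins over an allowance, which may well not match the real XWiki rights rules (inheritance between wiki, space and page is ignored entirely).

```python
def quote(term):
    """Quote a term as a Lucene/Solr phrase, escaping backslashes and double quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_rights_filter(user, groups):
    """Build a filter that keeps documents allowed for the user or one of the
    user's groups, and drops documents prohibited for any of them.

    Assumption: deny takes precedence over allow.
    """
    allow = [f"allowedFor:{quote(user)}"]
    allow += [f"allowedForGroup:{quote(g)}" for g in groups]
    deny = [f"-prohibitedFor:{quote(user)}"]
    deny += [f"-prohibitedForGroup:{quote(g)}" for g in groups]
    # One required clause for the allowances, then one prohibited clause per denial.
    return "+(" + " OR ".join(allow) + ") " + " ".join(deny)


fq = build_rights_filter("XWiki.alice", ["XWiki.Editors"])
print(fq)
```

Such a string could be sent along with the user's query, for instance as a Solr fq parameter, so that the hit count already covers only the documents the user may see and no skimming is needed afterwards.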
Savitha, I think this is the hardest part of your project. Are you up to it?
paul