I am currently adding some enhancements to the Lucene plugin for
Curriki, and have a few questions.
I have submitted a patch for XPLUCENE-25 so that password fields are
not indexed. The patch seems to have been accepted by Sergiu (with
some adjustments) and committed.
The other patch (XPLUCENE-26) is to allow better sorting of results.
The issue is that when sorting by a field that has been tokenized by
Lucene, the sort uses just one of the field's tokens (seemingly at
random), so titles, for example, are sorted by an arbitrary word
within the title. The patch I have created so far fixes this by also
indexing a non-tokenized version of each object field, but that
increases the size of the index by about 50%, and I am not sure
whether that is acceptable to the XWiki community at large.
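
To make the sorting problem concrete, here is a minimal plain-Java
sketch (no Lucene involved) of why a tokenized field gives an
unhelpful order and why a separate non-tokenized copy fixes it. The
class and method names are made up for illustration, and picking the
alphabetically last token as the sort key is only an assumption that
stands in for whatever Lucene does internally. This is not the
XPLUCENE-26 patch itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not code from the Lucene plugin.
class SortKeySketch {

    // Stand-in for an analyzer: lower-case the title and split on whitespace.
    static String[] tokenize(String title) {
        return title.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    // ASSUMPTION: a tokenized field ends up sorted on a single token per
    // document. We pick the alphabetically last token to imitate that effect.
    static String tokenizedSortKey(String title) {
        String[] tokens = tokenize(title);
        Arrays.sort(tokens);
        return tokens[tokens.length - 1];
    }

    // The non-tokenized copy keeps the whole value as one sortable term.
    static String untokenizedSortKey(String title) {
        return title.toLowerCase(Locale.ROOT);
    }

    // Returns a sorted copy ordered by the single-token key.
    static List<String> sortByTokenizedKey(List<String> titles) {
        List<String> sorted = new ArrayList<>(titles);
        sorted.sort(Comparator.comparing(SortKeySketch::tokenizedSortKey));
        return sorted;
    }

    // Returns a sorted copy ordered by the whole-value key.
    static List<String> sortByUntokenizedKey(List<String> titles) {
        List<String> sorted = new ArrayList<>(titles);
        sorted.sort(Comparator.comparing(SortKeySketch::untokenizedSortKey));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> titles =
            Arrays.asList("Algebra Worksheet", "Biology Lab", "Chemistry Basics");
        System.out.println(sortByTokenizedKey(titles));
        System.out.println(sortByUntokenizedKey(titles));
    }
}
```

With these three sample titles, the single-token key puts "Chemistry
Basics" first and "Algebra Worksheet" last, while the whole-value key
gives the order a user would expect. The cost is the extra term per
field, which is where the index growth comes from.
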
One thing I noticed is that the object data is being stored in the
index, but nothing in the SearchResult interface seems to allow
getting the values back. Is there a reason the data is stored? I see
two options here. The first would be to add a method to SearchResult
that lets one read the object data (but that raises security issues
for pages that one would not normally be able to see). The other
would be to index the data without storing it (which should reduce
the index size). Any thoughts on the best direction here?
The last question I have is: how do I create a string array (String[])
in a Velocity script so that I can have a secondary sort column?
Velocity seems to create object arrays, but the LucenePluginApi
requires a string array for the sortField argument of
getSearchResults.
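
For reference, this is the type difference on the Java side, as a
small standalone sketch. The class name and the field names "title"
and "date" are hypothetical. Since String.split returns a real
String[], one workaround that might apply in Velocity is to put the
field names in a comma-separated string and call split(",") on it,
though I have not confirmed that this works with the plugin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of LucenePluginApi.
class SortFieldArraySketch {

    // List.toArray() with no argument always yields Object[],
    // even when every element is a String.
    static Object[] asObjectArray(List<String> fields) {
        return fields.toArray();
    }

    // Passing a typed array makes toArray return a String[].
    static String[] asStringArray(List<String> fields) {
        return fields.toArray(new String[0]);
    }

    // String.split also returns a String[], with no list involved.
    static String[] fromCommaList(String fields) {
        return fields.split(",");
    }

    public static void main(String[] args) {
        List<String> fields = new ArrayList<>(Arrays.asList("title", "date"));
        System.out.println(asObjectArray(fields).getClass().getSimpleName());
        System.out.println(asStringArray(fields).getClass().getSimpleName());
        System.out.println(Arrays.toString(fromCommaList("title,date")));
    }
}
```
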
Any comments/input/suggestions/answers are welcome.
David