About the "confusing" results: of course I can do some Velocity or Groovy post-processing
on the page to filter duplicate document names out of the results. This is no problem if
I want to show the results as one flat list, but it becomes quite complicated to do
efficiently if I want to show, for example, 30 hits per page with paging.
Basically I mean that all the index data held by one page (page content, objects,
attachments...) should be regarded as this page's properties and indexed under one
Lucene document. If I have misunderstood how this works, please correct me.
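To make the paging problem concrete, here is a minimal sketch of the post-processing I mean. It is plain Java rather than the plugin's API: `HitPager`, `uniqueDocuments` and `page` are hypothetical names, and the hits are assumed to arrive as a ranked list of document names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not part of the Lucene plugin.
final class HitPager {

    /** Keeps the first hit for each document name, preserving rank order. */
    static List<String> uniqueDocuments(List<String> hitDocumentNames) {
        // LinkedHashSet drops repeats but remembers insertion order.
        return new ArrayList<>(new LinkedHashSet<>(hitDocumentNames));
    }

    /** Returns the given 1-based page of the list; empty if past the end. */
    static List<String> page(List<String> items, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        // long arithmetic avoids int overflow for very large page numbers.
        long start = (long) (pageNumber - 1) * pageSize;
        int from = (int) Math.min(start, items.size());
        int to = (int) Math.min(start + pageSize, items.size());
        return new ArrayList<>(items.subList(from, to));
    }
}
```

Showing page 3 with 30 hits per page would then be `HitPager.page(HitPager.uniqueDocuments(names), 3, 30)`. The catch is that the whole hit list has to be de-duplicated before any page can be cut from it, because the number of unique documents is not known in advance.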
What do you think?
-Petteri
I'm testing version RC1 and have some thoughts about the Lucene plugin:
Sorting
-------
At the moment there is no method in the API to choose the sort direction (ascending or
descending). I think there should be one.
Boolean Queries
---------------
Is the default query type going to stay an OR query? Wouldn't it be better to make it an
AND query, like Google? I think most users expect that nowadays. At the very least there
could be a method to choose between them.
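Until such a method exists, one workaround is to write the operator into the query string explicitly, since the standard Lucene query syntax accepts `AND` and `OR` between terms. The following is only a sketch: `QueryOperators` and `join` are hypothetical names, not plugin API, and the terms are assumed to contain no characters that need escaping in the Lucene query syntax.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of the Lucene plugin.
final class QueryOperators {

    enum Operator { AND, OR }

    /** Joins the non-blank search terms with an explicit boolean operator. */
    static String join(List<String> terms, Operator operator) {
        StringJoiner joiner = new StringJoiner(" " + operator.name() + " ");
        for (String term : terms) {
            String trimmed = term.trim();
            if (!trimmed.isEmpty()) {
                joiner.add(trimmed); // assumption: term needs no escaping
            }
        }
        return joiner.toString();
    }
}
```

With this, the terms `lucene` and `plugin` joined with `Operator.AND` give the query string `lucene AND plugin`, which only matches when both words are present.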
Results
-------
Search results are a bit confusing now. When I search for a word that appears in a
document, in its attachment and in its objects' properties, I get three hits, one for
each type. There are special cases, of course, but I think it would be more usable to get
just one hit by default in my example case. There would still remain a method
("type: object") to distinguish between the types in a query.
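As an illustration of the behaviour I have in mind, the sketch below collapses typed hits into one result per document while remembering which types matched. All names here (`HitCollapser`, `Hit`, `HitType` and its constants) are hypothetical and only stand in for whatever the plugin returns.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical types, not part of the Lucene plugin.
final class HitCollapser {

    enum HitType { WIKIPAGE, ATTACHMENT, OBJECT }

    static final class Hit {
        final String documentName;
        final HitType type;

        Hit(String documentName, HitType type) {
            this.documentName = documentName;
            this.type = type;
        }
    }

    /** One entry per document, in rank order, with every hit type that matched. */
    static Map<String, Set<HitType>> collapse(List<Hit> hits) {
        Map<String, Set<HitType>> byDocument = new LinkedHashMap<>();
        for (Hit hit : hits) {
            byDocument
                .computeIfAbsent(hit.documentName, name -> EnumSet.noneOf(HitType.class))
                .add(hit.type);
        }
        return byDocument;
    }
}
```

In my example case the three hits for one page would become a single entry whose set holds all three types, so the result list shows the page once and can still say where the word was found.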
If you like, I can provide patches for the two first (minor) issues.
Regards,
Petteri