Hi Ludovic and Paul,
On Sun, Jul 1, 2012 at 2:33 AM, Paul Libbrecht <paul(a)hoplahup.net> wrote:
Ah, feedback! This is really good.
On 30 June 2012 at 20:26, Ludovic Dubost wrote:
This is nice progress. I've had a look and I have a few remarks:
1/ Some weird results
It seems the results are not always ok. For instance this page
http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/Search…
comes up if I search for "SearchTest"
but it does not come up for "liste"
Also, these two searches report 6 and 1 results but show only 3 and 0.
This is due to the multi-lingual document (one document in four languages).
Multilingual support is, I think, at the top of Savitha's priorities.
The embedded Solr server running on the machine can't handle multiple
languages at this time, so it's returning weird results. I am working on
it. Yes, Paul, it's my top priority and I am researching it further.
2/ Advanced queries
I was also wondering if we can use advanced queries.
I've been trying
SearchTest +space:SearchTest
and this does not seem to work.
There's a good reason for this: the syntax for search currently in use is
"Dismax". This is a query-parser that is rather less technical, so it
avoids such issues as considering an apostrophe as a separator (an issue
that was reported).
The queries you are suggesting, which I think can be useful, only work
with the Lucene Query-Parser, and not with dismax. This will be
configurable but I am not sure which one should be the default.
It is possible: the parser I am using is Extended DisMax, which supports
+/- and fielded queries. To support multilingual content indexing I am
using a different set of fields per language: for English [ title_en,
space_en, fulltext_en ...], for French [ title_fr, space_fr, fulltext_fr
...]. So when the user searches for space:SearchText it may not work. To
support this I need to preprocess the query a bit before calling the Solr
server. I am working on this too.
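As a rough illustration of that preprocessing step, here is a minimal
Python sketch (not the actual implementation). The function name, the
list of localized fields and the regular expression are assumptions made
for this example. It does not treat quoted phrases specially.

```python
import re

# Assumption: only these fields have per-language variants (title_en, space_fr, ...).
LOCALIZED_FIELDS = {"title", "space", "fulltext"}

# A field prefix is a word followed by ':' and a non-space character,
# not preceded by another word character or a dot.
FIELD_PREFIX = re.compile(r"(?<![\w.])(\w+):(?=\S)")


def localize_query(query, lang):
    """Rewrite 'space:Foo' to 'space_<lang>:Foo' for fields that have language variants."""

    def rename(match):
        field = match.group(1)
        if field in LOCALIZED_FIELDS:
            return f"{field}_{lang}:"
        # Leave other fields (e.g. object:) untouched.
        return match.group(0)

    return FIELD_PREFIX.sub(rename, query)


print(localize_query("SearchTest +space:SearchTest", "en"))
```

With the query from point 2/, this prints "SearchTest +space_en:SearchTest",
while a query such as object:XWiki.XWikiUsers is passed through unchanged.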
3/ It's important that we end up with at least the same features as in
Lucene.
Mmmh, not *all* of the features.
E.g. having all fields stored is really not desirable (and is almost never
used in search results).
For instance, being able to query all the fields we could query in Lucene
is important. For example:
object:XWiki.XWikiUsers
should return only users.
Something of this sort will be needed to achieve the advanced search
scenario.
Ordering and scoring are also something that existed in Lucene. How
would this work in Solr?
A score is already displayed currently.
4/ We also want, of course, the advantages of Solr, which means
faceting. Tags, Spaces and Wikis can be interesting facets.
The reason multilingual documents have been a problem thus far is that
Savitha is also trying to make the language a facet, which is really
interesting but is raising a number of difficulties.
Yes, I am currently considering facets. To start with we can have Spaces
and Wikis. The new DocumentModelBridge and DocumentAccessBridge are missing
some APIs to access the author, creation date and modification date, which
could be other interesting facets.
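For illustration only, this is roughly what a faceted request could look
like as Solr query parameters, sketched in Python. The facet, facet.field,
facet.mincount and defType parameters are standard Solr request
parameters; the field names "space" and "wiki" and the helper name are
assumptions about the schema, not something taken from the current code.

```python
from urllib.parse import urlencode


def facet_query_string(user_query, facet_fields=("space", "wiki")):
    """Build the query string for a Solr select request with field facets."""
    params = [
        ("q", user_query),
        ("defType", "edismax"),    # Extended DisMax query parser
        ("facet", "true"),         # turn faceting on
        ("facet.mincount", "1"),   # hide facet values with no matching document
    ]
    # One facet.field parameter per field to facet on (assumed field names).
    params += [("facet.field", name) for name in facet_fields]
    return urlencode(params)


print(facet_query_string("SearchTest"))
```

The resulting string would be appended to the select URL of the Solr
core; author and dates could be added to facet_fields once the bridge
exposes them.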
5/ In terms of multilingual search (in case of a multilingual wiki) we
need to make sure that you can say that you make a search in a
specific language and the correct stemmer is used (if stemming is used
at indexing time we need to index the content in each language with
the correct stemmer). I saw that you did some things with languages so
maybe Solr has also other ways to handle this.
If you look into the source, you can see some of that.
Solr can do this very nicely declaratively with the schema.xml and
solrconfig.xml.
Part of Savitha's intent was to offer an administrative UI to manipulate
this but I'd personally prefer editing files manually. Or maybe we even
have to invent an extended schema syntax for XWiki-Solr (thus indicating
that a field of Solr, of this and that type, tokenization and storage, is
fed by a property x/yz of an XWiki document).
As a start I am going to keep it simple by editing the schema files
manually.
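To make the per-language indexing idea from this point concrete, here is
a small Python sketch of routing a document's text into language-suffixed
fields, so that the schema can attach a different stemmer to each of
them. The function name, the "language" field and the list of supported
languages are all assumptions for the example.

```python
# Assumption: languages for which the schema defines *_<lang> fields.
SUPPORTED_LANGUAGES = {"en", "fr"}


def to_index_fields(title, content, lang, default_lang="en"):
    """Return the field/value pairs to index for one translation of a document."""
    # Fall back to the default language's fields when no variant exists.
    suffix = lang if lang in SUPPORTED_LANGUAGES else default_lang
    return {
        "language": lang,
        f"title_{suffix}": title,
        f"fulltext_{suffix}": content,
    }


print(to_index_fields("Liste", "Une liste de pages", "fr"))
```

Each translation of a multilingual document would produce its own set of
fields, which matches the title_en / title_fr naming described under
point 2/.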
paul
--
Best regards,
Savitha.s