Re: [xwiki-devs] [GSOC] Basic Solr Implementation

2 Jul 2012

Hi,

looks very cool!

Indeed, I guess stemming still needs some improvements. For instance
searching for "embeter":

http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/Search…

did not return any results although searching for "embêter":

http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/Search…

returns 2 results whereas both should be the same.
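
To illustrate why the two searches should agree, here is a minimal sketch of accent folding, assuming the fix is to normalise terms the same way at index time and at query time. In Solr this would normally be done by a filter in the analyzer chain (an ASCII-folding filter, for instance) rather than in application code. The Python function below only illustrates the idea, and `fold_accents` is a hypothetical name.

```python
import unicodedata


def fold_accents(term: str) -> str:
    """Return the term with combining accent marks removed."""
    # NFD splits "ê" into "e" followed by a combining circumflex.
    decomposed = unicodedata.normalize("NFD", term)
    # Drop the combining marks and keep the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Both spellings fold to the same token, so they would hit the same index entry.
print(fold_accents("embêter") == fold_accents("embeter"))  # True
```

If the same folding is applied on both sides, "embeter" and "embêter" become the same term and return the same results.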

Keep up the good work!

Guillaume

On Sat, Jun 30, 2012 at 11:03 PM, Paul Libbrecht <paul(a)hoplahup.net> wrote:

...
  Ah, feedback! This is really good.

 On 30 June 2012 at 20:26, Ludovic Dubost wrote:
  This is nice progress. I've had a look and I have a few remarks:

 1/ Some weird results

 It seems the results are not always ok. For instance this page

http://ec2-50-19-181-163.compute-1.amazonaws.com:8080/xwiki/bin/view/Search…
  comes up if I search for "SearchTest" but it does not come up for "liste".
 Also, these 2 searches report 6 and 1 results but show only 3 and 0 results.
 This is due to the multilingual document (one document in four languages).
 Multilingual support is, I think, at the top of Savitha's priorities.

  2/ Advanced queries

 I was also wondering if we can use advanced queries.
 I've been trying

 SearchTest +space:SearchTest

 and this does not seem to work. 
 There's a good reason for this: the syntax for search currently in use is
 "Dismax". This is a query-parser that is rather less technical, so it
 avoids such issues as considering an apostrophe as a separator (an issue
 that was reported).

 The queries you are suggesting, which I think can be useful, only work
 with the Lucene Query-Parser, and not with dismax. This will be
 configurable but I am not sure which one should be the default.
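
As a rough sketch of the difference, the request parameters for the two parsers could be built as below. `defType`, `q`, `qf` and `fq` are standard Solr request parameters. The endpoint URL, the `title` and `content` field names and the function names are assumptions made for illustration, not taken from the prototype.

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/select"  # hypothetical endpoint


def dismax_params(user_text: str) -> dict:
    """Free-text search: dismax treats the input as plain user text."""
    return {"defType": "dismax", "q": user_text, "qf": "title content"}


def lucene_params(query: str) -> dict:
    """Advanced search: fielded clauses such as space:SearchTest are parsed."""
    return {"defType": "lucene", "q": query}


def dismax_with_filter(user_text: str, filter_query: str) -> dict:
    """Possible middle ground: dismax for the text, a filter query for the field."""
    params = dismax_params(user_text)
    params["fq"] = filter_query
    return params


def build_url(params: dict) -> str:
    """Encode the parameters into a select URL."""
    return SOLR_SELECT + "?" + urlencode(params)


print(build_url(lucene_params("SearchTest +space:SearchTest")))
print(build_url(dismax_with_filter("SearchTest", "space:SearchTest")))
```

The third function is only one option: it keeps the forgiving parser for what the user types and moves the field restriction into a separate parameter.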

  3/ It's important that we end up with at least the same features as in
 Lucene.
 Mmmh, not *all* of the features.
 E.g. having all fields stored is really not desired (and is almost never
 used in search results).

  For instance being able to query all the fields we could query in
 Lucene is important. For instance:
   object:XWiki.XWikiUsers
 should return only users 
 Something of this sort will be needed to achieve the advanced search
 scenario.

  Ordering and scoring also existed in Lucene. How would this work
 in Solr?
 A score is already displayed currently.

  4/ We also of course want the advantages of Solr, which means
 faceting. Tags, spaces and wikis can be interesting facets.
 The reason multilingual documents have been a problem thus far is that
 Savitha is also trying to make the language a facet, which is really
 interesting but is raising a number of difficulties.
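
For reference, a faceted request could be assembled roughly as follows. `facet`, `facet.field` and `facet.mincount` are standard Solr parameters. The field names `tags`, `space`, `wiki` and `language` are assumptions about how the index might be laid out, and `facet_params` is a hypothetical helper.

```python
from urllib.parse import urlencode


def facet_params(user_text: str, facet_fields: list) -> list:
    """Build request parameters asking Solr for counts per facet field."""
    params = [
        ("q", user_text),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with no matching documents
    ]
    # facet.field is repeated once per field, hence a list of pairs, not a dict.
    params += [("facet.field", name) for name in facet_fields]
    return params


query_string = urlencode(facet_params("SearchTest", ["tags", "space", "wiki", "language"]))
print(query_string)
```

Treating the language as just another facet field, as above, is the simple case. It does not by itself solve the problem of one document existing in several languages.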

  5/ In terms of multilingual search (in case of a multilingual wiki) we
 need to make sure that you can say that you make a search in a
 specific language and the correct stemmer is used (if stemming is used
 at indexing time we need to index the content in each language with
 the correct stemmer). I saw that you did some things with languages so
 maybe SOLR has also other ways to handle this. 
 If you look into the source, you can see some of that.
 Solr can do this very nicely declaratively with the schema.xml and
 solrconfig.xml.
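
One common way to get the correct stemmer per language is to index each translation into a language-specific field, with each field type declared in schema.xml using that language's analyzer. The sketch below assumes that approach. The `_en`/`_fr` suffix convention, the `_general` fallback, the language list and the function names are all hypothetical and not taken from the prototype's source.

```python
SUPPORTED_LANGUAGES = {"en", "fr", "de", "es"}  # assumption: languages with a stemmer configured


def localized_field(base: str, language: str) -> str:
    """Pick the field whose analyzer matches the document language."""
    lang = (language or "").lower()
    if lang in SUPPORTED_LANGUAGES:
        return f"{base}_{lang}"
    # Hypothetical fallback field with a language-neutral analyzer.
    return f"{base}_general"


def index_document(title: str, content: str, language: str) -> dict:
    """Build the field map sent to Solr for one translation of a page."""
    return {
        "language": language,
        localized_field("title", language): title,
        localized_field("content", language): content,
    }


print(index_document("Liste", "Une liste de pages", "fr"))
```

At query time the same helper would select the field to search once the user has chosen a language, so the query goes through the same stemmer as the indexed text.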

 Part of Savitha's intent was to offer an administrative UI to manipulate
 this but I'd personally prefer editing files manually. Or maybe we even
 have to invent an extended schema syntax for XWiki-Solr (thus indicating
 that a field of Solr, of this and that type, tokenization and storage, is
 fed by a property x/yz of an XWiki document).

 paul
 _______________________________________________
 devs mailing list
 devs(a)xwiki.org
 http://lists.xwiki.org/mailman/listinfo/devs
