On 02/17/2011 03:56 PM, Vincent Massol wrote:
Is it this issue:
http://jira.xwiki.org/jira/browse/XWIKI-5976
?
If so it's fixed in 3.0M3
No, that one is related to Hibernate. The problem here is in the indexer,
which analyzes all the document fields. The default analyzer processes the
text in several steps:
- splits into tokens
- removes apostrophes and other punctuation
- *removes common words* (called "stop words")
- stems words
- transforms to lowercase
Since "it" is a common (stop) word, a document coming from the "it" wiki
ends up with an empty "wiki:" field in the index.
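To make the effect concrete, here is a rough sketch of such an analysis chain. It is only a toy stand-in: the stop-word list and the plural-stripping stem() function are assumptions made up for illustration, not the real Lucene code, and the steps are not applied in exactly the same order.

```python
import re

# Toy stand-in for the default analyzer. The stop-word list and the
# plural-stripping "stemmer" are simplified assumptions, not Lucene's own.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "the", "to"}


def stem(token):
    # Crude stemmer: only strips a single trailing plural "s".
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text):
    # Drop apostrophes, then split into alphanumeric tokens
    # (this also discards the remaining punctuation).
    tokens = re.findall(r"[A-Za-z0-9]+", text.replace("'", ""))
    # Lowercase, remove stop words, then stem what is left.
    tokens = [t.lower() for t in tokens]
    return [stem(t) for t in tokens if t not in STOP_WORDS]


print(analyze("Apples"))  # ['apple']
print(analyze("it"))      # [] -> nothing left to store in the "wiki:" field
```

The second call is the whole bug: the only token in the wiki name is a stop word, so nothing is indexed for that field.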
The solution appears to be simple: just stop analyzing the special fields
(wiki, language, space, creator...).
In reality it is a lot more difficult, since even if the indexed data
is correct, the default query parser runs the query through the
same analyzer, and it doesn't know that it should leave terms with an
explicit field alone. This means that searching for "wiki:it" will
remove this search term from the query altogether. Searching for
"space:Apples" will try to match an "apple" token against the
indexed value "Apples", so it won't give the right results.
I've stopped working on this for the moment. If someone else wants to
pick up the remaining work (writing a more intelligent query parser that
knows which fields should be analyzed and which should not), I can help,
but it's not a priority for 3.0 for me.
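For whoever picks this up, the idea could look roughly like the sketch below. It is not the Lucene QueryParser API: parse_query(), analyze() and stem() are hypothetical helpers, the stop-word list is invented, and the set of unanalyzed fields simply repeats the ones named above. With field_aware=False it mimics the current behaviour; with field_aware=True the special fields are passed through untouched.

```python
import re

# Hypothetical toy analyzer; see the assumptions stated in the text above.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "the", "to"}

# Fields whose values should be matched verbatim (taken from this message).
UNANALYZED_FIELDS = {"wiki", "language", "space", "creator"}


def stem(token):
    # Crude stemmer: only strips a single trailing plural "s".
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text):
    # Tokenize, lowercase, drop stop words, stem.
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def parse_query(query, field_aware):
    """Return (field, term) pairs; field is None for plain text terms."""
    clauses = []
    for part in query.split():
        field, sep, value = part.partition(":")
        if not sep:
            # No explicit field: the whole part is a plain text term.
            field, value = None, part
        if field_aware and field in UNANALYZED_FIELDS:
            # Keep the value exactly as typed, no analysis.
            clauses.append((field, value))
        else:
            # Everything else goes through the analyzer and may vanish.
            clauses.extend((field, term) for term in analyze(value))
    return clauses


query = "wiki:it space:Apples sandbox"
print(parse_query(query, field_aware=False))
# [('space', 'apple'), (None, 'sandbox')]
print(parse_query(query, field_aware=True))
# [('wiki', 'it'), ('space', 'Apples'), (None, 'sandbox')]
```

This only helps if the same fields are also left unanalyzed at indexing time, otherwise the query and the index disagree in the opposite direction.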
A quick fix is possible for the wiki field, since it is not passed to
the query parser, but is manually added to the query, unanalyzed.
Thanks
-Vincent
On Feb 16, 2011, at 12:23 AM, Ludovic Dubost wrote:
>
> We have found some issues with the analyzer code that analyzes the wiki name.
> Though with the English analyzer this should not be a problem.
> We are fixing this for the next versions of XWiki.
>
> Now if you are sure the problem is the name of the wiki, rename it and use a
> wiki "alias".
>
> The result for the user will be the same URL, but the wiki will internally
> have another name.
>
> Ludovic
>
> On 04/02/11 11:15, Tronicek wrote:
>> Hi,
>> we've updated XWiki 1.7 XE to 2.7.33656 and are using the Wiki Manager to
>> have a Wiki Farm.
>>
>> There is a strange behaviour related to search requests that we did not
>> notice immediately.
>> It seems that the name of the virtual wiki is causing the problem. Its name
>> is "it" and it is used as a solution base for IT problems.
>>
>> We can reproduce the problem by:
>> - create a new virtual wiki with name "it" (without quotation marks).
>> - import xwiki-enterprise-wiki-2.7.xar
>> - search with lucene (no results):
>> .../view/Main/LuceneSearch?text=sandbox&space=
>> - search with old engine (see pages):
>> .../view/Main/WebSearch?text=sandbox&space=
>>
>> We tried to change the analyzer in xwiki.cfg:
>> xwiki.plugins.lucene.analyzer=org.apache.lucene.analysis.de.GermanAnalyzer
>> -> no success
>>
>> Our virtual wikis are mapped via virtual path (xwiki.cfg:
>> xwiki.virtual.usepath=1).
>>
>> It would be nice to keep the virtual wiki name. Is there a workaround to
>> handle this problem?
>>
>> Regards,
>> Rudolf
--
Sergiu Dumitriu
http://purl.org/net/sergiu/