On 06/10/2013 11:00 AM, Thomas Mortagne wrote:
Hi devs,
Right now the XAR plugin format goal systematically empties the
<defaultLanguage> property.
This is wrong IMO since it means we have no idea what the default
document language is. It was not too visible before, but it's really not
very nice for things like the localization module and especially Solr,
which stores the content differently depending on the language (stop
words, etc.).
I see several possibilities:
1) We don't touch the XAR Maven plugin and we state that when the default
language is not set, it's en (in the importer for example or in
XWikiDocument#getDefaultLanguage)
2) We stop filtering the default language in the XAR plugin and we set it
to en for all documents in which it makes sense
3) We force default language to "en" in the XAR plugin
WDYT ?
I don't like 1) too much, since some technical documents could really be
seen as having no default language, for example documents without any
literal content. But it's more a -0 than a -1; I understand others would
want this for simplicity.
About 3): as I said, having an empty default language is a valid use case
IMO, so -0 for this one too. Still a bit better than 1) since the use
case is still possible.
+1 for 2)
None of these options is good in general. The main problem is that most
documents are written in the "Velocity" language, not in the "English"
language, meaning that they only contain code (which won't be seen by the
user) and translations, which depend on a lot of factors. It's not good
to say that the default language of a dynamically translated document is
en, since a wiki configured with a different language will only display
such documents in that language, never in en.
There are only a few documents that contain real text (normally only the
sandbox should have real text, everything else should be localized), and
for those it's OK to specify the actual language.
Other options:
4) Somehow detect localized documents and index:
- the raw content using a non-language-specific analyzer
- the content translated into all the languages registered in the
administration, each with the proper language-specific analyzer, if they
are supported by Solr; this includes the default wiki language.
4a) localized document = the default language is empty
4b) localized document = the default language is literally "localized"
4c) add another document flag for marking localized documents
5) When the defaultLanguage is empty, render in the configured wiki
default language
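
To make options 4) and 5) a bit more concrete, here is a rough sketch of
the decision logic. Everything in it is hypothetical: the class name, the
method names and the "generic" analyzer label are made up for this mail
and are not existing XWiki or Solr APIs. For brevity it treats both an
empty default language (4a) and the literal "localized" (4b) as the
marker; a real implementation would pick one of them.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch only; none of these names exist in XWiki or Solr. */
final class LocalizedIndexPlan {
    /** Marker value proposed in 4b). */
    static final String LOCALIZED = "localized";

    /** Made-up label for the non-language-specific analyzer. */
    static final String GENERIC = "generic";

    /** 4a) empty default language, or 4b) the literal "localized". */
    static boolean isLocalized(String defaultLanguage) {
        return defaultLanguage == null
            || defaultLanguage.isEmpty()
            || LOCALIZED.equals(defaultLanguage);
    }

    /** 5) a localized document is rendered in the wiki default language. */
    static String effectiveLanguage(String defaultLanguage, String wikiDefaultLanguage) {
        return isLocalized(defaultLanguage) ? wikiDefaultLanguage : defaultLanguage;
    }

    /**
     * 4) Lists the analyzers under which a document would be indexed.
     * A localized document gets the generic analyzer for its raw content,
     * plus one entry per registered language (wiki default included)
     * that has a language-specific analyzer available.
     */
    static List<String> analyzersFor(String defaultLanguage, String wikiDefaultLanguage,
        List<String> registeredLanguages, Set<String> supportedAnalyzers) {
        List<String> result = new ArrayList<>();
        if (!isLocalized(defaultLanguage)) {
            // Real text: index once, in its own language when possible.
            result.add(supportedAnalyzers.contains(defaultLanguage) ? defaultLanguage : GENERIC);
            return result;
        }
        result.add(GENERIC);
        // LinkedHashSet keeps insertion order and drops duplicates.
        Set<String> languages = new LinkedHashSet<>();
        languages.add(wikiDefaultLanguage);
        languages.addAll(registeredLanguages);
        for (String language : languages) {
            if (supportedAnalyzers.contains(language)) {
                result.add(language);
            }
        }
        return result;
    }
}
```

For a wiki whose default language is fr, with en, fr and one language
without a dedicated analyzer registered, a localized document would be
planned for the generic analyzer plus fr and en, while a sandbox page
marked en would be indexed once with the en analyzer.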
I like 4) since it makes localized documents really searchable in all
the languages "supported" by that wiki instance.
4a) is a behavior change, so it might cause some trouble
4b) is the safest and requires the least amount of changes
The number of document fields is increasing, so I'm not that fond of 4c)
--
Sergiu Dumitriu
http://purl.org/net/sergiu