On Feb 15, 2008, at 2:49 PM, Sergiu Dumitriu wrote:
Hi devs,
We need to decide how to handle the charset/encoding in XWiki. We have
3 options:
1. Leave it as it is. The default is ISO-8859-1, and the admin has to
make sure the JVM is started with the correct -Dfile.encoding param.
If another encoding is needed, it has to be changed in 4 places
(web.xml, xwiki.cfg, -Dfile.encoding, database charset+collation).
2. Force it to always be UTF-8, overriding the file.encoding setting.
This ensures internationalization, as UTF-8 works with any language.
And I think it is a safe move, as any modern system supports UTF-8
(given that XWiki requires Java 5, we can assume it will be running on
a modern system). This has the advantage that the code will be simpler,
as we don't have to check and switch encodings, but has the
disadvantage that MySQL has to be manually configured for UTF-8, as by
default it comes in latin1.
Isn't this a problem with databases, which are configured for
ISO-8859-1 by default most of the time?
Same question for the servlet container.
I can't vote until I know the answers to these 2 questions.
Thanks
-Vincent
PS: As a principle I don't like hard-coding anything, so if these
questions are answered satisfactorily I'll be OK with it, but with a
single config parameter set to UTF-8 by default in xwiki.cfg.
3. Keep it configurable, but specify it in only one place (xwiki.cfg
or web.xml), and enforce that encoding in the JVM (by overriding
file.encoding). The default should be UTF-8.
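To make the "one place, default UTF-8" idea concrete, here is a minimal
Java sketch. It is only an illustration, not XWiki code: the property
key xwiki.encoding and the class EncodingSketch are hypothetical names,
and it is written against a current JDK rather than Java 5. It reads the
encoding from a single Properties object, falls back to UTF-8, and
always passes the charset explicitly instead of relying on the JVM
default.

```java
import java.nio.charset.Charset;
import java.util.Properties;

public class EncodingSketch {
    // Hypothetical key: the thread does not say what the setting would be called.
    static final String ENCODING_KEY = "xwiki.encoding";

    /** Resolve the encoding from one config source, defaulting to UTF-8. */
    static Charset resolveCharset(Properties cfg) {
        String name = cfg.getProperty(ENCODING_KEY, "UTF-8");
        return Charset.forName(name);
    }

    /** Encode and decode with an explicit charset, never the JVM default. */
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Empty config, so the UTF-8 default applies.
        Charset configured = resolveCharset(new Properties());
        Charset latin1 = Charset.forName("ISO-8859-1");

        // Romanian and Japanese characters that ISO-8859-1 cannot represent.
        String sample = "\u0219i \u65e5\u672c\u8a9e";

        System.out.println(configured.name() + " preserves text: "
                + sample.equals(roundTrip(sample, configured)));
        System.out.println(latin1.name() + " preserves text: "
                + sample.equals(roundTrip(sample, latin1)));
    }
}
```

With the default, the sample text survives the round trip; with
ISO-8859-1 the unmappable characters are replaced during encoding, which
is the data loss the four-places setup in option 1 risks when one of the
settings is missed.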
Here's my +1 for option 2, -1 for option 1, and 0 for option 3.
--
Sergiu Dumitriu
http://purl.org/net/sergiu/