Paul Libbrecht wrote:
On 18 Jan 08, at 00:05, Your XEN ICT Team - Ricardo Rodriguez wrote:
1 - I already have a good load of content, can I switch to UTF-8
without wrecking the wiki?
As far as I understand, all involved actors must run
the same encoding.
It is just a guess, but if your database and your XWiki run with
different encodings, URLs with characters like á, ñ, or / won't work.
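
To illustrate the kind of breakage guessed at here, a minimal sketch
(not XWiki code; the class and method names are made up for this
example) that percent-encodes a character with one charset and decodes
it with another, the way a mismatched front end and back end would:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of XWiki.
final class UrlEncodingMismatch {

    // Percent-encode text using the named charset (the "sending" side).
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Decode a percent-encoded string using the named charset (the "receiving" side).
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // n with tilde, written as an escape so the source file encoding does not matter
        String name = "\u00f1";
        String sent = encode(name, "UTF-8");          // two bytes, C3 B1
        String received = decode(sent, "ISO-8859-1"); // each byte read as its own character
        System.out.println(sent);                     // prints %C3%B1
        System.out.println(received.equals(name));    // prints false
        System.out.println(received.length());        // prints 2
    }
}
```

When both sides use the same charset name the round trip is lossless,
which is the point of "all involved actors must run the same encoding".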
I haven't tried (and really do not understand why anyone on earth is
sticking with iso-8859-1, except for wrongly configured editors such as
emacs), but:
I know that the DB installation instructions say that the DB should be
configured for Unicode, so that the Java strings <-> DB conversion is
safe for any encoding.
XML import is also bulletproof there: the encoding value is written in
the XML header.
Where you change the encoding is at the front end (delivery, hence
input), and there I think you can change it right away: the re-encoding
happens in Java in all cases, to any encoding.
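
A small sketch of what "the re-encoding happens in Java" amounts to,
assuming nothing about XWiki internals (the class and method names are
invented for illustration): bytes are decoded into a Java string, which
is Unicode internally, and the string is then encoded into whatever
charset the output needs. The only catch is a target charset that
cannot represent a character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of XWiki.
final class ReencodeDemo {

    // Decode bytes from one charset into a String, then encode into another.
    static byte[] reencode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from);
        return text.getBytes(to);
    }

    // True if every character of text can be represented in the target charset.
    static boolean fitsIn(String text, Charset target) {
        return target.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xF1}; // n with tilde in ISO-8859-1
        byte[] utf8 = reencode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 2

        // The euro sign (U+20AC) exists in UTF-8 but not in ISO-8859-1.
        System.out.println(fitsIn("\u20ac", StandardCharsets.UTF_8));      // prints true
        System.out.println(fitsIn("\u20ac", StandardCharsets.ISO_8859_1)); // prints false
    }
}
```

Going from iso-8859-1 to UTF-8 is therefore lossless; only the other
direction can drop characters.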
My 2p.
paul
Hi,
I agree with you that 8859-1 is not the best encoding for a default, and
that UTF-8 should be the default if we want to see XWiki used in
countries outside the USA and Western Europe.
However, changing the encoding is not as simple as you say. The most
important things, external to XWiki, are:
- By default, some databases are configured in latin1 (like MySQL).
Setting the database to utf8 is something the system administrator must
do, as we cannot specify it from XWiki. We could specify the encoding of
each table, IF Hibernate supported this. But it does not, so not even
that is possible.
- The JVM uses the system encoding, if not manually specified. And
almost all operating systems are configured for 8859-1, maybe except
those that can be found in Asian countries. And all strings written are
by default converted to the system encoding, unless we override the
default. And changing the default JVM encoding, again, is something we
cannot do. We have an issue for overriding the initial encoding from the
platform with the one specified in xwiki.cfg or web.xml, but it is not
implemented yet. (I did it locally, but the code wasn't nice, so I had
to postpone this fix until I find a better solution).
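
To make the second point concrete, a minimal sketch (this is not the
postponed fix, and the class and method names are invented for
illustration): it reads the default charset the JVM picked up, and shows
that passing a charset explicitly when writing makes the output
independent of that default.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of XWiki.
final class ExplicitCharsetDemo {

    // Name of the charset the JVM uses when none is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // Write text through a Writer bound to an explicit charset,
    // so the result does not depend on the JVM default.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Depends on the machine and on -Dfile.encoding, so no fixed output here.
        System.out.println("JVM default: " + defaultCharsetName());

        String text = "\u00e1"; // a with acute accent
        System.out.println(write(text, StandardCharsets.UTF_8).length);      // prints 2
        System.out.println(write(text, StandardCharsets.ISO_8859_1).length); // prints 1
    }
}
```

The default itself can be set when the JVM is started, for example with
-Dfile.encoding=UTF-8, which is again something for the administrator
rather than for XWiki.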
We are planning to switch to UTF-8 by default, but there are several
changes needed in the platform. And more importantly, we need a lot of
testing to make sure that nothing gets broken by this change. Chances
are it will work, as I have set up several sites using UTF-8, and nobody
has reported problems so far. (There were several problems, but they
have been fixed already).
Sergiu