+1 for UTF-8 only.
OK to wait until 5.0 unless there are good migrations.
To be able to answer, I need to understand more. For
example, what currently doesn't work with any encoding the user wants to use?
Shouldn't we just be transparent and use whatever encoding is specified and not
hardcode anything?
Unfortunately, this has been a dream that has never come true, and it has never been
sufficiently tested.
I know for sure that both Solr and XWiki will remain imperfect in some edge cases, which
few people test, whenever the encoding configured at the application level differs from
the platform encoding.
XWiki may well have thought this through, since its developers often pay attention to
deprecations, but there are a zillion other places where this is not done as
cleanly.
Sergiu, I understand your proposal as including a validation of the environment at startup,
is that correct? Would it also include a validation of the DB capabilities?
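To make the question concrete, here is a minimal sketch of what such a startup check could look like for the JVM part only. The class and method names are hypothetical and are not existing XWiki code; it only inspects the JVM default charset and says nothing about the container or the database.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of XWiki: checks the JVM default charset only.
public class StartupEncodingCheck {
    /** Returns true when the given charset is UTF-8. */
    public static boolean isUtf8(Charset charset) {
        return StandardCharsets.UTF_8.equals(charset);
    }

    /** Builds a short status message for the given charset. */
    public static String describe(Charset charset) {
        return isUtf8(charset)
            ? "OK: default charset is UTF-8"
            : "WARNING: default charset is " + charset.name() + ", expected UTF-8";
    }

    public static void main(String[] args) {
        // Charset.defaultCharset() reflects the charset the JVM was started with.
        System.out.println(describe(Charset.defaultCharset()));
    }
}
```

Whether a failed check should only log a warning or refuse to start is a separate decision that this sketch does not take.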
paul
We've moved more and more toward a UTF-8-only application, and XWiki
has only been tested with this configuration for several years.
I propose that we require UTF-8 for a valid, supported installation.
This means:
- JVM encoding (-Dfile.encoding=UTF8)
- Container default URL encoding (Tomcat has ISO-8859-1 by default)
- Database encoding (MySQL is still configured with latin1 on some distros)
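
As a rough illustration of why the container URL encoding matters, the sketch below decodes the same percent-encoded string with two charsets. The class name is made up for the example, and it uses only the standard `java.net.URLDecoder`, not any container API. A client that sends UTF-8 bytes gets the expected text back only when the server also decodes as UTF-8; decoding as ISO-8859-1 turns each multi-byte character into two wrong ones.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: shows the effect of a decoding charset mismatch.
public class UrlDecodingDemo {
    /** Decodes a percent-encoded string using the named charset. */
    public static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "%C3%A9" is the UTF-8 byte sequence for U+00E9 (e with acute accent).
        String encoded = "caf%C3%A9";
        System.out.println(decode(encoded, "UTF-8"));      // four characters
        System.out.println(decode(encoded, "ISO-8859-1")); // five characters, garbled
    }
}
```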
There's one big site to update on our side: xwiki.org.
Here's my +1. This is a move toward a future web, since more and more
standards require (or at least assume as a default) UTF-8.
After thinking a bit more, it would make sense to require a valid
Unicode encoding, including UTF-16, which is preferable in countries
that don't use a Latin alphabet. However, XWiki doesn't currently work
under 16-bit encodings at all.
For XWiki 4.x I'm -1, since it's a big change and we don't want to break
users who currently run 4.x with ISO-8859-1, for example.
For XWiki 5.x I'm not sure.