Hi Zeljko,
On Apr 7, 2007, at 7:49 PM, Zeljko Trogrlic wrote:
[snip]
3) I see that in our standalone installation we use
-Dfile.encoding=iso-8859-1. Now that I've read Joel's tutorial it
seems to me this is not going to work for everyone and that we
should rather use -Dfile.encoding=UTF-8 by default. WDYT?
That is a problem if it's not your default encoding. You have two
options:
* use platform default encoding and don't use non-ASCII characters
in default configuration
* use UTF-8
Although UTF-8 sounds better, note that:
* you need an editor that supports it, otherwise the local encoding
will creep in
* the encoding must be set manually because it can't be detected
for plain text files
* you have to communicate this very clearly to users
* text will look funny in a non-UTF-8 editor and it will be hard to
change it
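
To make the "text will look funny" point concrete, here is a minimal sketch (not XWiki code; the class and method names are invented for this illustration). It encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is roughly what a non-UTF-8 editor or a JVM started with a different file.encoding would do. It assumes a JDK recent enough to have java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    /** Encodes text as UTF-8 bytes, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        String misread = utf8ReadAsLatin1(original);
        // The accented letter takes two bytes in UTF-8, so the misread text is one char longer.
        System.out.println(original.length() + " chars became " + misread.length());
    }
}
```

Pure ASCII text survives this round trip unchanged, which is why the first option (no non-ASCII characters in the default configuration) sidesteps the problem.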
Let's look at the files xwiki manipulates:
- config files. These should only contain ASCII characters, with
Unicode escapes where non-ASCII characters are needed, as with
resource bundles for example. Thus any encoding will work there.
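
As an illustration of why such files are safe, here is a minimal sketch assuming a java.util.Properties-style file (the class name and the "greeting" key are made up for this example). The escape sequence is plain ASCII on disk, so the bytes are identical in ASCII, ISO-8859-1 and UTF-8, and Properties.load turns the escape back into the real character.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Properties;

class AsciiOnlyConfigDemo {
    /** Loads properties from raw file bytes and returns the hypothetical "greeting" key. */
    static String readGreeting(byte[] fileBytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        // The file holds a six-character escape, not the accented letter itself.
        String fileText = "greeting=Fran\\u00e7ais\n";
        byte[] asAscii = fileText.getBytes(StandardCharsets.US_ASCII);
        byte[] asUtf8 = fileText.getBytes(StandardCharsets.UTF_8);
        System.out.println("same bytes in both encodings: " + Arrays.equals(asAscii, asUtf8));
        System.out.println(readGreeting(asAscii));
    }
}
```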
- XAR files. If these are created with XWiki (with an export) they'll
use the file.encoding specified so if it's utf8 they'll be saved in
utf8. In addition, I propose that in our build we run native2ascii
for all our data files (including the XAR files). This can be done
automatically and easily with Maven. So all XAR files the XWiki team
provides should work with any encoding.
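
For readers unfamiliar with native2ascii, here is a rough sketch of the kind of conversion it performs: every character outside the ASCII range is replaced by a backslash-u escape with four hex digits. This is an illustration only, with an invented class name; it is not the JDK tool itself nor the proposed Maven setup, and the real tool's exact output formatting may differ.

```java
class Native2AsciiSketch {
    /** Replaces each non-ASCII UTF-16 code unit with a backslash-u escape of four hex digits. */
    static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c); // ASCII passes through untouched
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAsciiEscapes("caf\u00e9 au lait"));
    }
}
```

The result contains only ASCII bytes, so it reads the same whatever file.encoding the consuming JVM uses, provided the consumer understands such escapes.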
- java files: should be using only ascii chars
That's about it I think.
[snip]
However, I would rather use
http://jakarta.apache.org/commons/io/api-release/org/apache/commons/io/IOUtils.html#toString(java.io.InputStream)
than code it ourselves... Sounds safer, shorter, less maintenance,
etc to me... :)
If it adds value. I think that XWiki is plagued with different
libraries doing the same thing or adding a small amount of
functionality. This makes it harder to analyse.
I'm not sure I would have used the word "plagued" which has a negative
connotation... I would rather have said: "thanks to the effort of
others in OSS we have been able to develop XWiki to a level we
wouldn't have been able to reach otherwise... This allows us to
reduce our maintenance efforts, our documentation efforts and our
testing efforts..." :-)
Now if you notice 2 libraries used in XWiki that do the same thing
let us know so that we can all decide if we want to remove one and
only use one. I'd be in favor of that wherever possible.
I've noticed a few places myself where the wrong library was chosen
IMO (like when we use Jakarta ECS for something completely
unrelated). There are also places where the choice was historical:
like using ORO when regex support is now in the JDK since 1.4 (this
has already been identified).
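
For reference, a minimal sketch of the JDK's built-in java.util.regex API that could stand in for ORO. The class name and the pattern are hypothetical and chosen only to tie in with the file.encoding discussion; this is not taken from the XWiki code base.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JdkRegexDemo {
    // Hypothetical pattern: captures the charset name from a -Dfile.encoding=... JVM flag.
    private static final Pattern ENCODING_FLAG =
            Pattern.compile("-Dfile\\.encoding=([\\w-]+)");

    /** Returns the encoding named on the command line, or null if the flag is absent. */
    static String extractEncoding(String commandLine) {
        Matcher m = ENCODING_FLAG.matcher(commandLine);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractEncoding("java -Dfile.encoding=UTF-8 -jar app.jar"));
    }
}
```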
Another place to avoid local encoding: some source code files
contain French characters, which are messed up on non-8859-1
platforms.
Ah we need to track these down. Could you please let us know which
files?
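
One possible way to track them down, sketched under the assumption that any byte above 0x7F in a .java file is suspect (class and method names are invented for this example, and it needs a JDK with java.nio.file and streams):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonAsciiSourceFinder {
    /** True if any byte is outside the 7-bit ASCII range. */
    static boolean containsNonAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) { // bytes 0x80..0xFF are negative as signed Java bytes
                return true;
            }
        }
        return false;
    }

    /** Walks a source tree and lists the .java files that contain non-ASCII bytes. */
    static List<Path> findNonAsciiJavaFiles(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .filter(p -> {
                        try {
                            return containsNonAscii(Files.readAllBytes(p));
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    })
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Working on raw bytes keeps the check independent of whichever encoding the files were actually saved in.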
Thanks
-Vincent