Hi Zeljko,

On Apr 7, 2007, at 7:49 PM, Zeljko Trogrlic wrote:

[snip]
3) I see that in our standalone installation we use -Dfile.encoding=iso-8859-1. Now that I've read Joel's tutorial it seems to me this is not going to work for everyone and that we should rather use -Dfile.encoding=UTF-8 by default. WDYT?
That is a problem if it's not your default encoding. You have two options:

* use the platform default encoding and don't use non-ASCII characters in the default configuration
* use UTF-8
Although UTF-8 sounds better, note that:

* you need an editor that supports it, otherwise the local encoding will creep in
* the encoding must be set manually, because it can't be detected for plain text files
* you have to communicate this very clearly to users
* text will look funny in a non-UTF-8 editor and it will be hard to change
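To make the "text will look funny" point concrete, here is a minimal sketch of what happens when UTF-8 bytes are decoded with ISO-8859-1. The class and method names are hypothetical and nothing here is taken from the XWiki code base; it only uses the JDK's own charset support.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of XWiki.
final class EncodingMismatchDemo {

    /** Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1 (the mismatch case). */
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with the same charset, which always gives the original text back. */
    static String utf8RoundTrip(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The source stays pure ASCII: the accented e is written as a Unicode escape.
        String original = "caf\u00e9";
        System.out.println("round trip intact: " + utf8RoundTrip(original).equals(original));
        System.out.println("chars after mismatch: " + utf8ReadAsLatin1(original).length());
    }
}
```

Pure ASCII text survives the mismatch unchanged, which is why the first option (platform default plus ASCII-only content) is safe, while any accented character turns into two characters.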
Let's look at the files XWiki manipulates:

- config files. These should only contain ASCII characters, plus Unicode escapes (\uXXXX) where there's a need, as with resource bundles for example. Thus any encoding will work there.
- XAR files. If these are created with XWiki (with an export) they'll use the file.encoding specified, so if it's UTF-8 they'll be saved in UTF-8. In addition, I propose that in our build we run native2ascii on all our data files (including the XAR files). This can easily be done automatically with Maven. So all XAR files the XWiki team provides should work with any encoding.
- java files: these should use only ASCII chars.

That's about it I think.

[snip]
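As a rough illustration of what a native2ascii-style pass does to a data file, here is a minimal sketch. It is not the JDK tool itself and not a proposed build step; the class and method names are made up, and it only handles the simple case of replacing each non-ASCII UTF-16 code unit with a backslash-u escape. Keep in mind that this escape form is what resource bundles and Java source understand; whether it suits every file format in the build would need checking separately.

```java
// Hypothetical helper, loosely modelled on what native2ascii produces.
final class AsciiEscapeSketch {

    /** Replaces every char above 0x7F with a backslash-u escape of its UTF-16 code unit. */
    static String toAsciiEscapes(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                // Four lowercase hex digits, zero padded.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input contains an o with circumflex; the output is pure ASCII.
        System.out.println(toAsciiEscapes("Bient\u00f4t"));
    }
}
```

Once a file has gone through such a pass it contains only ASCII bytes, so it reads the same under iso-8859-1, UTF-8 or any other ASCII-compatible file.encoding.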
However, I would rather use http://jakarta.apache.org/commons/io/api-release/org/apache/commons/io/IOUtils.html#toString(java.io.InputStream) than code it ourselves... Sounds safer, shorter, less maintenance, etc to me... :)
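For comparison, a hand-coded stream-to-string helper looks roughly like the sketch below. The snipped code is not shown in this thread, so this is only a guess at the kind of helper under discussion, written with the JDK alone; the class and method names are hypothetical and this is not the Commons IO implementation. One point worth checking against the Commons IO Javadoc: the single-argument IOUtils.toString(InputStream) decodes with the platform default encoding, as far as its documentation goes, so the overload that takes an encoding is probably the one that matters for this discussion.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical hand-rolled equivalent, shown only to illustrate the trade-off.
final class StreamToStringSketch {

    /** Reads the whole stream and decodes it with the given charset instead of the platform default. */
    static String readFully(InputStream in, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            out.append(buffer, 0, read);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String text = readFully(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println("decoded correctly: " + text.equals("caf\u00e9"));
    }
}
```

Passing the charset explicitly keeps the result independent of whatever -Dfile.encoding the JVM was started with.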
If it adds value. I think that XWiki is plagued with different libraries that do the same thing or add only a small amount of functionality. This makes it harder to analyse.
I'm not sure I would have used the word "plagued", which has a negative connotation... I would rather have said: "thanks to the effort of others in OSS we have been able to develop XWiki to a level we wouldn't have been able to reach otherwise... This allows us to reduce our maintenance efforts, our documentation efforts and our testing efforts..." :-)

Now if you notice 2 libraries used in XWiki that do the same thing, let us know so that we can all decide whether to remove one and keep only the other. I'd be in favor of that wherever possible. I've noticed a few places myself where IMO the wrong library was chosen (like when we use Jakarta ECS for something completely unrelated). There are also places where the choice was historic, like using ORO when regex support is now in JDK 1.4 (this has already been identified).
Another place to avoid local encoding: some source code files contain French characters, which get messed up on non-8859-1 platforms.
Ah we need to track these down. Could you please let us know which files?

Thanks
-Vincent