Vincent Massol wrote:
Sounds good to me. Do you want to add this on dev.xwiki.org in the dev
practices?
Yes.
Do we have examples for 5? Velocity template files maybe (although I'd
suggest allowing only ISO chars in them)?
As I said in the first paragraph, JavaScript extensions and skin files.
Although people should use $msg.get for all strings, nothing stops a
dev/user from manually writing alert("ţâşpe €") in a custom app that
doesn't need to be localized, and that will fail if the file is saved
with a different encoding than the one the platform expects.
This is also the cause of http://jira.xwiki.org/jira/browse/XWIKI-2657,
and the need for a fixed encoding came up while reviewing this patch.
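
To make the failure concrete, here is a minimal sketch, not XWiki code:
the class and method names are hypothetical, and StandardCharsets
assumes Java 7 or later. It stores the alert string as UTF-8 bytes and
then decodes them twice, once with an explicit UTF-8 charset and once
as ISO-8859-1, standing in for a platform whose system encoding
differs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of XWiki.
class EncodingMismatchDemo {
    /** Decodes with an explicitly named charset, independent of the JVM default. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Stands in for a platform whose system encoding is ISO-8859-1. */
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The non-ASCII characters are written as escapes so this file stays ASCII-only.
        String original = "alert(\"\u0163\u00e2\u015fpe \u20ac\")";
        // What ends up on disk when the skin file is saved as UTF-8.
        byte[] stored = original.getBytes(StandardCharsets.UTF_8);

        System.out.println(original.equals(decodeUtf8(stored)));   // prints true
        System.out.println(original.equals(decodeLatin1(stored))); // prints false
    }
}
```

The second read does not throw; it silently yields different
characters, which is why the mismatch tends to be noticed only once the
text reaches the browser.
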
On Nov 10, 2008, at 4:39 PM, Sergiu Dumitriu wrote:
> Hi devs,
>
> Until now, filesystem resources were not forced to a specific encoding
> (except ResourceBundle translation resources, which are forced by the
> spec to contain only ISO-8859-1 characters and Unicode escapes). Since
> the number of non-ASCII files was kept at 0, a policy wasn't needed.
> However, it is better to set a rule, in case third-party developers
> need to place non-ASCII characters in source files, such as JavaScript
> or CSS extensions and skin files. So, here are some proposed rules we
> should make public on our dev site, and follow ourselves.
>
> 1. All Java source files must contain only ASCII chars, with Unicode
> escapes inside strings when needed, and XML entities in Javadoc. Since
> we don't use @author tags, this should not be a problem.
>
> 2. All translation files must contain only ASCII chars and Unicode
> escapes (stronger than the spec).
>
> 3. All wiki document sources must be stored in UTF-8.
>
> 4. Other XML files should always specify their encoding in the <?xml
> header, and it should be UTF-8 whenever possible.
>
> 5. All other textual resources must be stored in UTF-8, minimizing the
> use of non-ASCII chars.
>
>
> The changes are that:
> 1: This is the practice we were already using, but we didn't have a
> written rule on this.
> 2: This is the practice we were already using, but we didn't have a
> written rule on this, except in the "Contributing" page.
> 3: Wiki sources are currently in ISO-8859-1 because our default package
> ships with that encoding, and XML exports are usually done from the
> default package. This is not really a problem, since the XML reader can
> detect and use the encoding specified inside the document itself.
> 4: Not a strong requirement, but a suggestion only. Most of our XMLs are
> currently using ISO-8859-1, but since they only contain ASCII chars, it
> doesn't really make a difference.
> 5: There was no rule on this, and the resources were always read using
> the system encoding, which means that our package is not 100% portable
> now, unless we force people to set a specific JVM encoding. I'd like to
> force UTF-8 as the encoding for this kind of resource, since it is hard
> to represent all the characters in 8-bit encodings.
>
> WDYT?
>
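
For rules 1 and 2, a minimal sketch of how non-ASCII text could be
checked for and turned into Unicode escapes, similar in spirit to what
the JDK's native2ascii tool does. The class and method names are
hypothetical, and the sketch escapes each UTF-16 code unit separately,
so characters outside the BMP come out as two escapes.

```java
// Hypothetical helper, not part of XWiki.
class AsciiEscaper {
    /** Returns true when every char is in the 7-bit ASCII range. */
    static boolean isAscii(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    /** Replaces each non-ASCII code unit with a backslash, the letter u and four hex digits. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Source kept ASCII-only by using escapes for the sample string.
        String sample = "\u0163\u00e2\u015fpe \u20ac";
        System.out.println(isAscii(sample));         // prints false
        System.out.println(isAscii(escape(sample))); // prints true
        System.out.println(escape(sample));
    }
}
```

A check like isAscii could run over Java sources and translation files
at build time, but that is only a suggestion, not something the rules
above require.
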
--
Sergiu Dumitriu
http://purl.org/net/sergiu/