Hi Sergiu,

I think the API does not have to be changed, only the implementation of the API.

I'm currently running a version of xwiki that allows docs in French and Japanese (including doc titles/ids) with minimal changes. It's not accessible from outside though, so I can't let you test it...

It works well because it uses UTF-8, but it relies on the getURLEncoded method. The change I'm proposing will make xwiki work with other encodings as well, since the encoding is currently hardcoded to UTF-8.

That leaves the problem of the tests, which by their nature will need to be done at the functional level, and they may have to be run on several browsers to be sure they all handle URLs the same way. Do you have any means of running such high-level tests somewhere?

Regards, Gilles,

On 27 Feb. 07, at 01:26, Sergiu Dumitriu wrote:

How safe are these changes? Like, will any existing code require changes, will it work for any encoding, must any APIs be changed?

On 2/26/07, Gilles Serasset <Gilles.Serasset@imag.fr> wrote:
Hi all,

I'm currently working on allowing xwiki to manage documents (and
their urls) in utf-8 or other non latin1 encodings.

I saw that the method:

    public static String getURLEncoded(String content)
    {
        try {
            return URLEncoder.encode(content, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            return content;
        }
    }
in the XWiki class is hardcoded to UTF-8, which is strange as the default
encoding of xwiki is ISO Latin-1.
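
To make the mismatch concrete, here is a small sketch showing that the same title percent-encodes differently under the two charsets. The class name UrlEncodingDemo and the two-argument encode helper are illustrative only and are not part of xwiki; the helper mirrors the quoted method with the charset passed in as a parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the xwiki code base.
final class UrlEncodingDemo {

    // Same fallback behaviour as the quoted method, but the charset is a parameter.
    static String encode(String content, String charsetName) {
        try {
            return URLEncoder.encode(content, charsetName);
        } catch (UnsupportedEncodingException e) {
            return content;
        }
    }

    public static void main(String[] args) {
        String title = "Caf\u00e9"; // "Café"
        System.out.println(encode(title, "UTF-8"));      // Caf%C3%A9
        System.out.println(encode(title, "ISO-8859-1")); // Caf%E9
    }
}
```

A server that decodes request URLs as Latin-1 would read the two bytes of the UTF-8 form as two separate characters, which is the kind of breakage described here.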

The method is used in the core source:
1. To prepare "Content-disposition" headers for the responses (for
package export and file download)
    --> it encodes the filename for file downloads.
2. To generate ids of TOC in TOCGenerator

It is also used from Velocity macros (mainly editrights, to allow a
full URL, including its GET parameters, to be passed as a single
parameter value, usually for xredirect).
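
As an illustration of that xredirect use, the sketch below encodes a full URL so that it can travel as one parameter value. The helper name withRedirect and the example paths are assumptions made up for this sketch; only URLEncoder.encode is a real API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the xwiki code base.
final class RedirectParamDemo {

    // Appends redirectUrl to actionUrl as a single, fully encoded xredirect value.
    static String withRedirect(String actionUrl, String redirectUrl) {
        try {
            return actionUrl + "?xredirect=" + URLEncoder.encode(redirectUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

The "?", "=" and "&" of the inner URL become %3F, %3D and %26, so they cannot be confused with the separators of the outer URL.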

Hence it becomes a problem as soon as a document can have a URL
involving non-ASCII characters.

Currently, everything works because the encoded URLs do not include
non-ASCII characters, as the method is used in few places, but it will
cause problems even in a default wiki (i.e. Latin-1) setting. Moreover,
this method is static, so it cannot fetch the current
xwiki encoding.

So I propose to:

1. make this method non-static and use the xwiki configuration to
specify the encoding to be used...
2. propose a way to encode Content-Disposition filenames that is
compatible with RFC 2231, which allows filenames to be specified
even if they contain non-ASCII characters (names in Japanese or Thai,
for instance...)
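
A minimal sketch of both points, under stated assumptions: the class EncodingAwareHelper, its constructor argument and the contentDisposition method are hypothetical names, and the encoding string stands in for whatever the xwiki configuration would provide. It is not the actual patch. The header uses the RFC 2231 extended form filename*=charset''percent-encoded-bytes, and conservatively escapes every byte except ASCII letters, digits, "-", "." and "_".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the xwiki code base.
final class EncodingAwareHelper {
    private static final String HEX = "0123456789ABCDEF";

    // Assumed to come from the wiki configuration, e.g. "UTF-8" or "ISO-8859-1".
    private final String encoding;

    EncodingAwareHelper(String encoding) {
        this.encoding = encoding;
    }

    // Point 1: non-static, uses the configured encoding instead of a constant.
    String getURLEncoded(String content) {
        try {
            return URLEncoder.encode(content, encoding);
        } catch (UnsupportedEncodingException e) {
            return content;
        }
    }

    // Point 2: RFC 2231 style Content-Disposition value.
    // URLEncoder is not used here because it turns spaces into "+".
    String contentDisposition(String filename) {
        byte[] bytes;
        try {
            bytes = filename.getBytes(encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        StringBuilder sb = new StringBuilder("attachment; filename*=");
        sb.append(encoding).append("''"); // the language tag is left empty
        for (byte b : bytes) {
            int c = b & 0xFF;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_';
            if (safe) {
                sb.append((char) c);
            } else {
                sb.append('%').append(HEX.charAt(c >> 4)).append(HEX.charAt(c & 0x0F));
            }
        }
        return sb.toString();
    }
}
```

For example, new EncodingAwareHelper("UTF-8").contentDisposition("résumé 1.pdf") yields attachment; filename*=UTF-8''r%C3%A9sum%C3%A9%201.pdf. Whether every browser understands this form is exactly what would need testing.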

Does anyone object to this proposal?

Regards, Gilles,
--
Gilles Sérasset
GETA-CLIPS-IMAG (UJF, INPG & CNRS)
BP 53 - F-38041 Grenoble Cedex 9
Phone: +33 4 76 51 43 80
Fax:   +33 4 76 44 66 75



--
http://purl.org/net/sergiu

