On 04/23/2010 08:50 AM, Denis Gervalle wrote:
On Fri, Apr 23, 2010 at 03:32, Sergiu Dumitriu <sergiu(a)xwiki.com> wrote:
On 04/06/2010 05:03 PM, Vincent Massol wrote:
Hi Milind,
On Apr 6, 2010, at 5:00 PM, Milind Kamble wrote:
> Denis,
> I understand your point that XE, being used globally, needs to support
> more than the ASCII char set.
> While the new reference model matures, could you clarify whether an
> underscore in a file name would break the functionality under the current
> model, where the attachment name is used as a reference for attachments?
> If not, would it be possible to stop stripping just the underscore chars
> and push that fix into the next XE release? I am OK with space chars
> getting stripped.
I don't think that underscores are a problem even with the old "reference
as string" code. Actually I don't even know why we're stripping them. Sergiu
might know more. Any idea Sergiu?
This is the issue that started it: XWIKI-2087
So, there were three main problems:
1. Impossible to actually restore the attachment from the database since
the ID was generated using the hash of the original, correct name, yet
it was stored using the broken name, with ? instead of non-latin1
characters
2. Impossible to link to such an attachment, since a non-UTF wiki would
encode non-ASCII chars to their &#xyz; escapes, and the filename wasn't
decoded when trying to get the attachment from the database
3. Encoding bug in the old WYSIWYG which composed the URL using a wrong
encoding
3 should be fixed since we're forcing UTF-8 in URLs.
2 and 1 should work if the wiki+database are using UTF8, but they might
still fail in latin1.
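
To make problems 1 and 2 concrete, here is a minimal Python sketch. It is not the XWiki code: toy_id, store_as_latin1 and escape_non_ascii are hypothetical helpers, and CRC32 only stands in for whatever hash the real ID generation uses. It just shows how a name that loses characters on the way to a latin1 store no longer matches the ID computed from the correct name, and how an escaped name has to be decoded before the lookup.

```python
import html
import zlib


def toy_id(name: str) -> int:
    """Hypothetical stand-in for the attachment ID hash (not XWiki's real one)."""
    return zlib.crc32(name.encode("utf-8"))


def store_as_latin1(name: str) -> str:
    """Simulate a latin1 store: characters outside latin1 are replaced by '?'."""
    return name.encode("latin-1", errors="replace").decode("latin-1")


def escape_non_ascii(name: str) -> str:
    """Simulate a non-UTF wiki turning non-ASCII chars into &#NNN; escapes."""
    return name.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


original = "日本語_notes.pdf"

# Problem 1: the ID is computed from the correct name, but the stored
# name is the broken one, so recomputing the ID from the stored name
# is not expected to match.
stored = store_as_latin1(original)
print(stored)  # ???_notes.pdf
print(toy_id(original) == toy_id(stored))

# Problem 2: the link carries the escaped name; comparing it directly
# with the real name fails unless the escapes are decoded first.
escaped = escape_non_ascii(original)
print(escaped == original)                 # False
print(html.unescape(escaped) == original)  # True
```

Note that a plain ASCII name with underscores, such as "my_file.pdf", passes through both helpers unchanged, which is consistent with underscores not being part of these encoding problems.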
Should we really support non-UTF-8 configurations? We have already lost so
much time with these encoding issues, and I really do not understand the
advantage of supporting non-UTF-8 environments.
Legacy. Maybe if we can provide a nice and quick guide for converting
a latinX installation to UTF-8, we'd be allowed to require UTF-8.
We could announce that from 2.5 onwards UTF-8 will be mandatory, if we
decide to go this way. Maybe the most important latin1 installation is
xwiki.org itself.
The most problematic thing is that by default MySQL databases come as
latin1 (in most distributions, although my Gentoo makes it utf8), and
this is one of the most frequent sources of encoding problem reports.
>> Thanks
>> -Vincent
>>
>>> ________________________________
>>> From: Denis Gervalle<dgl(a)softec.lu>
>>> To: XWiki Developers<devs(a)xwiki.org>
>>> Sent: Tue, April 6, 2010 8:30:34 AM
>>> Subject: Re: [xwiki-devs] Simple patch to enable/preserve underscore chars in attachment file names
>>>
>>> On Tue, Apr 6, 2010 at 14:02, Guillaume Lerouge <guillaume(a)xwiki.com> wrote:
>>>
>>>> Hi Milind,
>>>>
>>>> On Tue, Apr 6, 2010 at 1:23 AM, Milind Kamble <mbkads(a)yahoo.com> wrote:
>>>>
>>>>> Hi. I would like the dev community to evaluate this simple fix that
>>>>> will enable uploading of files with underscore chars in the file name
>>>>> when users perform the attach action. Our user community is quite
>>>>> impressed about the refreshing ease of use and the power, flexibility
>>>>> in their collaboration work flow made possible by XE. They would like
>>>>> to escape the tyranny of Microsoft-MOSS as early as possible and the
>>>>> main roadblock to do so is the stripping of space and underscores from
>>>>> file names which were created in a MS-Office centric environment.
>>>>>
>>>>
>>>> I can't do much about your underscore problem (though I promise I'll
>>>> poke the developer sitting right next to me so that he looks at it).
>>>>
>>>
>>> I was already aware of this issue, and I have had similar problems with
>>> attachments, not only with "_" but also with accented chars, etc.
>>> Restrictions on attachment names will be easier to change once the new
>>> model using references is fully in place, since attachment names are
>>> currently used as references for attachments. Be sure I will take care
>>> to have it improved.
--
Sergiu Dumitriu
http://purl.org/net/sergiu/