On Mon, Nov 10, 2014 at 12:03 PM, Eduard Moraru
<enygma2002(a)gmail.com> wrote:
Hi Marius,
On Mon, Nov 10, 2014 at 12:24 PM, Marius Dumitru Florea <
mariusdumitru.florea(a)xwiki.com> wrote:
I'm undecided. As a technical Linux user I prefer case sensitivity, but I
can see why this is sometimes unexpected for non-technical users. I'm not
sure how you plan to implement this. I know this thread is not about the
technical aspects but still I think it's important to consider the cost
that this change will imply.
First, even for a case-insensitive system, I think it's very important to
preserve the case entered by the user. For instance, if I create a user
with the alias 'myCoolAlias' then I wouldn't like to see 'mycoolalias'
displayed in the UI. Same, if I attach a file named
'myCoolPresentation.odp' then I want to see precisely that name on the list
of attachments. So we need to store case sensitive values in the database.
The difference from now will be:
* when creating an entity: check that there's no other entity that has the
same normalized reference (toLowerCase/toUpperCase)
* when retrieving an entity: look for normalized references
This means we'll have to call toLowerCase/toUpperCase very often so we need
proper database indexes. Otherwise we'll have a performance impact.
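To make the two steps above concrete, here is a minimal in-memory sketch in Python (not XWiki code). The `EntityStore` class and its method names are hypothetical, and plain lowercasing is assumed as the normalization. The name is stored as typed, while creation and retrieval both go through the normalized key.

```python
class EntityStore:
    """Keeps each reference as typed, but indexes it by a lowercased key."""

    def __init__(self):
        # normalized reference -> reference exactly as the user entered it
        self._by_key = {}

    @staticmethod
    def normalize(reference):
        # Assumption: plain lowercasing is the normalization rule.
        return reference.lower()

    def create(self, reference):
        # Creation step: refuse a reference whose normalized form is taken.
        key = self.normalize(reference)
        if key in self._by_key:
            raise ValueError(
                f"{reference!r} clashes with {self._by_key[key]!r}")
        self._by_key[key] = reference

    def get(self, reference):
        # Retrieval step: look up by normalized key, return the stored case.
        return self._by_key.get(self.normalize(reference))


store = EntityStore()
store.create("myCoolAlias")
print(store.get("MYCOOLALIAS"))  # prints myCoolAlias
```

In a database the dictionary key would correspond to an indexed normalized column or expression, which is where the index cost mentioned above comes in.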
Right. I posted a similar comment on the Jira issue.
Another interesting idea I was just thinking about would be to add
uniqueness constraints in the database on lower(doc.fullName) for instance.
That would cover the creation step.
Regarding performance, I have found in some quick searches that uniqueness
constraints are also achievable through unique function-based indexes [1].
In both cases, we would have to investigate database support, or
workarounds, for using a function like lower() in a unique constraint or
a unique index.
----------
[1]
http://stackoverflow.com/questions/3944840/create-unqiue-case-insensitive-c…
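As a rough illustration of the unique function-based index idea, here is a sketch using SQLite through Python's sqlite3 module. SQLite is only a stand-in here: the one-column `doc` table is a simplification, the index name is made up, and each database XWiki supports would need its own check for this feature (SQLite itself needs version 3.9.0 or later for indexes on expressions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema: only the column discussed in the thread.
conn.execute("CREATE TABLE doc (fullName TEXT NOT NULL)")
# Unique index on an expression: rows whose lowercased names match collide.
conn.execute("CREATE UNIQUE INDEX doc_fullname_ci ON doc (lower(fullName))")

conn.execute("INSERT INTO doc (fullName) VALUES (?)", ("Main.WebHome",))
try:
    conn.execute("INSERT INTO doc (fullName) VALUES (?)", ("main.webhome",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The second name differs only by case, so the index rejects it.
    duplicate_rejected = True

# The surviving row keeps the case the user entered.
stored = [row[0] for row in conn.execute("SELECT fullName FROM doc")]
print(duplicate_rejected, stored)  # prints True ['Main.WebHome']
```

The same index can also serve lookups that filter on lower(fullName), so it would cover both the uniqueness and the performance concern, at least on databases that support it.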
Second, we have lots of places that query the database and since we have to
store the raw case-sensitive values then we need to update all these places.
Moreover, since it's not about a single field/column I'm not sure we can
write a query filter to lower the case automatically. Then we also have a
lot of extensions that query the database and that create entities. Those
will have to be updated too.
Not sure a filter would be the solution, since queries should be
application-specific.
Lastly, AFAIK lower case and upper case are locale dependent. The 'lower'
query function doesn't have a locale parameter so it depends on the locale
the database has been configured with. So there can be cases when a user
won't be able to retrieve an entity using some locale dependent lowercase
version of the reference because the database computes the lower case
differently than what the user expects (because it uses a different
locale).
Well, as long as we do something like lower(myValue) = lower(dbValue)
instead of myValue = lower(dbValue), we should be good, right? (i.e. apply
the same database collation/logic on both the searched value and the db
value so that we never get a mismatch).
Not really. The problem is that lower() is not going to do its job if it
does not run in the right locale. The query might then conclude that two
values are not equal when a lower() that did its job properly would have
made them equal. On the XWiki side we will apply the locale in
String.toLowerCase and get a different result, which means you can have
cases where two document names are not equal in the query but are equal
when you compare the two DocumentReference objects, and that might create
many issues.
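A small Python sketch of that disagreement, under stated assumptions: both functions below are hypothetical stand-ins for two lowercasing rule sets (a locale-independent one, and a Turkish-style one where 'I' maps to dotless 'ı'), and which side plays the database and which plays the application is left open.

```python
def lower_root(value):
    # Stand-in for locale-independent lowercasing rules.
    return value.lower()


def lower_turkish(value):
    # Stand-in for Turkish-locale rules: 'I' -> dotless 'ı', 'İ' -> 'i'.
    return value.replace("I", "ı").replace("İ", "i").lower()


stored = "WIKI.Home"
searched = "wiki.home"

# Each side applies its own rule to both values, so each is self-consistent,
# yet the two sides reach opposite answers for the same pair of names.
equal_with_root_rules = lower_root(stored) == lower_root(searched)
equal_with_turkish_rules = lower_turkish(stored) == lower_turkish(searched)

print(equal_with_root_rules, equal_with_turkish_rules)  # prints True False
```

So applying lower() on both sides of the comparison keeps the query consistent with itself, but not with code that compares the same names under another locale.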