On 11/10/2010 04:25 PM, Paul Libbrecht wrote:
Caleb,
I think that any performance-oriented comparison of keys will use indices.
And in this case, the comparison between strings (even fairly long) and binary bits has
the exact same performance.
I remember reading this which said that UUIDs should be stored as binary rather than
varchar.
http://kekoav.com/blog/36-computers/58-uuids-as-primary-keys-in-mysql.html
It appears that inserts can be given as SET id=0x1e8ef774581c102cbcfef1ab81872213
and (in mysql) can be returned in human readable form using SELECT HEX(id) FROM...
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_hex
Regarding indeces, my understanding is an index makes the column read as if it were
sequential thus
allowing for a binary search. Still a binary search means the number of load and compare
operations
is log2(length of table) so in a million entry table, 20 lookups are needed to find an
entry (if the
index is up to date). Perhaps I am splitting hairs on this, I know premature optimization
is often a
fool's errand.
Another issue which is more difficult to measure is memory bounding. If a primary key
column can be
loaded into the processor cache then lookups will be orders of magnitude faster because
loading from
ram memory can take around 500 processor cycles. Unfortunately this kind of thing is
rather magical
unless one is programming in assembly.
Those are my points, you are a user so I'm most interested to hear how you would
benefit from
storage as a string.
Caleb
paul
On 10 nov. 2010, at 16:50, Caleb James DeLisle wrote:
Is there
any reason to use a varbinary instead of a simple string?
At least string would be easier to export and develop with.
The reason is because varbinary will occupy 16 bytes while a string will occupy 36. This
becomes
important when the database has to find an entry, This will require comparison of the
entries and 16
byte entries will compare about twice as fast as 36 byte entries. If UUIDs are used for
everything
then it's conceivable to have in a document with 10 objects each having 5 properties,
100 lookups
per getDocument. So it's a question of performance vs ease of use (when accessing the
database via
sql).
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs