[xwiki-devs] The danger of API proliferation. - xwiki-devs@xwiki.org

7 May 2014

Hi folks,
I saw an issue to rename a method ( http://jira.xwiki.org/browse/XWIKI-10311 )
in XWikiContext. I don't want to single out the author: the commit is not an
outlier. On the contrary, it represents a pattern, and this worries me.
I've seen this same pattern in Microsoft technology: while each decision makes
sense in isolation, the sum of all the changes ended up making Windows an
unapproachable API garbage heap, which may be one reason why everyone left the
desktop to develop for the web.
Take for example the humble printf() function, standardized in ISO C and POSIX.
At some point somebody at Microsoft noticed all of the security problems with
C programs and decided to make a set of "hardened" functions which would be more
secure. Some of them had excellent security features but others did just a little
extra checking; each had the suffix "_s" added to the function name.
Because it's not feasible to do many other checks, printf_s() mostly just checks
whether its inputs are NULL and, if so, raises an error. Developers who had been
using printf() were told that they were writing bad, insecure code and had to
begin using printf_s().
Windows was being used in many countries with different alphabets, and at the time
the only way to represent different languages was to use the high 128 characters
left undefined by the ASCII standard. Since each language needed a different set
of 128 characters, Windows needed to know which language a string should be
interpreted in, so somebody invented printf_l(), which took an extra parameter:
the *locale*. Of course printf_s_l() followed, because people still need to write
secure code!
Eventually this silly idea of locales and code pages was replaced with Unicode
(specifically UTF-16). Unicode characters, being 16 bits wide, needed to be handled
differently from their 8-bit cousins, so a new set of functions was written with
a "w" prefix: wprintf() and wprintf_s(). To make porting easier for programs
which had been written to use the "_l" functions, they also added wprintf_l()
and wprintf_s_l(), even though these should not have been strictly necessary.
Unfortunately, after everyone had rewritten their programs to use the new and
improved wchar_t and the w* functions, someone realized that not all computers
even support Unicode! So they rushed to implement a new character type called
TCHAR, which is a wide character if and only if the build targets a system that
supports it. When TCHAR was invented, they had 4 printf functions to port, and so
came tprintf(), tprintf_s(), tprintf_l() and tprintf_s_l().
Then some smart folks at Bell Labs took another look at this Unicode idea and
realized that with clever encoding, one could make a character representation
which uses 1 byte to represent ASCII letters and more bytes only when it needs to
represent other alphabets. Thus UTF-8 was born. The best part was that it was
fully backward compatible with ASCII, so you could use it with normal printf()!
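To make the variable-width idea concrete, here is a minimal sketch in Python (my choice of language for illustration, not something from the original discussion). The sample characters and the helper names utf8_lengths() and ascii_is_unchanged() are hypothetical; the only library call assumed is the built-in str.encode().

```python
# Sketch: how many bytes UTF-8 spends per character.
# The sample characters are arbitrary picks, written as escapes so the
# source file itself stays pure ASCII.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]


def utf8_lengths(chars):
    """Return a list of (code point, number of UTF-8 bytes) pairs."""
    return [(ord(ch), len(ch.encode("utf-8"))) for ch in chars]


def ascii_is_unchanged(text):
    """True if pure-ASCII text has the same bytes in ASCII and in UTF-8."""
    return text.encode("utf-8") == text.encode("ascii")


for code_point, n_bytes in utf8_lengths(samples):
    print(f"U+{code_point:04X} takes {n_bytes} byte(s)")

# ASCII text is byte-for-byte identical, which is why byte-oriented
# functions like printf() can pass UTF-8 through untouched.
print(ascii_is_unchanged("printf() still works"))
```

The design point is that an ASCII-only string and its UTF-8 encoding are the same bytes, so existing byte-oriented code needs no new function family to carry it.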
In the Linux world, most of this drama just never occurred. People kept on using
printf() as if the rest of the world never existed, and when UTF-8 was devised,
Linux programs could suddenly speak Chinese. Microsoft, apparently ashamed of
their 11 printf functions but still unable to take a lesson, deprecated them all
and went on to create .NET, which standardized around Console.WriteLine() (for now).
While Linux and friends now all transparently support UTF-8 and its alphabets,
Windows is still unable to support Unicode filenames without first converting the
string into wide character format and then using the old 'w' prefixed functions.
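Even then, UTF-16 needs two 16-bit units (a surrogate pair) for any character
beyond the first 65,536 code points, and code that assumes one unit per character
breaks on them, so Windows *still* has a hard time with some languages!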
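As a rough illustration of the conversion step and of where 16-bit units get awkward, here is a small Python sketch. The helper names to_wide() and utf16_code_units() are hypothetical, and "utf-16-le" is used on the assumption that it matches the in-memory layout of a Windows wide string (without the terminating NUL).

```python
def to_wide(utf8_bytes):
    """Decode UTF-8 bytes and re-encode them as UTF-16LE.

    This mimics the conversion a program has to do before handing a
    name to a 'w'-prefixed function; no terminating NUL is appended.
    """
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


def utf16_code_units(ch):
    """Number of 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-le")) // 2


# Even a plain ASCII name doubles in size once it is made "wide".
print(len(b"a.txt"), len(to_wide(b"a.txt")))

# A character inside the first 65,536 code points fits in one unit,
# while one outside that range needs two (a surrogate pair).
print(utf16_code_units("\u4e2d"), utf16_code_units("\U0001F600"))
```

Code that treats each 16-bit unit as one character miscounts or splits the second kind of character, which is one way the wide-character route still goes wrong.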
The problem with all this is that you lose coherency and the community fragments.
If the scary plethora of methods is not enough to scare off new developers, there
are the angry old developers standing by to haze them for using
insecure/unportable/deprecated functions instead of the shiny new ones. I think we
should think a lot harder before either deprecating a method or adding duplicate
functionality.
Anyway my 2¢
Thanks,
Caleb