On 11/01/2011 11:07 PM, Yang Li wrote:
On 2011/11/2 10:46, Sergiu Dumitriu wrote:
Hey community,
I spent some time trying to make PDF export work for CJK (Chinese,
Japanese, Korean) characters, and managed to get it working quite well.
Searching for some good open source fonts, I finally decided on the
following:
- CJK Unifonts (Linux re-packaging of the Arphic fonts)
- IPAGothic
- Baekmuk
The first one comes in two variants, serif (a.k.a. ming or song) and
script (regular script, kai), and has good support for Chinese, with
good, but not complete, support for Japanese, and no support for
Korean. It looks very good in both variants, but we should decide on
one of them. I uploaded samples on
http://jira.xwiki.org/browse/XWIKI-7106 to see how they would look.
***
Q1: Should Kai or Ming be used as the default export font for Chinese?
***
I'm far from being an expert here, but my opinion is that the Kai
variant, with its handwritten look, is better suited for printed
material. Still, PDFs are also used on screen, be that a large
computer monitor or a handheld device, and on screen the legibility of
the Ming variant is better. One option that I like is to use Kai for
normal text and Ming for tt/code elements, as a kind of monospace.
As a Chinese user, I strongly recommend *song*, the most popular font, and we
seldom use ming in official documents...
Reading on Wikipedia
http://en.wikipedia.org/wiki/Ming_%28typefaces%29 I
got the impression that there isn't a clear distinction between Ming and
Song, and some refer to the same thing with both terms. Looking at the
list of CJK fonts
http://en.wikipedia.org/wiki/List_of_CJK_fonts none of
the fonts that have Song in their name are under an open source friendly
license, so they can't be redistributed. Please take a look at the
sample PDF and see if it is acceptably similar to Song:
http://jira.xwiki.org/secure/attachment/23886/ming-over-freefont.pdf
The second font, IPAGothic, is centered on Japanese, so it has good
support for Japanese, some support for Chinese, and no support for
Korean. It is a sans-serif variant.
The third font, Baekmuk, brings support for Korean (lacking from the
other two fonts), along with limited support for some Chinese and
Japanese characters. This one comes in more variants, but only two are
complete enough to be considered, Batang as the serif equivalent, and
Gulim as the sans-serif equivalent.
***
Q2: Should Batang or Gulim be used for Korean?
***
My opinion is that the serif variant looks better in print, although
it is less readable. Still, I've seen Gulim much more often used in
practice. I attached two samples for this as well to the Jira issue.
***
Q3: Should the current FreeSerif font be used for non-CJK characters,
or the font face defined in the font specific to each language?
***
While I prefer FreeSerif for all English text, I've seen in practice
that the preferred solution is to use a bulkier font for numbers and
Latin characters.
FreeSerif is good.
OK, noted.
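
To make the Q3 trade-off concrete, here is a rough sketch of the "FreeSerif for non-CJK, a dedicated font per script" option. It only illustrates the idea and is not how the export actually selects fonts. The font names come from this thread, but the mapping itself, the deliberately incomplete code point ranges, and the helper names (script_of, font_for, split_runs) are assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical script-to-font mapping. The thread has not settled on
# Kai vs. Ming or Batang vs. Gulim, so only font family names are used.
FONT_BY_SCRIPT = {
    "han": "CJK Unifonts",
    "kana": "IPAGothic",
    "hangul": "Baekmuk",
    "other": "FreeSerif",
}

# (first, last, script) code point ranges. Deliberately incomplete:
# extension blocks, punctuation and compatibility forms are left out.
SCRIPT_RANGES = [
    (0x1100, 0x11FF, "hangul"),  # Hangul Jamo
    (0x3040, 0x30FF, "kana"),    # Hiragana and Katakana
    (0x3400, 0x4DBF, "han"),     # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF, "han"),     # CJK Unified Ideographs
    (0xAC00, 0xD7AF, "hangul"),  # Hangul Syllables
]


def script_of(ch):
    """Return the script label for a single character."""
    cp = ord(ch)
    for first, last, script in SCRIPT_RANGES:
        if first <= cp <= last:
            return script
    return "other"


def font_for(ch):
    """Return the font family that would render this character."""
    return FONT_BY_SCRIPT[script_of(ch)]


def split_runs(text):
    """Group consecutive characters that share a font into (font, run) pairs."""
    return [(font, "".join(chars)) for font, chars in groupby(text, key=font_for)]
```

Calling split_runs on a mixed string yields one run per font change, which is roughly the information a renderer would need to switch fonts in the middle of a line. Note that Han characters are shared between Chinese and Japanese, so a real solution would also need the document language to pick between CJK Unifonts and IPAGothic.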
***
Q4: Does italics/oblique make sense for CJK characters?
***
The concept of italics is defined only for Latin-like characters, and
no font provides support for italic CJK. Still, Firefox does render
slanted characters for CJK text inside <em>. FOP, the rendering engine
used for generating PDFs, does not have support for automatically
slanting fonts that don't provide an italics variant, and will insist
on choosing a font that comes in an italics variant. So, this means
that by default any text that is emphasized in the wiki will not be
displayed correctly in the PDF (it would appear as # characters).
There is a simple solution, and that is to alter the font file so that
it says that both the regular and italic versions of the font are in
the file. Another option is to actually provide an oblique version of
the font, which FontForge seems to be able to do quickly and with good
results. Still, this will double the size of the fonts, so I'd rather
not provide italic fonts if they don't actually make much sense for
native CJK users.
In fact, Chinese people use bold font to emphasize (hei), not italics,
and we seldom use italics.
OK, so this means that italics doesn't make sense, which is good.
The bad news is that FOP doesn't support synthesizing bold characters
when the font has no bold variant, either, but in this case it won't
fall back to a font that does provide bold. This means that bold text
will appear the same way as regular CJK text.
I tried to generate a bold font from FontForge, but it fails with an
error message: "some fragments did not join". So, our hope is that FOP
will implement this feature soon.
Some other fonts that I looked at were:
* the Droid font used in Android devices, which is a sans-serif font
IMO not suited for print; its advantage would be that it provides a
uniform look for all CJK languages, less good looking, but more legible
* the Hanazono font, which has impressive support for all the
characters in CJK Unicode sets, but was created in a wiki way, so IMO
it's not very consistent throughout the whole spectrum, and not as
aesthetically pleasing as the others
***
Q5: Should a less good looking, but smaller and more consistent font
be used? If yes, which one?
***
The Droid font is actually quite small compared to the others, and on
smaller font sizes it is more readable.
I prefer normal fonts, because nowadays we usually use a browser and
a large display.
I would really appreciate some feedback on this topic.
--
Sergiu Dumitriu
http://purl.org/net/sergiu/