|
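As things stand, it looks fairly unlikely that Mandrake Linux 9.1 will support GB18030, unless MandrakeSoft's finances suddenly improve, a very capable Chinese developer joins, or a new XFree86 release supports GB18030. So far there is not a single free (copyleft) GB18030 font, and the standard edition of Mandrake Linux obviously ships only free software. Supporting GB18030 is also technically difficult.
For reference, see Su Zhe's (苏哲) article "Implementing GB18030 Support in the XFree86 Window System":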
http://www-900.ibm.com/developerWor...rt1/index.shtml
http://www-900.ibm.com/developerWor...rt2/index.shtml
It may also be worth studying how TurboLinux's GB18030 patches for XFree86 could be applied to Mandrake Linux. They are at:
http://pkgcvs.turbolinux.co.jp/cgi-....0/?hideattic=0
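The message below is an earlier discussion of Chinese character sets from the Mandrake i18n (internationalization) development group. Danny Zeng, who currently does most of the Chinese development, has not been active for a long time, so the outlook for Chinese support in Mandrake is worrying.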
From: Pablo Saratxaga
Subject: Re: [i18n] Chinese encoding questions
Date: Tue, 27 Aug 2002 13:12:41 -0700 (PDT)
--------------------------------------------------------------------------------
Kaixo!
Li Wed, 21 Aug 2002 16:48:25 +0800,
"Danny Zeng" <Danny.Zeng@synopsys.com> scrijheut:
"Z> Can someone explain the differences of all the encoding used in Chinese
"Z> locale(s)? And what is the status of their support in Mandrake?
"Z> There is already translation teams for Simplified Chinese
"Z> (gb2312 encoding) and Traditional Chinese (Big5 encoding).
No.
There are translation teams for *simplified* and *traditional* chinese.
gb2312 and Big5 just happen to be the most used encodings, but it could
be UTF-8 or EUC-TW or whatever known by iconv.
gettext() is able to do the conversion.
Note that both KDE3 and Gnome2 convert to UTF-8 to use internally.
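
As a rough illustration of the conversion step described above, the following Python sketch mimics what an iconv-backed gettext() does conceptually: decode the catalog's bytes from the charset declared in the .po file, then re-encode them for the charset of the user's locale. The `transcode` helper and the sample message are hypothetical, and Python's built-in codecs stand in for iconv here.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes from one charset to another, going through Unicode."""
    return data.decode(src).encode(dst)


# Hypothetical translated message, stored as it would be in a GB2312 catalog.
msg = "文件"
stored = msg.encode("gb2312")

# The same translation delivered to locales that use other encodings.
as_utf8 = transcode(stored, "gb2312", "utf-8")
as_gb18030 = transcode(stored, "gb2312", "gb18030")

print(as_utf8.decode("utf-8") == msg)
print(as_gb18030.decode("gb18030") == msg)
```

The translator only writes the text once; which byte encoding reaches the user is decided at run time by the locale.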
"Z> It will become too complicated, if we need to translate for all encoding
"Z> supported separately, like gb18030 and GBK for Simplified,
"Z> and new UTF-8 for Traditional Chinese.
No, it is not needed.
In fact gettext() is even able to transcode between simplified and traditional
chinese, eg if you have your locale set to LC_CTYPE=zh_CN.GB2312 and
your LANGUAGE=zh_TW then you will see the traditional chinese translation
in simplified chars...
But I've been told it wasn't always desirable, as there are differences
in terminology that make those transcoded translations look unnatural.
Also, I'm not sure it still works now that a single encoding (UTF-8) starts
to be used internally almost everywhere.
So, the reason there are separate "zh_CN" and "zh_TW" po files is that, although
they are both "Chinese", there are differences in writing that justify it;
similarly, there are also "pt" and "pt_BR", as Brazilian Portuguese has
some differences in terminology, in particular in the computer field (and
computer-related words appear a lot in *.po files).
"Z> I understand that GB18030 is a superset of gb2312 and the current
"Z> standard supported by most OS's, and it obsolete GBK.
GB18030 is a standard published by the government of the PRC, and it
is a bijection with Unicode (that is, every defined GB18030 code
has a corresponding Unicode code point, and vice versa).
It is Unicode reordered in such a way that all GB2312 codes happen
to be in the same place in GB2312 and GB18030.
GB18030 is sometimes called the "Chinese UTF-8": in the same way that UTF-8
is a way to encode Unicode while remaining ASCII compatible, GB18030 is a way
to encode Unicode while remaining GB2312 compatible.
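
The compatibility property can be checked with a short Python sketch, assuming Python's built-in "gb2312" and "gb18030" codecs are a fair stand-in for the iconv tables. The sample strings are arbitrary choices made for illustration.

```python
# Simplified Chinese text whose characters are all present in GB2312.
simplified = "中文编码"

# GB2312 characters keep the same byte codes in GB18030.
same_bytes = simplified.encode("gb2312") == simplified.encode("gb18030")

# ASCII is left untouched as well.
ascii_same = "po file".encode("gb18030") == b"po file"

# Characters outside GB2312 (traditional Chinese, Thai, an emoji) still
# encode in GB18030 and survive a round trip.
mixed = "中文 國 ก 😀"
round_trip = mixed.encode("gb18030").decode("gb18030") == mixed

# Byte length per character: 1 for ASCII, 2 for GB2312 hanzi,
# 4 for code points outside the two-byte area.
lengths = {ch: len(ch.encode("gb18030")) for ch in "a中ก😀"}

print(same_bytes, ascii_same, round_trip)
print(lengths)
```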
Now, with current (eg Gnome2, KDE3) and future programs the encoding
used by the locale will be mostly irrelevant, as UTF-8 will be used
internally and it will be perfectly transparent to the user.
Non-UTF-8 locales will be kept only for compatibility, to make the transition
easy.
Currently, there still are several programs that rely on XFree86 locales
mechanism for font choosing (instead of using UTF-8 internally and using
Xft for fonts), and the way XFree86 X11 fonts handling is done makes
it hard, to say the least, to switch to UTF-8 in all cases, as X11 fonts
are supposed to be complete, which for a locale covering the Unicode range
(be it an UTF-8 locale or a GB18030 locale) is almost impossible with
an acceptable quality.
"Z> Is GB18030 also a superset of Big5?
In a sense yes, as it includes all characters defined in Big5 too.
But not at the same place; so users of Big5 will probably prefer to go
directly to UTF-8 instead of passing through gb18030.
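
A small Python sketch of that point, again assuming Python's built-in codecs as a stand-in for iconv and using an arbitrary sample word: the traditional characters exist in both Big5 and GB18030, but their byte codes differ, so Big5 data has to be converted rather than simply relabelled.

```python
traditional = "國語"

big5_bytes = traditional.encode("big5")
gb18030_bytes = traditional.encode("gb18030")

# Same characters, different byte codes.
codes_differ = big5_bytes != gb18030_bytes

# Reading Big5 bytes as if they were GB18030 does not give the original text.
misread = big5_bytes.decode("gb18030", errors="replace")
relabel_fails = misread != traditional

# A real conversion goes through Unicode and preserves the text.
converted = big5_bytes.decode("big5").encode("gb18030")
conversion_ok = converted.decode("gb18030") == traditional

print(codes_differ, relabel_fails, conversion_ok)
```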
"Z> Can I change to GB18030 and still be able to edit my old files
"Z> that use GB2312?
Yes, as all GB2312 codes are at the same place in gb18030.
GB18030 has been designed to allow just that.
If you use a tool to handle *.po files it will probably also call iconv
to convert to UTF-8 internally, so it will be completely irrelevant what you
put in the charset= line, as long as it is valid (and includes the characters
you want to type, of course).
On the other hand, programs and tools that don't yet use Xft for font
handling will have a hard time to be able to display it, as X11 still
mostly knows only about *-gb2312.1980 for X11 fonts.
"Z> On the other hand, files in GB18030 encoding may not work on a machine
"Z> support only gb2312, I guess.
If the machine has a gettext implementation using iconv() and has
a mapping table from GB18030 -> GB2312 (which is the case of all modern
GNU/Linux systems) there will be no problem.
In case the "GB18030" encoding is unknown on a given machine, it will
probably be displayed "as is"; so, if the encoding is "GB18030" and
the locale is "GB2312" and it is displayed "as is", it will display nicely
for all chars present in gb2312, only those not in gb2312 will display
as junk.
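
Both cases can be sketched in Python, with the built-in codecs standing in for iconv and for a GB2312-only display. The sample text is an arbitrary choice: two characters that exist in GB2312 followed by one traditional character that does not.

```python
text = "中国國"  # the last character is not part of GB2312
data = text.encode("gb18030")

# Case 1: a proper conversion to GB2312. Characters without a GB2312 code
# cannot be represented; here they are replaced by "?".
converted = text.encode("gb2312", errors="replace")

# Case 2: the GB18030 bytes are shown "as is" by something that only
# understands GB2312. The shared characters come out right, the rest is junk
# (U+FFFD replacement characters in this sketch).
shown_as_is = data.decode("gb2312", errors="replace")

print(converted)
print(shown_as_is)
```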
"Z> Many users like me need to read both traditional Chinese and simplified.
"Z> It looks both GB18030 and UTF-8 promised to handle all the Chinese
"Z> characters, and Japanese and Korean. How do they really work?
Currently it depends on the applications.
For old applications, using X11 fonts, it will probably not work very well;
probably UTF-8 will work better than GB18030.
With newer programs (Gnome2, KDE3, yudit,...) it works perfectly (as long
as you have a font with the needed chars).
I ensured that in upcoming mdk the pseudo-font aliases "sans", "serif" and
"mono" will include the traditional chinese, simplified chinese and
korean chars (for japanese I couldn't, due to the fonts we ship not being
recognized by Xft; but if MS Mincho or Code2000 is installed it will work too)
"Z> Can I input and display all the supported characters in any application?
"Z> Or, how easy is it to add support for this feature in all
"Z> the applications?
Input is however a problem.
Currently the X11 input framework doesn't allow easily switching your input
method (it is your program that has to handle it, and most don't; I only
knew of "yudit" with that possibility in fact).
Maybe it would be possible with Gtk2 programs ? By changing to another,
gtk-specific input method, changing your XIM, then switching back to the
X input method in the Gtk input method menu...
"Chinput" and "xcin" both support traditional and simplified, but I don't
think you can switch from one to the other at runtime. (If I'm wrong on this
please tell me)
"Z> Anyone out there is running Chinese desktop using UTF-8, how does it work?
It works better than Greek or Russian in UTF-8, in fact.
The main problem is the fonts for old programs not using Xft yet.
And for CJK languages XFree86 just uses legacy encoded fonts (eg the
fonts *-KSC5601.1987-0, *-JISX0208.1983-0, *-GB2312.1980-0) to display
CJK chars.
For modern programs (Gnome2 , KDE3) it works perfectly (but you are limited
to one XIM input method currently)
"Z> Regards,
"Z> Danny
--
Ki ça vos våye bén,
Pablo Saratxaga
Le galline terrorizzate fanno le uova alla choc?
-- Da it.hobby.umorismo |
|