XTerm
I am working on internationalization (i18n) improvements to
XTerm, which is included
in the distribution of XFree86
and is the most widely used terminal emulator for the X Window System
in the world.
"Locale" is a mechanism
of ISO C and
UNIX
that consistently specifies the behavior of software
with respect to the languages, customs, and cultures of the world.
For example, error messages should be shown in the language that
the locale specifies. There are several locale categories, each of
which relates to a certain field of culture or software feature.
The LC_CTYPE category is the one that specifies the encoding to be used
for all stream I/O. Thus, software must output messages in
ISO-8859-1 in an ISO-8859-1 locale, EUC-JP in an EUC-JP locale, UTF-8
in a UTF-8 locale, KOI8-R in a KOI8-R locale, and so on.
Note that support for UTF-8 should also be implemented within
the locale framework, i.e., software should use UTF-8
if the LC_CTYPE locale calls for it. In other words, all that
users who want UTF-8 should have to do is set the LANG
variable to *.UTF-8 .
Software that uses an encoding different from the one
specified by the current locale should be regarded as buggy.
Thus, my work here should be regarded as bugfixes rather than
improvements.
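As a rough illustration of the rule above, the following Python sketch shows how the locale that governs LC_CTYPE can be derived from the environment and how its codeset part can be read off. It follows the POSIX precedence (LC_ALL, then LC_CTYPE, then LANG). The function names effective_ctype_locale and codeset_of are made up for this example; a real program would simply call setlocale(LC_CTYPE, "") and nl_langinfo(CODESET) and let the C library decide.

```python
def effective_ctype_locale(env):
    """Return the locale name that governs LC_CTYPE for the given environment.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Empty values count as unset; with nothing set, the locale is "C".
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


def codeset_of(locale_name):
    """Extract the codeset from 'language[_territory][.codeset][@modifier]'.

    Returns None when the name carries no explicit codeset (e.g. "C" or
    "ja_JP"); in that case the system's locale database decides the encoding.
    """
    base = locale_name.split("@", 1)[0]   # drop an optional @modifier
    if "." not in base:
        return None
    return base.split(".", 1)[1]


# A user who wants UTF-8 only has to set LANG:
print(codeset_of(effective_ctype_locale({"LANG": "ja_JP.UTF-8"})))  # UTF-8
```

The sketch takes the environment as a dictionary so that it can be tried without touching the real process environment; passing os.environ works as well.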
There is much properly internationalized software that obeys
the LC_CTYPE locale for all stream I/O. For such well-behaved software
to work well, terminal emulators have to properly display the
messages that the software sends to the terminal. This is why
terminal emulators have to obey the LC_CTYPE locale to determine
the encoding to be used.
The following work relates mainly to this point.
- (2002-09-15) Although internationalization (i.e. LC_CTYPE
locale sensitivity) was almost finished with the 2002-08-17 patch,
automatic font selection was not yet implemented. That is,
when XTerm automatically uses UTF-8 mode (the locale-sensitive
mode that uses luit also uses UTF-8 mode internally),
*-iso10646-1 fonts should be used automatically instead of 8-bit fonts.
However, since XTerm could not hold separate configurations
for the conventional 8-bit mode and UTF-8 mode, such
automation was impossible.
Thus I wrote this
patch. It enables XTerm to have a font setting for each of the 8-bit and
UTF-8 modes. The fonts for UTF-8 mode are used automatically when XTerm
uses UTF-8 mode.
download:
XTerm (cvs 20020817),
font patch.
- (2002-08-17) My 2002-07-18 patch was integrated into the CVS
repository of XFree86. Now you can use locale sensitivity
without any of my patches.
- (2002-07-18) I submitted a patch
for xterm to patch@xfree86 and received sequence
number 5328. Now all I have to do is wait for this patch to
be processed. Then we will be able to use various encodings in XTerm!
As luit improves, XTerm will support more encodings. (For example,
TCVN, GBK, and Shift_JIS will be supported by using the 2002-07-04
patch.)
To try this patch, you will also need to add
--enable-wide-chars when you invoke ./configure.
You don't need this if you use xmkmf.
Please use the *-iso10646 fonts that are distributed with XFree86.
If you use
-misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1,
XTerm will automatically detect
-misc-fixed-medium-r-normal-ja-13-120-75-75-c-120-iso10646-1
for East Asian doublewidth characters.
download:
XTerm (cvs 20020513),
patch.
- (2002-07-04) A new
patch for luit was posted to the XFree86 i18n mailing list.
It supports GBK, Shift_JIS, and UTF-8.
download:
luit (cvs 20020701)
(required),
UTF-8 patch
(required),
fontenc patch
(for XFree86 4.1 or before).
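The entries above rely on luit to translate between the encoding of the locale and the UTF-8 that XTerm uses internally. The Python sketch below illustrates that idea only: it is not how luit is implemented, the class name LocaleFilter is invented for this example, and the conversion tables come from Python's own codecs ("shift_jis", "gbk", "koi8_r") rather than from XFree86.

```python
import codecs


class LocaleFilter:
    """Convert between a legacy locale encoding and UTF-8, in both directions."""

    def __init__(self, encoding):
        # Incremental decoders keep state, so a multibyte character that is
        # split across two reads is still decoded correctly.
        self._from_app = codecs.getincrementaldecoder(encoding)("replace")
        self._from_term = codecs.getincrementaldecoder("utf-8")("replace")
        self._encoding = encoding

    def app_to_terminal(self, data):
        """Bytes written by the application -> UTF-8 bytes for the terminal."""
        return self._from_app.decode(data).encode("utf-8")

    def terminal_to_app(self, data):
        """UTF-8 bytes typed at the terminal -> bytes in the locale encoding."""
        text = self._from_term.decode(data)
        return text.encode(self._encoding, errors="replace")


# An application running in a Shift_JIS locale writes Shift_JIS bytes;
# the terminal side receives UTF-8.
sjis = LocaleFilter("shift_jis")
raw = "日本語".encode("shift_jis")
print(sjis.app_to_terminal(raw).decode("utf-8"))  # 日本語
```

Characters that cannot be represented in the target encoding are replaced rather than raising an error, which is one possible policy for a terminal filter and an assumption of this sketch.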
Old Information
- (2002-06-07) Based on the luit improvements of 2002-06-05,
I sent a mail with an XTerm patch to the
XFree86
i18n mailing list and the linux-utf8
mailing list.
I will send the patch to the formal patch submission address of XFree86
(patch@xfree86) after discussion on these mailing lists.
download:
the patch.
- (2002-05-14) Updated the patch for XTerm to call luit to
support various encodings (such as ISO-8859-2,3,11,15, KOI8-R,
EUC-JP, BIG5, etc). This patch is an update of my 2002-02-03
patch to catch up with the upstream XTerm.
In the EUC-JP locale (and other East Asian locales),
it has the problem that many characters (mainly symbols)
which should be doublewidth are displayed as singlewidth.
To try locale sensitivity, you will need the resource setting
"XTerm*Locale: true". You will also need to add
--enable-wide-chars when you invoke ./configure.
download:
XTerm (cvs 20020513),
patch to invoke luit.
- (2002-06-05) luit in the
CVS tree of XFree86 was
improved, and the "misc bugfix/improvement patch" of 2002-05-13 is no longer needed.
download:
luit (cvs 20020606)
(required),
fontenc patch
(for XFree86 4.1 or before).
- (2002-05-13) luit in the
CVS tree of XFree86 is
newer than version 0.8.1, which is downloadable from
Juliusz's
page.
However, you will need
the "misc bugfix/improvement patch". This patch is needed to use
luit from XTerm, because it solves a problem with the "--" command
line option.
This patch is a result of the discussion on the XFree86 i18n mailing list
in
Feb 2002
and
here
is a summary. (I resubmitted the patch with sequence number #5279.)
Note that the CVS version of luit requires XFree86 4.2 or later to
be compiled. If you have an earlier version of XFree86, you can
use the "fontenc patch".
download:
luit (cvs 20020513)
(required),
misc bugfix/improvement patch
(required),
fontenc patch
(for XFree86 4.1 or before).
- I submitted a patch
for xterm to obey the LC_CTYPE locale to determine the encoding, by
calling
luit
internally. You have to configure a Unicode font and set the
"XTerm*Locale: true" resource
to try this feature. You also have to use the --enable-wide-chars
option for ./configure.
- XFree86 4.2 was released on
20 Jan 2002. This includes
luit,
a small program that converts between UTF-8 and various other encodings.
Note: you need XFree86 4.2 to compile the luit that is
included in XFree86 4.2 (or CVS). Alternatively, you can
download luit from the above page written by
Juliusz Chroboczek;
that version can be compiled standalone.
XFree86 4.2 also includes many bugfixes related to multibyte encodings.
- XTerm-158 was released on
8 Sep 2001. This version supports the OverTheSpot preedit type of XIM.
The XFree86 CVS version fixed
a bug in the XIM support of XTerm-158. You will need this fix to input Korean
using the Korean XIM server
Ami. OverTheSpot
is a convenient preedit type for Chinese and Japanese as well.
- I submitted the
xterm-156-overthespot2 patch
on 14 June 2001. This patch adds OverTheSpot preedit type support to
XTerm and uses Xutf8LookupString() to receive the XIM string as UTF-8, the
internal encoding of XTerm. Note that this patch is not related to the previous
work by Robert Brady and myself.
Though this patch has far fewer features
than my previous patches, I think this approach will help earlier integration
of minimal XIM and CJK language support into the official XTerm.
In particular, the previous patches include
fribidi, which is
distributed under the LGPL and therefore causes a license problem.
You will need XFree86 4.x with the
cstomb
fix patch.
[Download xterm-156.tar.gz from
this site]
- Robert Brady released
xterm-152-27
on 23 March 2001. This includes all my patches for xterm-150-23 (see
below).
[Download xterm-152.tar.gz from
this site]
[Download xterm-152-27.diff.gz
from this site]
Much Older Information
My work is based on:
To try internationalization, invoke ./configure with --enable-wide-chars.
You can use fonts of
-misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1
(for normalwidth characters)
and
-misc-fixed-medium-r-normal-ja-13-120-75-75-c-120-iso10646-1
(for doublewidth characters) which are included in XFree86 4.0.
Please type
xterm -fn -misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1
and XTerm can automatically detect
-misc-fixed-medium-r-normal-ja-13-120-75-75-c-120-iso10646-1
font for doublewidth characters.
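Which of the two fonts is used for a given character depends on how many columns the character occupies. The following Python sketch shows one way to make that decision from the East Asian Width property in Python's Unicode database. It is a simplification and not the logic of XTerm's wcwidth.c (control characters and other special cases are ignored). The mode names "unicode" and "cjk" are borrowed from the widthMode parameters listed further down this page, and the function names are invented for the example.

```python
import unicodedata


def char_width(ch, mode="unicode"):
    """Return the number of terminal columns for one character.

    mode="unicode": only Wide and Fullwidth characters take two columns.
    mode="cjk":     East Asian Ambiguous characters (many symbols, Greek,
                    Cyrillic) take two columns as well, as they do in
                    traditional East Asian fonts.
    """
    if unicodedata.combining(ch):
        return 0                      # combining marks overlay the previous cell
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A" and mode == "cjk":
        return 2
    return 1


def needs_wide_font(ch, mode="unicode"):
    """True when the character should be drawn with the doublewidth font."""
    return char_width(ch, mode) == 2


# U+25CB WHITE CIRCLE is an "ambiguous" symbol: its width depends on the mode.
print(char_width("\u25cb", "unicode"), char_width("\u25cb", "cjk"))  # 1 2
```

The ambiguous class is where the singlewidth symbol problem in East Asian locales, mentioned in the 2002-05-14 entry, comes from: a width table that ignores the locale treats those symbols as one column wide.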
Please test. Test reports, bug reports, and patches are welcome.
In particular, since I use only Debian GNU/Linux, I can hardly test
the configure script. If you are interested in the development and
would like to discuss it with me (or other developers), please join the
XFree86
Internationalization mailing list.
Known bugs are:
- XIM input doesn't work in UTF-8 mode on OSes which
don't support nl_langinfo(3).
- XIM input doesn't work under FreeBSD-4.2-RELEASE and Bruno's
libiconv.
The following are my patches. You can download and test them.
- xterm-150-23-k10
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Check BOM addition by iconv() (for BSD)
- Check libxpg4 in ./configure (for BSD)
- Use Bruno's iconv() check in ./configure. However, I added
--with-libiconv[=DIR] and check for locale_charset()/nl_langinfo().
- xterm-150-23-k9
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Removed strndup() which is not a standard function.
- Removed WCHAR_T in iconv_open() (wcwidth.c).
- Added setlocale(LC_ALL,"") following a report from a FreeBSD user.
- Wholly rewrote ./configure macros for checking iconv() and
nl_langinfo().
- Names for UTF-8 and UCS-4 in iconv_open() are determined
by ./configure.
- added --with-libiconv and --with-libcharset for ./configure.
- xterm-150-23-k8
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- XIM input caused XTerm to exit abnormally in UTF-8 mode.
This bug is fixed.
- Manual page is rewritten. (-8, -en, -lc, -u8, encodingMode)
- xterm-150-23-k7
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Added a command option '-en' to directly specify encoding.
This is a stopgap for OSes which don't have nl_langinfo()
but have iconv().
- Name of iconv checker in aclocal.m4 is changed from AM_ICONV to
CF_ICONV to follow naming policy of XTerm.
- Checker for nl_langinfo() for aclocal.m4 was written.
If nl_langinfo() is not available, try to use locale_charset()
in libcharset by Bruno. (not tested).
- Checker for wcwidth() for aclocal.m4 was written.
- Fixed a bug where nl_langinfo() was never used in the k6 patch.
- xterm-150-23-k6
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Changed name of command option for bidi from -/+b to -/+bi
because '-b' is already used for different purpose.
- Changed the algorithm to determine default width mode when
system's wcwidth() is not available.
- Modified manual:
added descriptions for command options (-/+bi, -fx, -8, -u8,
-lc, -wcs, -wcu, and -wcc) and resources (bidi, ximFont,
encodingMode, and widthMode).
- removed descriptions for command option (+u8) and
resource (utf8).
- Rewrote the source code assuming the following macros are available:
HAVE_ICONV, HAVE_LANGINFO, and HAVE_WCWIDTH.
- Rewrote UXTerm.ad ("*VT100*utf8: 1" -->
"*VT100*encodingMode: utf8").
- Added Bruno's AM_ICONV to aclocal.m4 . Am I right?
- Added basic check for nl_langinfo() and wcwidth() (just like for
iconv() in the previous configure.in). Am I right?
- xterm-150-23-k5
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Compilation problem with GNU libc 2.1 was fixed.
- xterm-150-23-k4
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Fixed the column problem in XTerm when ./configure is run without
--enable-wide-chars. This was my mistake in the previous patch.
- xterm-150-23-k3
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- XIM and Unicode keysym can co-exist.
- wc* are defined as resources.
- Renamed the -wc* options. The corresponding resource is 'widthMode'
(class 'WidthMode').
------------------------------------------------------------------
option  parameter for 'widthMode'              note
------------------------------------------------------------------
-wcs    'system' or 'locale'
-wcu    'unicode', 'standard', or 'markus'     previous '-wcm'
-wcc    'cjk', 'eastasia', or 'doublewidth'    previous '-wcl'
------------------------------------------------------------------
- Restructured the encoding-related options.
The corresponding resource is 'encodingMode' (class 'EncodingMode').
---------------------------------------------------------------------
option  parameter for 'encodingMode'  default for...
---------------------------------------------------------------------
-8      '8bit'                        'C' and 'POSIX' locales,
                                      ISO-8859-* (except for 6 and 8)
-u8     'utf8' or 'utf-8'             UTF-8 locale
-lc     'locale' or 'lc_ctype'        other locales
---------------------------------------------------------------------
- Thus the '+u8' option and the 'utf8' resource were abolished.
- xterm-150-23-k2
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- Fixed a compilation error when ./configure is run without options.
(Replace PAIRED_CHARS with TRI_CHARS, and so on).
- Fixed a compilation error when ./configure is run with --enable-trace.
(Replace PAIRED_CHARS with TRI_CHARS, and so on).
- Confusion of wchar_t and UCS-4 is fixed in LocalEncodingToUnicode().
- Endian problem of LocalEncodingToUnicode() is fixed.
- Confusion of wchar_t and UCS-4 is fixed in my_wcwidth().
(Code for conversion from UCS-4 to wchar_t is newly written.)
- The limitation of do_precomposition() to 16-bit Unicode is fixed.
- Possible bug fix in ScreenWrite().
(str3 should be shifted 16, not 8).
- Integrated an XIM patch from a Debian contributor to enable the XMODIFIERS
variable. (VTInitI18N())
- xterm-150-23-k1
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #23.
- wcwidth_cjk() depended on system's wcwidth().
- xterm-150-22-xim2
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #22.
- OverTheSpot is supported.
- New command option '-fx' and a resource 'ximFont' are
introduced.
- xterm-150-22-xim
(announcement)
Based on Thomas Dickey's XTerm #150 and Robert Brady's patch #22.
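The default choice of 'encodingMode' described in the k3 entry above can be written down as a small decision function. The Python sketch below only restates the "default for..." column of that table; it is not taken from the patch itself, the function name default_encoding_mode is invented, and treating a locale name without a codeset (such as "ja_JP") as 'locale' is an assumption of this example.

```python
import re


def default_encoding_mode(locale_name):
    """Pick the default 'encodingMode' value for an LC_CTYPE locale name.

    '8bit'   for the "C" and "POSIX" locales and for ISO-8859-* codesets,
             except ISO-8859-6 (Arabic) and ISO-8859-8 (Hebrew)
    'utf8'   for UTF-8 locales
    'locale' for everything else
    """
    base = locale_name.split("@", 1)[0]          # drop an optional @modifier
    if base in ("C", "POSIX"):
        return "8bit"
    codeset = base.split(".", 1)[1] if "." in base else ""
    # Normalize spelling variants such as "UTF-8"/"utf8", "ISO-8859-1"/"ISO8859-1".
    key = re.sub(r"[-_]", "", codeset).upper()
    if key == "UTF8":
        return "utf8"
    match = re.fullmatch(r"ISO8859(\d+)", key)
    if match and match.group(1) not in ("6", "8"):
        return "8bit"
    return "locale"


for name in ("C", "en_US.ISO-8859-1", "he_IL.ISO-8859-8", "ja_JP.UTF-8", "ja_JP.eucJP"):
    print(name, "->", default_encoding_mode(name))
```

The -8, -u8, and -lc options then simply override this default with '8bit', 'utf8', or 'locale' respectively.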
Tomohiro KUBOTA
<debian at tmail dot plala dot or dot jp>