X Window System, Version 11
Input Method Specifications
Public Review Draft - November 1990
(Send comments to i18n@expo.lcs.mit.edu)
Vania Joloboff
Open Software Foundation
Bill McMahon
Hewlett Packard Company
ABSTRACT
This chapter addresses the portability and interoperability of
programs across different countries. It describes specifications that
provide clients of the X Window System, Version 11, with an interface
for handling the input of characters in various languages. The
specifications make it possible to develop portable applications
that are independent of any particular language or character
encoding. The specifications are consistent with related
specifications from the X/Open Portability Guide, Release 3, and
ANSI C. The reader is assumed to be familiar with these, particularly
with the notion of locale in the C language, so they are not
detailed here.
Copyright © 1990 by the Massachusetts Institute of Technology.
Permission to use, copy, modify, and distribute this documentation for any
purpose and without fee is hereby granted, provided that the above copyright
notice and this permission notice appear in all copies. MIT makes no
representations about the suitability for any purpose of the information in this
document. It is provided "as is" without express or implied warranty. This
document is only a draft standard of the MIT X Consortium and is therefore subject
to change.
XIM Public Review Draft
X Window System is a trademark of the Massachusetts Institute of Technology.
1. Input Method Overview
The following paragraphs define the terms and concepts used in the
specification and give a brief overview of the intended use of the abstractions
developed for Xlib internationalization.
1.1. What Are Input Methods?
Many of the world's languages rely on an alphabet, a small set of
symbols (letters) used to form words. To enter text in an alphabetic
language, a user usually has a keyboard whose keys carry symbols
corresponding to the alphabet. Sometimes a few characters of an
alphabetic language are missing from the keyboard. Many computer users who
speak a language based on the Latin alphabet have only an English-based
keyboard. They must type a combination of keystrokes to enter a character that
does not exist directly on the keyboard. A number of algorithms have been
developed for entering such characters; they are known as European input
methods, compose input methods, or dead-key input methods.
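As an illustration only, a dead-key input method can be sketched as a table
lookup on pairs of keystrokes. The table contents and the function name
compose below are assumptions made for this example; they are not part of the
specification or of any Xlib interface.

```python
# A minimal sketch of a dead-key (compose) input method.
# COMPOSE_TABLE and compose() are hypothetical names used for illustration.

# (dead key, base letter) -> composed character
COMPOSE_TABLE = {
    ("'", "e"): "é",
    ("`", "a"): "à",
    ("^", "o"): "ô",
    ('"', "u"): "ü",
    ("~", "n"): "ñ",
}

# Keys that start a compose sequence instead of producing a character at once.
DEAD_KEYS = {dead for dead, _ in COMPOSE_TABLE}


def compose(keystrokes):
    """Turn a sequence of keystrokes into text, combining dead-key pairs."""
    out = []
    pending = None  # dead key waiting for its base letter, if any
    for key in keystrokes:
        if pending is not None:
            composed = COMPOSE_TABLE.get((pending, key))
            if composed is not None:
                out.append(composed)
            else:
                # No such combination: emit both keystrokes literally.
                out.append(pending)
                out.append(key)
            pending = None
        elif key in DEAD_KEYS:
            pending = key
        else:
            out.append(key)
    if pending is not None:
        # Input ended while a dead key was still waiting.
        out.append(pending)
    return "".join(out)


print(compose("caf'e"))
print(compose("it's"))
```

In this sketch a keystroke that starts a sequence produces no text until the
next keystroke arrives, which is the behaviour that distinguishes an input
method from a plain key-to-character mapping.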
In some alphabetic languages, the rendering of character strings is context
sensitive. When entering characters in those languages, a keystroke does not
necessarily append a new symbol to the end of the string; it may instead
modify the existing string. Both input and output methods may be used in such
languages.
With an ideographic writing system, rather than taking a small set of symbols
and combining them in different ways to create words, each word consists of one
unique symbol (or, occasionally, several symbols). The number of symbols may
be very large: 150 000 have been identified in Hanzi, the Chinese ideographic
system.
Two aspects of ideographic systems matter most for computer use.
First, the standard computer character sets in Japan, China, and Korea include
roughly 8 000 characters, while sets in Taiwan have between 15 000 and 30 000
characters, which makes it necessary to use more than one byte to represent a
character. Second, it is obviously impractical to have a keyboard that
includes all of a given language's ideographic symbols. A specific mechanism
is therefore required for entering characters so that a keyboard with a
reasonable number of keys can be used. These input methods are usually based
on the language's phonetics, but there are also methods based on the graphic
properties of characters.
In addition to the ideographic characters, a number of languages often also
include a phonetic (alphabetic-based) writing system. The phonetic signs are
then engraved on the keyboard and the keystrokes are transformed to their
appropriate ideographic counterparts. Here's a brief description of the
Japanese and Korean phonetic systems:
o Japanese: There are two phonetic symbol sets: katakana and hiragana. In
general, you use katakana for words that are of foreign origin, and hiragana
for writing native Japanese words. Collectively, the two systems are called
kana. Each set consists of approximately 50 characters. You type either
kana or English characters and define the region that you want to convert to
kanji. Several kanji characters may have the same phonetic representation.
If that's the case with your string, you get a menu of characters and choose
the appropriate one. If no choice is necessary, the input method does the
substitution directly. Converting Latin characters to kana or kanji is called
romaji conversion.
o Korean: Hangul is a writing system that actually straddles the line between
phonetic and ideographic. It's phonetic in the sense that each of the
roughly 25 characters represents a specific sound. But between two and five
of the characters are combined to form syllables, and these syllables are
the basic units on which text processing is done. For example, a delete
operation works on a syllable rather than the individual characters within
it. Korean code sets include several thousand of these syllables. You
type the hangul characters that make up the syllables of the words you're
entering. The display changes as you enter each hangul letter. That is, when
you enter the first letter, it fills the entire space that the final
syllable will take up. When you enter the second, the first shrinks to about
half its size to make room for the second. When you enter the third, the
first two shrink again, and so on, up to the maximum of five letters in a
syllable.
It's usually acceptable to keep Korean text in hangul form, but some words
are more commonly written in hanja. If you want to change hangul to hanja,
you define the region to be converted, and follow the same basic method as
described for Japanese.
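The way hangul letters combine into a syllable can be illustrated with the
arithmetic that the Unicode standard defines for its precomposed hangul
syllables. This is a sketch under assumptions that go beyond this draft:
Unicode is not referenced by the specification, the function name
compose_syllable is hypothetical, and Unicode treats compound letters as
single units, so a syllable has at most three units there rather than the
five letters counted above.

```python
# A minimal sketch of hangul syllable composition using Unicode code points.
# compose_syllable() is a hypothetical helper, not part of any Xlib interface.

S_BASE = 0xAC00   # first precomposed hangul syllable
L_BASE = 0x1100   # first leading consonant
V_BASE = 0x1161   # first vowel
T_BASE = 0x11A7   # one before the first trailing consonant (index 0 = none)
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_syllable(lead, vowel, tail=None):
    """Combine a leading consonant, a vowel, and an optional trailing consonant."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = 0 if tail is None else ord(tail) - T_BASE
    if not 0 <= l_index < L_COUNT:
        raise ValueError("not a leading consonant")
    if not 0 <= v_index < V_COUNT:
        raise ValueError("not a vowel")
    if tail is not None and not 1 <= t_index < T_COUNT:
        raise ValueError("not a trailing consonant")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


# The display changes as each letter is entered: the partial syllable is
# replaced by a new one each time a letter is added.
lead, vowel, tail = "\u1112", "\u1161", "\u11ab"
print(lead)                                  # first letter alone
print(compose_syllable(lead, vowel))         # U+D558
print(compose_syllable(lead, vowel, tail))   # U+D55C
```

Because the whole syllable is a single character once composed, a delete
operation that removes one character removes the syllable, which matches the
text-processing behaviour described above.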
Probably because there are well-accepted phonetic writing systems for Japanese
and Korean, computer input methods for those languages are fairly standard.
Keyboard keys have both English characters and the local language's phonetic
symbols engraved on them. You can then switch the keyboard from English to
local mode and vice versa.
The situation is different for Chinese. While there is a phonetic system called
Pinyin promoted by authorities, there is no consensus for entering Chinese
text. Some vendors use a phonetic decomposition (Pinyin or another), others
use ideographic decomposition of Chinese words, with various implementations
and keyboard layouts. There are about 16 known methods, none of which is a
clear standard.
Also, there are actually two ideographic sets in use: Traditional Chinese (the
original written Chinese) and Simplified Chinese. Several years back, the
People's Republic of China launched a campaign to simplify some ideographic
characters and