Chapter5 Word & Phrase Coding


Section1 Simple Word Coding

In the HeInput component system, a word is either a simple word or a complex word.

Simple Word:A simple word has no more than 3 components.

Number Word Chars Code Similar Structure Notes
1 HanZi 11 丨乚丿丶乙 one stroke
1 HanZi 24 又王十土木米女火心 Multiple strokes
2 HanZi 32 55 旧吕回因江汉如什圣 Separated
2 HanZi 11 34 千白自万巨另去丑 Touched
2 HanZi 25 34 本来必里电束内东 Crossed
3 HanZi 23 11 24 品圆画压盼劳论烟闷 Separated
3 HanZi 21 11 24 至舌兴壮音改笑任香 Connected
3 HanZi 41 11 32 理事块君讲更吨兔 Crossed

Section2 Complex Word Coding

Complex Word:A complex word has 4 or more components.

Complex words have two coding rules.

Coding rule 1: Divide the complex word into 3 parts and take 1 code from each part.

Word 3 Parts Coding 3 Codes
艹住灬 HanZi 33 42 54
艹句攵 HanZi 33 44 45
阝立早 HanZi 32 55 25
土木戈 HanZi 32 34 34
尸古刂 HanZi 12 31 21
阝有辶 HanZi 32 32 53
日召灬 HanZi 25 14 54
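
Coding rule 1 can be pictured as a simple lookup. The following Python sketch is only an illustration: the names `PART_CODE` and `code_three_parts` are hypothetical, and the dictionary holds just the part codes that can be read off the table above, not a full HeInput code table.

```python
# Part -> code, read off the rule 1 table above (illustrative subset only).
PART_CODE = {
    "艹": 33, "住": 42, "灬": 54, "句": 44, "攵": 45,
    "阝": 32, "立": 55, "早": 25, "土": 32, "木": 34,
    "戈": 34, "尸": 12, "古": 31, "刂": 21, "有": 32,
    "辶": 53, "日": 25, "召": 14,
}


def code_three_parts(parts):
    """Return one code per part for a complex word already split into 3 parts."""
    if len(parts) != 3:
        raise ValueError("rule 1 expects exactly 3 parts")
    return [PART_CODE[p] for p in parts]


print(code_three_parts("艹住灬"))  # [33, 42, 54]
print(code_three_parts("阝立早"))  # [32, 55, 25]
```

Dividing a word into its 3 parts is still done by the learner; the sketch only covers the step of taking 1 code from each part.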

Coding rule 2: When coding a complex word, a character derivative uses its core character’s code first.

The Character derivative concept:

one of 4 basic strokes (一丨丿丶) + a core character = (forming) a character derivative.

Core Char Derivative Example
白=(丿日) 魄原绵谐魏
自=(丿目) 鼻熄
=(丿土) 造靠选
禾=(丿木) 科乘透梨诱秦
天=(一大) 凑添
广=(丶厂) 序遮俯渡糠
门=(丶冂) 阀阔润闹
主=(丶王) 集售隽

A single stroke has no base meaning but is widely used in many words. Examples are the strokes with codes 11, 21, 41, and 51.

A core character is composed of at least 2 strokes and has its own base meaning.

A character derivative is formed by adding one basic stroke to a core character.

When coding a complex word, a character derivative uses its core character’s code first.


The character derivative is a new concept. Using the core character to represent its derivatives simplifies the HeCharacter extended table.

Examples of complex words and character derivatives:

Word 3 Parts Derivative Coding 3 Codes
王白石 HanZi 15 25 35
禾宀豕 禾宀豕 HanZi 34 23 45
水水水 水水水 HanZi 53 53 53
壬口辶 HanZi 32 24 53
门亻戈 HanZi 23 42 34
开刂土 HanZi 33 21 32
亡月王 HanZi 13 23 15
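
Coding rule 2 can be sketched the same way. In this hypothetical Python example, `CORE_OF` copies the derivative table above, and `CODE` lists only the codes that appear in this chapter's examples. The code 23 for 冂 is inferred from the row 门亻戈 (23 42 34), so treat it as an assumption.

```python
# Derivative -> core character, copied from the derivative table above.
CORE_OF = {
    "白": "日", "自": "目", "禾": "木", "天": "大",
    "广": "厂", "门": "冂", "主": "王",
}

# Codes taken from this chapter's examples (illustrative subset only).
# 冂 = 23 is inferred from the row 门亻戈 (23 42 34).
CODE = {
    "王": 15, "日": 25, "石": 35, "木": 34, "宀": 23,
    "豕": 45, "冂": 23, "亻": 42, "戈": 34,
}


def part_code(part):
    """Code of one part; a derivative is coded by its core character."""
    core = CORE_OF.get(part, part)
    return CODE[core]


def code_complex_word(parts):
    """Return the 3 main codes of a complex word given its 3 parts."""
    return [part_code(p) for p in parts]


print(code_complex_word("王白石"))  # [15, 25, 35]
print(code_complex_word("禾宀豕"))  # [34, 23, 45]
```

Here 白 contributes 25, the code of its core character 日, and 禾 contributes 34, the code of its core character 木.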

Word’s 4th code (assistant code): For a complex word, some components are left after the 3 main codes are taken. Take one code from the remaining components as the 4th code (assistant code). Section3 discusses the 4th code in more detail. However, the HeInput application always prompts the 4th code.

So a complex word has 4 codes: 3 main codes + 1 assistant code.


Section3 Character’s Weight

When coding, we always take the code of the bigger character.

For example, character 木 is composed of 十 and 八. When coding the words 杜杏困, we use 木’s code 34, not the code of 十 or 八. Likewise, character 火 is composed of 丷 and 人. When coding the words 灶烘灼, we use 火’s code 54, not the code of 丷 or 人.

It is easy to see that 火 is bigger than 丷 and 人, but how do we compare 丷 and 人 with each other?

To compare characters, we give each character a weight; a character’s weight is a measurement of its frequency in words. A character’s weight is its code with the two digits reversed. For example, character 丶’s code is 51, and its weight is 15; 人’s code is 43, and its weight is 34; 口’s code is 24, and its weight is 42.

The following table lists the codes and weights of the 5 characters in the fifth column (codes 51 to 55).

Code Character Weight Frequency
51 15 Vast
52 25 Lots
53 35 Lot
54 45 Many
55 55 A few

Characters can be ordered by their weights. For example, 火’s code is 54, so its weight is 45; 口’s weight is 42, and 人’s weight is 34, so they are ordered as 火→口→人.
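
The weight calculation is plain digit reversal, which a few lines of Python can show. The function name `weight` is hypothetical; the codes for 人, 口, and 火 are the ones given in this section.

```python
def weight(code):
    """Weight of a two-digit code: the same two digits in reverse order."""
    tens, ones = divmod(code, 10)
    return ones * 10 + tens


# Codes given in this section.
codes = {"人": 43, "口": 24, "火": 54}

print(weight(51), weight(43), weight(24))  # 15 34 42

# Order the characters from heaviest to lightest.
ordered = sorted(codes, key=lambda ch: weight(codes[ch]), reverse=True)
print("→".join(ordered))  # 火→口→人
```

With code 54, 火 has weight 45, which puts it ahead of 口 (42) and 人 (34).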

A complex word has 4 codes. We have 2 coding rules for the first 3 main codes, but how do we get the 4th code? One approach is to use the characters’ weights.

For a complex word, after the 3 main codes are taken, some components are left. Weigh them and take the heaviest component’s code as the 4th code.

Word Components Remains Heaviest Remain 4th Code
HanZi 丿丶一 丶(15) 51
HanZi 一口子丶 子(51) 15
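
The two rows above can be checked with a short Python sketch. The function names are hypothetical. The codes 41 for 丿, 51 for 丶, 24 for 口, and 15 for 子 come from this chapter; the code 11 for 一 is an assumption.

```python
def weight(code):
    """Weight of a two-digit code: the same two digits in reverse order."""
    tens, ones = divmod(code, 10)
    return ones * 10 + tens


def fourth_code(remaining_codes):
    """Return the code of the heaviest remaining component."""
    return max(remaining_codes, key=weight)


# Remaining components 丿(41) 丶(51) 一(11, assumed): 丶 is heaviest (weight 15).
print(fourth_code([41, 51, 11]))  # 51

# Remaining components 一(11, assumed) 口(24) 子(15) 丶(51): 子 is heaviest (weight 51).
print(fourth_code([11, 24, 15, 51]))  # 15
```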

Section4 Phrase Coding

Phrase coding is based on the codes of the phrase’s words.

A phrase has at most 4 codes.

Number of words Phrase Rule
2 国家(24 15 51 23) 2+2=4 Codes
3 联合国(15 43 24 15) 1+1+2=4 Codes
4 中国人民(24 24 43 12) 1+1+1+1=4 Codes
>4 和码输入法(41 35 35 43) 1+1+1+1=4 Codes

As a special case, if the first word of a two-word phrase has only one code, as in 日期, 心里, and 小心, then take up to 3 codes from the second word to form the phrase code.

A phrase has at most 4 codes, but some phrases have fewer than 4 codes, such as 小心: 54 55; 心里: 55 25 32; 中国人: 24 24 43.
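
The phrase rules for 2 to 4 words can be summarized in a small Python sketch. The function name `phrase_code` and its input format (one list of codes per word) are hypothetical. For phrases of more than 4 words the table gives 1+1+1+1 but does not say which four words contribute, so the sketch leaves that case out. The word code lists below contain only the leading codes shown in this section.

```python
def phrase_code(word_codes):
    """Build a phrase code from per-word code lists, given in phrase order."""
    n = len(word_codes)
    if n == 2:
        # A one-code first word lets the second word contribute up to 3 codes.
        take = [1, 3] if len(word_codes[0]) == 1 else [2, 2]
    elif n == 3:
        take = [1, 1, 2]
    elif n == 4:
        take = [1, 1, 1, 1]
    else:
        raise ValueError("this sketch covers phrases of 2 to 4 words only")
    result = []
    for codes, k in zip(word_codes, take):
        result.extend(codes[:k])  # a short word simply contributes fewer codes
    return result


print(phrase_code([[24, 15], [51, 23]]))         # 国家   -> [24, 15, 51, 23]
print(phrase_code([[55], [25, 32]]))             # 心里   -> [55, 25, 32]
print(phrase_code([[15], [43], [24, 15]]))       # 联合国 -> [15, 43, 24, 15]
print(phrase_code([[24], [24], [43], [12]]))     # 中国人民 -> [24, 24, 43, 12]
```

Because slicing never takes more codes than a word has, phrases such as 小心 and 中国人 naturally end up with fewer than 4 codes.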


Section5 Practice

Examples of simple and complex words:

Type Word Code Word Coding
Simple 日(日) 25 木(木) 34
Simple 白(丿日) 41 25 禾(丿木) 41 34
Simple 皇(丿日王) 41 25 15 和(丿木口) 41 34 24
Complex 碧(王白石) 15 25 35 程(禾口王) 34 24 15
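
One way to read the table above is to count components after expanding each derivative into its stroke and core character. The Python sketch below follows that reading; it is an assumption rather than a stated HeInput rule, and the names `DERIVATIVE_PARTS`, `components`, and `is_complex` are hypothetical.

```python
# Derivative -> stroke + core character, as shown in the table above.
DERIVATIVE_PARTS = {"白": "丿日", "禾": "丿木"}


def components(parts):
    """Expand any derivative part into its stroke and core character."""
    return "".join(DERIVATIVE_PARTS.get(p, p) for p in parts)


def is_complex(parts):
    """A word with 4 or more components is complex; otherwise it is simple."""
    return len(components(parts)) >= 4


print(is_complex("丿日王"))  # 皇: 3 components -> False
print(is_complex("王白石"))  # 碧: 王 丿 日 石 -> True
print(is_complex("禾口王"))  # 程: 丿 木 口 王 -> True
```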

Word and phrase coding practice cards.

Word-statistic

The HeInput online training application has coding practice for 3-part words, derivative components, and phrases. Appendix 3 includes several writing pages for complex words.

Online application:
Complex word and phrase coding practice:www.HeChinese.net/HeInput/Lesson04


Chinese students aged 10 and over need 1-2 hours, foreign adults need 2-3 hours, and children aged 5-6 need 4-6 hours to finish this chapter. More practice is also needed to improve fluency.