Chinese Genealogical Word List

Introduction
This list contains Chinese words with their English translations. The words included here are those that you are likely to find in genealogical sources. If the word you are looking for is not on this list, please consult a Chinese-English (漢英; hàn yīng) or English-Chinese (英漢; yīng hàn) dictionary. (Also, see the “Additional Resources” section, below.)

Chinese is a Sino-Tibetan language whose writing system is character-based rather than phonetic. More than one billion people across the globe speak Chinese in some form. The predominant dialect is Mandarin (普通話/國語; pǔ tōng huà/guó yǔ), which is the official dialect of China and Taiwan. Other dialects, including but not limited to Cantonese, Shanghainese, and Fukienese (Fujianese), are largely mutually unintelligible.

Despite significant differences among the many spoken dialects of Chinese, standard written Chinese, which is based on the Mandarin dialect, is the universally accepted and officially sanctioned written form. It is used throughout China, Taiwan and the Chinese diaspora for official documents, news media, and other communications. As a result, speakers of two different dialects may be unable to communicate orally, but, assuming both are literate, they can write to each other in standard written Chinese and fully understand each other.

Chinese is spoken in China and Taiwan, where it is the official language, as well as among large populations of Chinese living across the globe, particularly in Southeast Asia, but also in Europe, the Americas, Africa and the Middle East. Because clan genealogies (族譜/家譜; zú pǔ/jiā pǔ) are among the most common Chinese genealogical records, such records could potentially be found on any continent and in any country with a large Chinese population.

Written Chinese
There are currently two forms of written Chinese characters: 1) Traditional characters (繁體字; fán tǐ zì), used officially in Taiwan, Hong Kong and Macau; 2) Simplified characters (簡體字; jiǎn tǐ zì), used officially in China and in Singapore, where Chinese is one of four official languages. Within the Chinese diaspora across the globe, the usage of traditional versus simplified characters varies widely. Early overseas Chinese populations from the 19th and early 20th centuries, as well as those from Hong Kong and Taiwan, have consistently used traditional characters, whereas emigrants from China predominantly use simplified characters.

Simplified characters have had official sanction only since 1954, the year in which the government of the People’s Republic of China implemented them to increase literacy. Traditional characters were the standard for the centuries of Chinese records created before then, so the large majority of Chinese genealogical records are likely to be in traditional Chinese. For this reason, the characters in this word list are given in traditional form. Online tools are available for converting traditional characters to simplified characters.

Traditionally, Chinese text was written in vertical columns, with the characters in each column written from top to bottom and the columns starting on the right side of each page and proceeding to the left. Most genealogical records have this layout, which means that the title and cover pages of such records are found at what a Western reader would consider the back of the volume rather than the front. In modern times the Western layout of writing characters horizontally from left to right has also been adopted to a degree, but this format is uncommon in earlier records.
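The traditional reading order described above can be illustrated with a short sketch. It assumes a transcription in which the page has been typed row by row exactly as it appears visually; the function name `reading_order` and the sample page are hypothetical and serve only to show the column-by-column, right-to-left order.

```python
def reading_order(page_rows):
    """Return the text of a vertically set page in reading order.

    page_rows is the page as it appears visually: a list of strings,
    one per horizontal row, all of the same length. The text is read
    down each column, starting with the rightmost column.
    """
    if not page_rows:
        return ""
    width = len(page_rows[0])
    if any(len(row) != width for row in page_rows):
        raise ValueError("all rows must have the same length")
    columns = []
    for col in range(width - 1, -1, -1):  # rightmost column first
        columns.append("".join(row[col] for row in page_rows))
    return "".join(columns)


# Hypothetical two-column page: the right column reads 陳氏, the left 族譜.
page = ["族陳",
        "譜氏"]
print(reading_order(page))
```

Reading the same rows left to right would instead give 族陳譜氏, which is why recognizing the layout matters before transcribing a record.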

Radicals
Although Chinese characters are not phonetic in nature, each character contains one or more radicals (部首; bù shǒu) that form its structure. Chinese characters number in the tens of thousands, although an educated speaker need only learn approximately 2500 of them. The most commonly accepted table of radicals contains 214 radicals.

Radicals are further divided according to the number of strokes each has, ranging from 1 to 17 strokes. In traditional Chinese dictionaries, characters are looked up under their primary radical and then by stroke count. For instance, the character 中 (zhōng), which means “center,” is composed of the primary one-stroke radical 丨 (gǔn) and contains the secondary three-stroke radical 口 (kǒu). Another character, 好 (hǎo), meaning “good,” contains the three-stroke radical 女 (nǚ), meaning “female,” and the three-stroke radical 子 (zǐ), meaning “child.” More complex characters may contain multiple radicals. For instance, the character 簡 (jiǎn), meaning “simple,” contains the radical 竹 (zhú), meaning “bamboo,” under which is placed the radical 門 (mén), meaning “door,” and inside that the radical 日 (rì), meaning “sun.” In none of these cases, however, does the pronunciation of the individual radicals correspond to the actual pronunciation of the character.

For someone seeking a basic understanding of Chinese writing sufficient to decipher characters found in genealogical records, a foundation in both stroke order and the radical-based formation of characters is particularly helpful. This is especially true when deciphering the names of ancestors from hardcopy records, digital images, microfilm and other sources that do not allow the characters to be simply copied and pasted into an online translation program (e.g. Google Translate).

Romanization
As stated above, written Chinese is not phonetic. In other words, specific phonemes, letters or sounds typically cannot be derived simply by looking at a Chinese character. Traditionally in China, the pronunciation of a specific character was learned largely by memorization. Romanization, namely the process of transcribing or transliterating a language into Latin script, was first applied to the Chinese language by Christian missionaries working in China during the 16th century. One of the most widely used Chinese romanization systems is the Wade-Giles system, which was developed in the late 19th century and was the standard of transcription for the English-speaking world for most of the 20th century. In 1956, just two years after the implementation of simplified characters, the government of the People’s Republic of China introduced the hanyu pinyin (漢語拼音; hàn yǔ pīn yīn) romanization system in an additional effort to boost literacy. Pinyin later became the standard romanization for China, and more recently for Taiwan and Singapore.

Although pinyin is increasingly the standard for native and non-native Chinese speakers, Wade-Giles and other romanization systems are still commonly found in history books, atlases, maps and other reference materials. Learning to differentiate the multiple systems can be helpful not only in research but also in the proper indexing of names for genealogical purposes. For instance, the place names Peking and Peiching both correspond to the characters 北京, which are now more commonly romanized in pinyin as the more familiar Beijing (běi jīng). Romanization issues can also occur when researching or documenting proper names. For example, the Chinese surnames transliterated in Wade-Giles as Hsieh (謝), Chao (趙), Kuo (郭) and Chang (張) are transliterated in pinyin as Xie, Zhao, Guo and Zhang, respectively. The problem is compounded by the romanization of Cantonese names, as is common practice in Hong Kong, where these same four surnames may be transliterated as Tse, Chiu, Kwok and Cheung, respectively. A basic familiarity with the various romanization systems for Chinese is a critical component of genealogical research on Chinese names. Lacking such knowledge, a genealogist may erroneously create duplicate records for an individual whose name has been romanized using another system, or may fail to recognize a match for an ancestor whose name was romanized differently.
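One way to guard against such duplicates is to index romanized surnames by their underlying character. The sketch below is illustrative only: the table `SURNAME_VARIANTS` holds just the four surnames mentioned above, and the function names are hypothetical. A real index would need far more entries, and a single romanized spelling can correspond to more than one character, so a match here means only "possibly the same surname."

```python
# Spellings of the four surnames discussed above, keyed by character.
# Illustrative and far from complete.
SURNAME_VARIANTS = {
    "謝": {"pinyin": "Xie", "wade_giles": "Hsieh", "cantonese": "Tse"},
    "趙": {"pinyin": "Zhao", "wade_giles": "Chao", "cantonese": "Chiu"},
    "郭": {"pinyin": "Guo", "wade_giles": "Kuo", "cantonese": "Kwok"},
    "張": {"pinyin": "Zhang", "wade_giles": "Chang", "cantonese": "Cheung"},
}


def candidate_characters(romanized):
    """Return the characters in the table that a romanized surname may stand for."""
    key = romanized.strip().casefold()
    return sorted(
        char
        for char, spellings in SURNAME_VARIANTS.items()
        if key in {s.casefold() for s in spellings.values()}
    )


def may_be_same_surname(a, b):
    """True if two romanized spellings share at least one candidate character."""
    return bool(set(candidate_characters(a)) & set(candidate_characters(b)))


print(candidate_characters("Kwok"))
print(may_be_same_surname("Hsieh", "Tse"))
print(may_be_same_surname("Chang", "Zhao"))
```

Comparing candidate characters rather than spellings lets records for Hsieh, Xie and Tse be flagged for review as possible matches instead of being entered as three unrelated families.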

Because Chinese is a tonal language, romanization systems have also incorporated diacritic marks or spellings to account for each separate tone. Mandarin has four tones, which pinyin represents with four different diacritic marks: ˉ (high), ˊ (rising), ˇ (falling-rising), and ˋ (falling). The following words show these diacritic marks applied in pinyin: Beijing (北京; běi jīng), China (中國; zhōng guó), and husband (丈夫; zhàng fū). When recording Chinese names from genealogical records, these diacritic marks are not necessary, as they reflect only the spoken language. Tones for any of the Chinese characters found in this word list can be obtained by copying the characters into Google Translate.
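When names are recorded without tone marks, the marks can be removed from pinyin text programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `strip_tone_marks` is hypothetical. It removes only the four tone diacritics, so the two dots of ü (as in nǚ) are preserved.

```python
import unicodedata

# Combining forms of the four pinyin tone marks:
# macron, acute accent, caron, grave accent.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone_marks(pinyin):
    """Remove pinyin tone marks while keeping other diacritics such as ü."""
    # Decompose each letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that u + diaeresis becomes the single letter ü again.
    return unicodedata.normalize("NFC", kept)


print(strip_tone_marks("běi jīng"))
print(strip_tone_marks("zhōng guó"))
print(strip_tone_marks("nǚ"))
```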

Gender
The Chinese language is largely gender-neutral and possesses few linguistic gender markers. Unlike in Romance languages, such as Spanish, Italian, and French, nouns are not gender-specific. For instance, the feminine la familia (the family) or the masculine el libro (the book) in Spanish would be rendered in Chinese as the gender-neutral 家 (jiā) for family and 書 (shū) for book. To denote gender for a noun in Chinese, one may add either 男 (nán; male) or 女 (nǚ; female) at the beginning of the word. For example, the word for doctor (醫生; yī shēng) could be changed to 女醫生 to denote a female doctor, although the common practice is generally to use the gender-neutral form.

One of the few instances where gender is denoted in Chinese is the written form of the third-person pronoun 他 (tā). Traditionally, 他 was used to represent both he and she, but a relatively new character, 她, is now more commonly used for “she.” It replaces the initial radical 人 (rén), meaning “person,” with the female radical 女. This differentiation between 他 and 她 applies only in written Chinese; in spoken Chinese, 他 and 她 are pronounced identically as “tā.” Due to its late emergence in written Chinese, the third-person female pronoun 她 is unlikely to appear in the text of historic genealogical records.

One way to identify whether an individual is male or female is to look for the female radical 女 (nǚ) in the given name, but it should be noted that not all female names contain 女, and some male names may also contain the 女 radical. In Chinese genealogical records, female names are often not fully recorded; typically only the surname is given, followed by the character 氏 (shì), which can roughly be translated as “clan,” “surname,” or “maiden name.” Therefore, a record with an individual named 陳氏 (chén shì) would refer to a woman from the Chen (陳) clan, and could also be translated as “Ms. Chen,” with Chen being her maiden name.
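The 氏 convention described above can be checked for mechanically when indexing transcribed names. The sketch below is a simple heuristic under that one assumption (a surname followed by 氏); the function name `maiden_surname` is hypothetical, and any match should still be confirmed against the record itself.

```python
def maiden_surname(recorded_name):
    """Return the surname from a name recorded as surname + 氏, else None."""
    name = recorded_name.strip()
    # Require at least one character before the final 氏.
    if len(name) >= 2 and name.endswith("氏"):
        return name[:-1]
    return None


print(maiden_surname("陳氏"))
print(maiden_surname("陳"))
```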
