Han unification

Differences for the same Unicode code point (U+8FD4) in regional versions of Source Han Sans

Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are shared by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese (chữ Hán).

Modern Chinese, Japanese and Korean typefaces typically use regional or historical variants of a given Han character. In the formulation of Unicode, an attempt was made to unify these variants by considering them as allographs – different glyphs representing the same "grapheme" or orthographic unit – hence, "Han unification", with the resulting character repertoire sometimes contracted to Unihan.[1][a]

Nevertheless, many characters have regional variants assigned to different code points, such as Traditional 個 (U+500B) versus Simplified 个 (U+4E2A).
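The distinction between a unified character and separately encoded variants can be checked directly. The following is a minimal sketch using Python's standard `unicodedata` module and the three code points mentioned in this article. The constant and function names are illustrative assumptions, not part of any Unicode tooling.

```python
import unicodedata

# Code points taken from the article text.
TRADITIONAL_GE = "\u500b"  # 個, Traditional form
SIMPLIFIED_GE = "\u4e2a"   # 个, Simplified form
UNIFIED_FAN = "\u8fd4"     # 返, one code point drawn differently by regional fonts


def describe(ch: str) -> str:
    """Return the code point and Unicode character name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def same_after_normalization(a: str, b: str) -> bool:
    """Check whether two strings become equal under NFKC normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    for ch in (TRADITIONAL_GE, SIMPLIFIED_GE, UNIFIED_FAN):
        print(describe(ch))
    # Separately encoded variants stay distinct even under normalization.
    print(same_after_normalization(TRADITIONAL_GE, SIMPLIFIED_GE))
```

The Traditional and Simplified forms remain two different characters even after normalization, so any mapping between them has to come from external data. The regional shapes of U+8FD4, by contrast, cannot be told apart at the text level at all, because the difference exists only in the font used to render it.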

  1. ^ "Unicode® Standard Annex #38 | UNICODE HAN DATABASE (UNIHAN)". Unicode Consortium. 2023-09-01.
  2. ^ "Unihan.zip". The Unicode Standard. Unicode Consortium.
  3. ^ "Unihan Database Lookup". The Unicode Standard. Unicode Consortium.
  4. ^ "Unihan Database Lookup: Sample lookup for 中". The Unicode Standard. Unicode Consortium.



