Hapax legomenon

Rank-frequency plot for words in the novel Moby-Dick. About 44% of the distinct words in this novel, such as "matrimonial", occur only once, and so are hapax legomena (red). About 17%, such as "dexterity", appear twice (so-called dis legomena, in blue). Zipf's law predicts that, on logarithmic axes, the words in this plot should approximate a straight line with slope -1.

In corpus linguistics, a hapax legomenon (/ˈhæpəks lɪˈɡɒmɪnɒn/ also /ˈhæpæks/ or /ˈheɪpæks/;[1][2] pl. hapax legomena; sometimes abbreviated to hapax, plural hapaxes) is a word or an expression that occurs only once within a context: either in the written record of an entire language, in the works of an author, or in a single text. The term is sometimes incorrectly used to describe a word that occurs in just one of an author's works but more than once in that particular work. Hapax legomenon is a transliteration of Greek ἅπαξ λεγόμενον, meaning "said once".[3]

The related terms dis legomenon (/ˈdɪs/), tris legomenon (/ˈtrɪs/), and tetrakis legomenon (/ˈtɛtrəkɪs/) refer respectively to double, triple, and quadruple occurrences, but are far less commonly used.
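As a rough illustration of these definitions, the following Python sketch groups the distinct words of a text by how many times each occurs. The function name `legomena_by_count`, the sample sentence, and the tokenization rule (lowercased runs of ASCII letters and apostrophes) are assumptions made for this example; real corpus work typically needs more careful tokenization and may reach different counts.

```python
import re
from collections import Counter


def legomena_by_count(text, max_count=4):
    """Group distinct words by number of occurrences (1 up to max_count).

    Key 1 holds the hapax legomena, 2 the dis legomena,
    3 the tris legomena, and 4 the tetrakis legomena.
    """
    # Assumption: a "word" is a run of ASCII letters or apostrophes,
    # compared case-insensitively. This is a simplification.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)

    groups = {n: set() for n in range(1, max_count + 1)}
    for word, n in counts.items():
        if n in groups:  # words occurring more than max_count times are skipped
            groups[n].add(word)
    return groups


# Hypothetical sample text, chosen only to exercise each group.
sample = "the whale the sea the ship a whale a sea a storm"
groups = legomena_by_count(sample)

for n, label in [(1, "hapax"), (2, "dis"), (3, "tris"), (4, "tetrakis")]:
    print(f"{label} legomena: {sorted(groups[n])}")
```

In this sample, "ship" and "storm" occur once, "sea" and "whale" twice, and "a" and "the" three times; no word occurs exactly four times. Whether a word counts as a hapax legomenon depends on the context chosen, so running the same function over a single text, an author's collected works, or a whole corpus can give different answers for the same word.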

Hapax legomena are quite common, as predicted by Zipf's law,[4] which states that the frequency of any word in a corpus is inversely proportional to its rank in the frequency table. For large corpora, about 40% to 60% of the distinct words are hapax legomena, and another 10% to 15% are dis legomena.[5] For example, in the Brown Corpus of American English, about half of the 50,000 distinct words are hapax legomena within that corpus.[6]
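The inverse proportionality stated by Zipf's law can be written out explicitly. In the sketch below, f(r) denotes the frequency of the word of rank r and C is a corpus-dependent constant; both symbols are notation introduced here for illustration and are not taken from the cited sources.

```latex
f(r) = \frac{C}{r}
\quad\Longrightarrow\quad
\log f(r) = \log C - \log r
```

Plotted on logarithmic axes, this relation is a straight line with slope -1, which is the shape a rank-frequency plot is expected to approximate. The many low-frequency words at the far end of that line are where the hapax legomena and dis legomena sit.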

Hapax legomenon refers to the appearance of a word or an expression in a body of text, not to its origin or its prevalence in speech. It thus differs from a nonce word, which may never be recorded at all, may gain currency and be widely recorded, or may appear several times in the work that coins it.

  1. "hapax legomenon". Oxford English Dictionary (Online ed.). Oxford University Press. (Subscription or participating institution membership required.)
  2. "hapax legomenon". Dictionary.com Unabridged (Online). n.d.
  3. ἅπαξ. Liddell, Henry George; Scott, Robert; A Greek–English Lexicon at the Perseus Project.
  4. Paul Baker, Andrew Hardie, and Tony McEnery, A Glossary of Corpus Linguistics, Edinburgh University Press, 2006, p. 81, ISBN 0-7486-2018-4.
  5. András Kornai, Mathematical Linguistics, Springer, 2008, p. 72, ISBN 1-84628-985-8.
  6. Kirsten Malmkjær, The Linguistics Encyclopedia, 2nd ed., Routledge, 2002, p. 87, ISBN 0-415-22210-9.
