Head/tail breaks

1024 cities that exactly follow Zipf's law: the largest city has size 1, the second largest 1/2, the third largest 1/3, and so on down to the smallest city at size 1/1024. The left pattern is produced by head/tail breaks and the right one by natural breaks, also known as Jenks natural breaks optimization.

Head/tail breaks is a clustering algorithm for data with a heavy-tailed distribution, such as power-law and lognormal distributions. A heavy-tailed distribution exhibits the scaling pattern of far more small things than large ones: numerous small values, very few large ones, and some in between. The classification divides things into large ones (the head) and small ones (the tail) around the arithmetic mean, and then recursively repeats the division on the head until the notion of far more small things than large ones no longer holds, or until only more or less similar things are left.[1] Head/tail breaks serves not only for classification but also for visualization of big data by keeping the head, since the head is self-similar to the whole. It can be applied not only to vector data such as points, lines and polygons, but also to raster data such as a digital elevation model (DEM).
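
The recursive division described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `head_tail_breaks` and `classify` are hypothetical, and the stopping rule (stop once the head exceeds 40% of the values being split) is an assumption, since the text only says that the division ends when small things no longer clearly outnumber large ones.

```python
from bisect import bisect_left
from statistics import fmean


def head_tail_breaks(values, head_limit=0.4):
    """Return the successive means used as class breaks, in increasing order.

    head_limit is an assumed threshold: recursion stops once the head
    makes up more than this share of the values being split.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = fmean(data)
        head = [v for v in data if v > mean]  # the "large things"
        if not head:  # all values equal: nothing left to split
            break
        breaks.append(mean)
        if len(head) / len(data) > head_limit:
            break  # the head is no longer a clear minority
        data = head  # repeat the division on the head only
    return breaks


def classify(value, breaks):
    """Class index of a value: the number of breaks it strictly exceeds."""
    return bisect_left(breaks, value)


# The Zipf example from the figure caption: city sizes 1, 1/2, ..., 1/1024.
cities = [1 / rank for rank in range(1, 1025)]
breaks = head_tail_breaks(cities)
classes = [classify(size, breaks) for size in cities]
```

Each break is the mean of the current head, so the breaks increase at every step, and a value's class index counts how many times it landed in the head. A different `head_limit`, or stopping before the final break is recorded, would give a slightly different number of classes.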

  1. Jiang, Bin (2013). "Head/Tail Breaks: A New Classification Scheme for Data with a Heavy-Tailed Distribution". The Professional Geographer. 65 (3): 482–494. arXiv:1209.2801. Bibcode:2013ProfG..65..482J. doi:10.1080/00330124.2012.700499. S2CID 119297992.
