Head/tail breaks is a clustering algorithm for data with a heavy-tailed distribution such as power laws and lognormal distributions. The heavy-tailed distribution can be simply referred to the scaling pattern of far more small things than large ones, or alternatively numerous smallest, a very few largest, and some in between the smallest and largest. The classification is done through dividing things into large (or called the head) and small (or called the tail) things around the arithmetic mean or average, and then recursively going on for the division process for the large things or the head until the notion of far more small things than large ones is no longer valid, or with more or less similar things left only.[1] Head/tail breaks is not just for classification, but also for visualization of big data by keeping the head, since the head is self-similar to the whole. Head/tail breaks can be applied not only to vector data such as points, lines and polygons, but also to raster data like digital elevation model (DEM).
© MMXXIII Rich X Search. We shall prevail. All rights reserved. Rich X Search