Disjoint-set data structure

Disjoint-set/Union-find Forest
Disjoint-set/Union-find Forest
Type	multiway tree
Invented	1964
Invented by	Bernard A. Galler and Michael J. Fischer
Operation
Time complexity in big O notation
Operation	Average
Search	O(α(n)) (amortized)
Insert	O(1)
Space complexity
Space	O(n)

In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that stores a collection of disjoint (non-overlapping) sets. Equivalently, it stores a partition of a set into disjoint subsets. It provides operations for adding new sets, merging sets (replacing them by their union), and finding a representative member of a set. The last operation makes it possible to find out efficiently if any two elements are in the same or different sets.

While there are several ways of implementing disjoint-set data structures, in practice they are often identified with a particular implementation called a disjoint-set forest. This is a specialized type of forest which performs unions and finds in near-constant amortized time. To perform a sequence of $m$ addition, union, or find operations on a disjoint-set forest with $n$ nodes requires total time $O (m α(n))$ , where $α(n)$ is the extremely slow-growing inverse Ackermann function. Disjoint-set forests do not guarantee this performance on a per-operation basis. Individual union and find operations can take longer than a constant times $α(n)$ time, but each operation causes the disjoint-set forest to adjust itself so that successive operations are faster. Disjoint-set forests are both asymptotically optimal and practically efficient.

Disjoint-set data structures play a key role in Kruskal's algorithm for finding the minimum spanning tree of a graph. The importance of minimum spanning trees means that disjoint-set data structures underlie a wide variety of algorithms. In addition, disjoint-set data structures also have applications to symbolic computation, as well as in compilers, especially for register allocation problems.

^ ^a ^b ^c ^d ^e ^f Tarjan, Robert Endre (1975). "Efficiency of a Good But Not Linear Set Union Algorithm". Journal of the ACM. 22 (2): 215–225. doi:10.1145/321879.321884. hdl:1813/5942. S2CID 11105749.

[tarjan1975-1] ^ ^a ^b ^c ^d ^e ^f Tarjan, Robert Endre (1975). "Efficiency of a Good But Not Linear Set Union Algorithm". Journal of the ACM. 22 (2): 215–225. doi:10.1145/321879.321884. hdl:1813/5942. S2CID 11105749.

[1]