Suffix automaton

Suffix automaton
TypeSubstring index
Invented1983
Invented byAnselm Blumer; Janet Blumer; Andrzej Ehrenfeucht; David Haussler; Ross McConnell
Time complexity in big O notation
Operation Average Worst case
Space complexity
Space

In computer science, a suffix automaton is an efficient data structure for representing the substring index of a given string which allows the storage, processing, and retrieval of compressed information about all its substrings. The suffix automaton of a string is the smallest directed acyclic graph with a dedicated initial vertex and a set of "final" vertices, such that paths from the initial vertex to final vertices represent the suffixes of the string.

In terms of automata theory, a suffix automaton is the minimal partial deterministic finite automaton that recognizes the set of suffixes of a given string . The state graph of a suffix automaton is called a directed acyclic word graph (DAWG), a term that is also sometimes used for any deterministic acyclic finite state automaton.

Suffix automata were introduced in 1983 by a group of scientists from the University of Denver and the University of Colorado Boulder. They suggested a linear time online algorithm for its construction and showed that the suffix automaton of a string having length at least two characters has at most states and at most transitions. Further works have shown a close connection between suffix automata and suffix trees, and have outlined several generalizations of suffix automata, such as compacted suffix automaton obtained by compression of nodes with a single outgoing arc.

Suffix automata provide efficient solutions to problems such as substring search and computation of the largest common substring of two and more strings.


© MMXXIII Rich X Search. We shall prevail. All rights reserved. Rich X Search