The IBM alignment models are a sequence of increasingly complex models used in statistical machine translation to train a translation model and an alignment model, starting with lexical translation probabilities and moving on to reordering and word duplication.[1][2] They underpinned the majority of statistical machine translation systems for almost twenty years, starting in the early 1990s, until neural machine translation began to dominate. These models offer a principled probabilistic formulation and (mostly) tractable inference.[3]
The IBM alignment models were published in parts in 1988[4] and 1990,[5] and the entire series was published in 1993.[1] Every author of the 1993 paper subsequently moved to the hedge fund Renaissance Technologies.[6]
The original work on statistical machine translation at IBM proposed five models, and a Model 6 was proposed later. The sequence of the six models can be summarized as follows:
Model 1: lexical translation (a minimal training sketch follows this list)
Model 2: adds an absolute alignment model
Model 3: adds a fertility model
Model 4: adds a relative alignment model
Model 5: fixes the deficiency problem
Model 6: Model 4 combined with an HMM alignment model in a log-linear way
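Model 1, the simplest of the series, assumes each source word is generated independently from exactly one target word (or a special NULL token) and is typically trained with the expectation-maximization (EM) algorithm. The following is a minimal illustrative Python sketch of Model 1 EM training under those assumptions; the function name, corpus format, and toy data are hypothetical and not taken from the original papers.

```python
from collections import defaultdict

def train_ibm_model1(corpus, iterations=10):
    """EM training of IBM Model 1 lexical translation probabilities t(f | e).

    corpus: list of (source_tokens, target_tokens) pairs; in the IBM formulation
    the target side plays the role of the sentence that generates the source side.
    Returns a dict mapping (f, e) -> t(f | e).
    """
    # Source-side vocabulary, used for uniform initialisation of t(f | e).
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts of (f, e) alignments
        total = defaultdict(float)  # expected counts of e being aligned to
        for fs, es in corpus:
            es = ["NULL"] + es  # allow source words to align to nothing
            for f in fs:
                # E-step: split one unit of count for f across all target words,
                # in proportion to the current t(f | e).
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / norm
                    count[(f, e)] += c
                    total[e] += c
        # M-step: re-estimate t(f | e) from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return dict(t)

# Toy parallel corpus (German-English): EM learns that "das" translates "the"
# because it co-occurs with "the" in both sentence pairs.
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
]
t = train_ibm_model1(corpus)
print(round(t[("das", "the")], 2), round(t[("haus", "the")], 2))  # "das" dominates t(. | "the")
```

The higher-numbered models keep these lexical translation probabilities and layer alignment, fertility, and distortion components on top of them.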
Brown, P.; Cocke, J.; Della Pietra, S.; Della Pietra, V.; Jelinek, F.; Mercer, R.; Roossin, P. (1988). "A Statistical Approach to Language Translation". Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics.
Brown, Peter F.; Cocke, John; Della Pietra, Stephen A.; Della Pietra, Vincent J.; Jelinek, Fredrick; Lafferty, John D.; Mercer, Robert L.; Roossin, Paul S. (1990). "A Statistical Approach to Machine Translation". Computational Linguistics. 16 (2): 79–85.