Modified discrete cosine transform

The modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries. As a result of these advantages, the MDCT is the most widely used lossy compression technique in audio data compression. It is employed in most modern audio coding standards, including MP3, Dolby Digital (AC-3), Vorbis (Ogg), Windows Media Audio (WMA), ATRAC, Cook, Advanced Audio Coding (AAC),[1] High-Definition Coding (HDC),[2] LDAC, Dolby AC-4,[3] and MPEG-H 3D Audio,[4] as well as speech coding standards such as AAC-LD (LD-MDCT),[5] G.722.1,[6] G.729.1,[7] CELT,[8] and Opus.[9][10]

The discrete cosine transform (DCT) was first proposed by Nasir Ahmed in 1972,[11] and demonstrated by Ahmed with T. Natarajan and K. R. Rao in 1974.[12] The MDCT was later proposed by John P. Princen, A.W. Johnson and Alan B. Bradley at the University of Surrey in 1987,[13] following earlier work by Princen and Bradley (1986)[14] to develop the MDCT's underlying principle of time-domain aliasing cancellation (TDAC), described below. (There also exists an analogous transform, the MDST, based on the discrete sine transform, as well as other, rarely used, forms of the MDCT based on different types of DCT or DCT/DST combinations.)

In MP3, the MDCT is not applied to the audio signal directly, but rather to the output of a 32-band polyphase quadrature filter (PQF) bank. The output of this MDCT is postprocessed by an alias reduction formula to reduce the typical aliasing of the PQF filter bank. Such a combination of a filter bank with an MDCT is called a hybrid filter bank or a subband MDCT. AAC, on the other hand, normally uses a pure MDCT; only the (rarely used) MPEG-4 AAC-SSR variant (by Sony) uses a four-band PQF bank followed by an MDCT. Similar to MP3, ATRAC uses stacked quadrature mirror filters (QMF) followed by an MDCT.

  1. ^ Luo, Fa-Long (2008). Mobile Multimedia Broadcasting Standards: Technology and Practice. Springer Science & Business Media. p. 590. ISBN 9780387782638.
  2. ^ Jones, Graham A.; Layer, David H.; Osenkowsky, Thomas G. (2013). National Association of Broadcasters Engineering Handbook: NAB Engineering Handbook. Taylor & Francis. pp. 558–9. ISBN 978-1-136-03410-7.
  3. ^ "Dolby AC-4: Audio Delivery for Next-Generation Entertainment Services" (PDF). Dolby Laboratories. June 2015. Retrieved 11 November 2019.
  4. ^ Bleidt, R. L.; Sen, D.; Niedermeier, A.; Czelhan, B.; Füg, S.; et al. (2017). "Development of the MPEG-H TV Audio System for ATSC 3.0" (PDF). IEEE Transactions on Broadcasting. 63 (1): 202–236. doi:10.1109/TBC.2017.2661258. S2CID 30821673.
  5. ^ Schnell, Markus; Schmidt, Markus; Jander, Manuel; Albert, Tobias; Geiger, Ralf; Ruoppila, Vesa; Ekstrand, Per; Bernhard, Grill (October 2008). MPEG-4 Enhanced Low Delay AAC - A New Standard for High Quality Communication (PDF). 125th AES Convention. Fraunhofer IIS. Audio Engineering Society. Retrieved 20 October 2019.
  6. ^ Lutzky, Manfred; Schuller, Gerald; Gayer, Marc; Krämer, Ulrich; Wabnik, Stefan (May 2004). A guideline to audio codec delay (PDF). 116th AES Convention. Fraunhofer IIS. Audio Engineering Society. Retrieved 24 October 2019.
  7. ^ Nagireddi, Sivannarayana (2008). VoIP Voice and Fax Signal Processing. John Wiley & Sons. p. 69. ISBN 9780470377864.
  8. ^ Presentation of the CELT codec Archived 2011-08-07 at the Wayback Machine by Timothy B. Terriberry (65 minutes of video, see also presentation slides Archived 2023-11-16 at the Wayback Machine in PDF)
  9. ^ "Opus Codec". Opus (Home page). Xiph.org Foundation. Retrieved July 31, 2012.
  10. ^ Bright, Peter (2012-09-12). "Newly standardized Opus audio codec fills every role from online chat to music". Ars Technica. Retrieved 2014-05-28.
  11. ^ Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform" (PDF). Digital Signal Processing. 1 (1): 4–5. doi:10.1016/1051-2004(91)90086-Z.
  12. ^ Ahmed, Nasir; Natarajan, T.; Rao, K. R. (January 1974), "Discrete Cosine Transform", IEEE Transactions on Computers, C-23 (1): 90–93, doi:10.1109/T-C.1974.223784, S2CID 149806273
  13. ^ Princen, John P.; Johnson, A.W.; Bradley, Alan B. (1987). "Subband/Transform coding using filter bank designs based on time domain aliasing cancellation". ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 12. pp. 2161–2164. doi:10.1109/ICASSP.1987.1169405. S2CID 58446992.
  14. ^ John P. Princen, Alan B. Bradley: Analysis/synthesis filter bank design based on time domain aliasing cancellation, IEEE Trans. Acoust. Speech Signal Processing, ASSP-34 (5), 1153–1161, 1986. Described a precursor to the MDCT using a combination of discrete cosine and sine transforms.

© MMXXIII Rich X Search. We shall prevail. All rights reserved. Rich X Search