AI accelerator

An AI accelerator, deep learning processor, or neural processing unit (NPU) is a class of specialized hardware accelerator^[1] or computer system^[2]^[3] designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and machine vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks.^[4] They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024^[update], a typical AI integrated circuit chip contains tens of billions of MOSFET transistors.^[5]

AI accelerators are used in mobile devices, such as neural processing units (NPUs) in Apple iPhones^[6] or Huawei cellphones,^[7] and personal computers such as Apple silicon Macs, to cloud computing servers such as tensor processing units (TPU) in the Google Cloud Platform.^[8] A number of vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design.

Graphics processing units designed by companies such as Nvidia and AMD often include AI-specific hardware, and are commonly used as AI accelerators, both for training and inference.^[9]

^ "Intel unveils Movidius Compute Stick USB AI Accelerator". July 21, 2017. Archived from the original on August 11, 2017. Retrieved August 11, 2017.
^ "Inspurs unveils GX4 AI Accelerator". June 21, 2017.
^ Wiggers, Kyle (November 6, 2019) [2019], Neural Magic raises $15 million to boost AI inferencing speed on off-the-shelf processors, archived from the original on March 6, 2020, retrieved March 14, 2020
^ "Google Designing AI Processors". Google using its own AI accelerators.
^ Moss, Sebastian (March 23, 2022). "Nvidia reveals new Hopper H100 GPU, with 80 billion transistors". Data Center Dynamics. Retrieved January 30, 2024.
^ "Deploying Transformers on the Apple Neural Engine". Apple Machine Learning Research. Retrieved August 24, 2023.
^ "HUAWEI Reveals the Future of Mobile AI at IFA".
^ Jouppi, Norman P.; et al. (June 24, 2017). "In-Datacenter Performance Analysis of a Tensor Processing Unit". ACM SIGARCH Computer Architecture News. 45 (2): 1–12. arXiv:1704.04760. doi:10.1145/3140659.3080246.
^ Patel, Dylan; Nishball, Daniel; Xie, Myron (November 9, 2023). "Nvidia's New China AI Chips Circumvent US Restrictions". SemiAnalysis. Retrieved February 7, 2024.