Processing Units - CPU, GPU, APU, TPU, VPU, FPGA, QPU



Processing Units - the Heart of AI

Central Processing Unit (CPU), Graphics Processing Unit (GPU), Associative Processing Unit (APU), Tensor Processing Unit (TPU), Field Programmable Gate Array (FPGA), Vision Processing Unit (VPU), and Quantum Processing Unit (QPU)

GPU - Graphics Processing Unit


APU - Associative Processing Unit


TPU - Tensor Processing Unit / AI Chip


Google Tensor: the chip has four Cortex-A55 small cores, plus two Arm Cortex-X1 CPUs at 2.8 GHz to handle foreground processing duties. For the "medium" cores, it uses two 2.25 GHz Cortex-A76 CPUs. (That's the A76, not the A78 everyone else is using; these A76s were last year's "big" CPU cores.) The "Google Silicon" team gives us a tour of the Pixel 6's Tensor SoC | Ron Amadeo - Ars Technica
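The tri-cluster layout described above can be sketched as plain data; this is an illustrative model, not an official specification, and the A55 clock speed is left unset because the text does not give it:

```python
# Illustrative sketch of the Google Tensor CPU clusters described above.
# Core names, counts, and clocks come from the text; the structure is hypothetical.
tensor_cpu_clusters = [
    {"role": "big",    "core": "Cortex-X1",  "count": 2, "ghz": 2.80},
    {"role": "medium", "core": "Cortex-A76", "count": 2, "ghz": 2.25},
    {"role": "small",  "core": "Cortex-A55", "count": 4, "ghz": None},  # clock not stated in the text
]

total_cores = sum(c["count"] for c in tensor_cpu_clusters)
print(total_cores)  # 8 cores in a 2+2+4 tri-cluster arrangement
```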

FPGA - Field Programmable Gate Array


VPU - Vision Processing Unit


Neuromorphic Chip

QPU - Quantum Processing Unit




Cerebras Wafer-Scale Engine (WSE)

The Cerebras Wafer-Scale Engine (WSE) is the largest chip ever built and the heart of Cerebras's deep learning system. At 56× the size of any other chip, the WSE delivers more compute, more memory, and more communication bandwidth, enabling AI research at previously impossible speed and scale.


Summit (supercomputer)


Summit, or OLCF-4, is a supercomputer developed by IBM for use at Oak Ridge National Laboratory which, as of November 2018, is the fastest supercomputer in the world, capable of 200 petaflops. Its LINPACK benchmark result is 148.6 petaflops. As of November 2018, it is also the third most energy-efficient supercomputer in the world, with a measured power efficiency of 14.668 gigaflops/watt. Summit is the first supercomputer to reach exaop (exa operations per second) speed, achieving 1.88 exaops during a genomic analysis, and is expected to reach 3.3 exaops using mixed-precision calculations.

Design: Each of its 4,608 nodes (9,216 IBM POWER9 CPUs and 27,648 Nvidia Tesla GPUs in total) has over 600 GB of coherent memory (6×16 = 96 GB HBM2 plus 2×8×32 = 512 GB DDR4 SDRAM) addressable by all CPUs and GPUs, plus 800 GB of non-volatile RAM that can be used as a burst buffer or as extended memory. The POWER9 CPUs and Volta GPUs are connected using Nvidia's high-speed NVLink, which allows for a heterogeneous computing model.[14] To provide a high rate of data throughput, the nodes are connected in a non-blocking fat-tree topology using a dual-rail Mellanox EDR InfiniBand interconnect for both storage and inter-process communications traffic, which delivers 200 Gb/s bandwidth between nodes and in-network computing acceleration for communications frameworks such as MPI and SHMEM/PGAS. Summit (supercomputer) | Wikipedia
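The per-node figures quoted above can be sanity-checked with a few lines of arithmetic; this is just a sketch verifying that the numbers in the text are internally consistent:

```python
# Sanity check of the Summit per-node figures quoted above.
nodes = 4608
cpus, gpus = 9216, 27648
hbm2_gb = 6 * 16          # 6 GPUs x 16 GB HBM2 = 96 GB
ddr4_gb = 2 * 8 * 32      # 2 CPUs x 8 DIMMs x 32 GB = 512 GB
coherent_gb = hbm2_gb + ddr4_gb

print(cpus // nodes, gpus // nodes)  # 2 POWER9 CPUs and 6 Volta GPUs per node
print(coherent_gb)                   # 608 GB coherent memory per node ("over 600 GB")
```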

Photonic Integrated Circuit (PIC)


Lightmatter is creating...

  • Envise: the world's first AI accelerator running on photonic cores (computing via light). The platform improves latency (128² MVPs, i.e. a full 128×128 matrix-vector product, in a single 2.5 GHz clock cycle), reduces total cost of ownership, and reduces power consumption compared to traditional GPUs. The Envise 4S packs 16 Envise chips into a 4U server configuration with only 3 kW power consumption, and is the building block for a rack-scale Envise inference system that can run the largest neural networks developed to date at unprecedented performance: 3 times higher IPS than the Nvidia DGX-A100, with 8 times the IPS/W on BERT-Base SQuAD. Other features: massive on-chip activation and weight storage, enabling state-of-the-art neural network execution without leaving the processor; standards-based host and interconnect interfaces (revolutionary compute, standard communications); RISC cores per Envise processor for generic off-load capabilities; an ultra-high-performance out-of-order superscalar processing architecture; deployment-grade reliability, availability, and serviceability (next-generation compute with the reliability of standard electronics); and a 400 Gbps Lightmatter interconnect fabric per Envise chip, enabling large-model scale-out.
  • IDIOM: the company's Software Development Kit, which supports common machine learning frameworks such as PyTorch and TensorFlow.
  • Passage: the world's first switchable optical interconnect platform, which unlocks optical speeds, system integration, dynamic workloads, and reduced power consumption. Features: reduce the carbon footprint and operating cost of your datacenter while powering the most advanced neural networks (and the next generation) with a fundamentally new, powerful, and efficient computing platform: photonics. Photonics enables multiple operations within the same area.
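A back-of-envelope estimate puts the Envise latency claim in context. Assuming the claim means one full 128×128 matrix-vector product completes per 2.5 GHz clock cycle (an interpretation, not a Lightmatter figure), the implied throughput per core is:

```python
# Back-of-envelope throughput implied by the Envise claim above:
# one 128x128 matrix-vector product per 2.5 GHz clock cycle.
# Illustrative estimate only; not an official Lightmatter number.
dim = 128
clock_hz = 2.5e9
macs_per_cycle = dim * dim                # 16,384 multiply-accumulates per MVP
macs_per_sec = macs_per_cycle * clock_hz
tops = 2 * macs_per_sec / 1e12            # counting each MAC as 2 ops

print(round(tops, 2))  # 81.92 TOPS per core under these assumptions
```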