AI chip and compute terms, in plain English
This glossary gives short definitions and context for the vocabulary around AI infrastructure: GPUs, accelerators, HBM, CoWoS, CUDA, vLLM, SGLang, MoE, inference, post-training, and model serving.
- Hardware terms: HBM, CoWoS, interconnect, racks, accelerators, memory bandwidth, and packaging.
- Software terms: CUDA, kernels, PyTorch, JAX, TensorRT-LLM, vLLM, SGLang, and serving engines.
- Model terms: transformers, MoE, tokenisers, RLHF, DPO, GRPO, context windows, and KV cache (sketched in code after this list).
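
Of these, the KV cache is the easiest to see in miniature. Below is a toy, hypothetical sketch of the idea, not any library's API: during autoregressive decoding, each new token's attention keys and values are appended to a cache and reused at every later step instead of being recomputed for the whole prefix. The dimension `d = 4` and the random vectors are illustrative stand-ins for a real model's projections.

```python
# Toy KV cache: each decoding step appends the new token's key/value
# vectors and attends over everything cached so far, instead of
# recomputing K and V for the entire prefix at every step.
# All vectors are random stand-ins, not a real model's projections.
import numpy as np

rng = np.random.default_rng(0)
d = 4                              # hypothetical per-head dimension
kv_cache = {"k": [], "v": []}      # grows by one entry per generated token

def decode_step():
    """One decoding step: append this token's K/V, then attend over the cache."""
    kv_cache["k"].append(rng.normal(size=d))   # stand-in for W_k @ x
    kv_cache["v"].append(rng.normal(size=d))   # stand-in for W_v @ x
    K, V = np.stack(kv_cache["k"]), np.stack(kv_cache["v"])
    q = rng.normal(size=d)                     # stand-in for W_q @ x
    scores = K @ q / np.sqrt(d)                # one score per cached key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over the cache
    return weights @ V                         # attention output for this step

for step in range(4):
    decode_step()
    print(f"step {step}: cache holds {len(kv_cache['k'])} key/value pairs")
```

The cache trades memory for compute: it is why long context windows are expensive to serve, and why serving engines such as vLLM and SGLang put so much engineering into managing it.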