AI chip and compute terms, in plain English
This glossary gives short definitions and context for the vocabulary around AI infrastructure: GPUs, accelerators, HBM, CoWoS, CUDA, vLLM, SGLang, MoE, inference, post-training, and model serving.
- Hardware terms: HBM, CoWoS, interconnect, racks, accelerators, memory bandwidth, and packaging.
- Software terms: CUDA, kernels, PyTorch, JAX, TensorRT-LLM, vLLM, SGLang, and serving engines.
- Model terms: transformers, MoE, tokenisers, RLHF, DPO, GRPO, context windows, and KV cache (sketched in code after this list).
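
Of these, the KV cache is the easiest to see in miniature. Below is a toy, hypothetical sketch of the idea, not any library's API: during autoregressive decoding, each new token's attention keys and values are appended to a cache and reused at every later step instead of being recomputed for the whole prefix. The dimension `d = 4` and the random vectors are illustrative stand-ins for a real model's projections.

```python
# Toy KV cache: each decoding step appends the new token's key/value
# vectors and attends over everything cached so far, instead of
# recomputing K and V for the entire prefix at every step.
# All vectors are random stand-ins, not a real model's projections.
import numpy as np

rng = np.random.default_rng(0)
d = 4                              # hypothetical per-head dimension
kv_cache = {"k": [], "v": []}      # grows by one entry per generated token

def decode_step():
    """One decoding step: append this token's K/V, then attend over the cache."""
    kv_cache["k"].append(rng.normal(size=d))   # stand-in for W_k @ x
    kv_cache["v"].append(rng.normal(size=d))   # stand-in for W_v @ x
    K, V = np.stack(kv_cache["k"]), np.stack(kv_cache["v"])
    q = rng.normal(size=d)                     # stand-in for W_q @ x
    scores = K @ q / np.sqrt(d)                # one score per cached key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over the cache
    return weights @ V                         # attention output for this step

for step in range(4):
    decode_step()
    print(f"step {step}: cache holds {len(kv_cache['k'])} key/value pairs")
```

The cache trades memory for compute: it is why long context windows are expensive to serve, and why serving engines such as vLLM and SGLang put so much engineering into managing it.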