Polar Sparsity: High-Throughput Batched LLM Inferencing

Accelerating large language model (LLM) inference with scalable contextual sparsity, achieving up to 2.2x speedups.
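
To make the central idea concrete, below is a minimal sketch of contextual (activation) sparsity in a transformer MLP block: a lightweight predictor scores which feed-forward neurons are likely to activate for the current input, and only those rows and columns of the weight matrices are computed. All class names, shapes, and the predictor design here are illustrative assumptions for exposition, not this project's actual implementation or kernels.

```python
# Illustrative sketch of contextual sparsity in an MLP block.
# Names and shapes are hypothetical, not from the Polar Sparsity codebase.
import torch
import torch.nn as nn


class SparseMLP(nn.Module):
    """MLP block that computes only the top-k neurons predicted to be active."""

    def __init__(self, d_model: int = 1024, d_ff: int = 4096, k: int = 512):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)    # W_up: d_model -> d_ff
        self.down = nn.Linear(d_ff, d_model)  # W_down: d_ff -> d_model
        # Cheap predictor that scores which feed-forward neurons will fire.
        self.predictor = nn.Linear(d_model, d_ff, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_model). Score all d_ff neurons cheaply, keep top-k.
        scores = self.predictor(x)                 # (batch, d_ff)
        idx = scores.topk(self.k, dim=-1).indices  # (batch, k)
        # Gather only the selected rows/columns of the weight matrices.
        # (Looped per example for clarity; a real kernel would batch this.)
        outs = []
        for i in range(x.size(0)):
            w_up = self.up.weight[idx[i]]               # (k, d_model)
            b_up = self.up.bias[idx[i]]                 # (k,)
            h = torch.relu(x[i] @ w_up.T + b_up)        # (k,)
            w_down = self.down.weight[:, idx[i]]        # (d_model, k)
            outs.append(h @ w_down.T + self.down.bias)  # (d_model,)
        return torch.stack(outs)


mlp = SparseMLP()
y = mlp(torch.randn(4, 1024))
print(y.shape)  # torch.Size([4, 1024])
```

With k = 512 of 4096 neurons, the block touches roughly 1/8 of the MLP weights per token; the speedup realized in practice depends on predictor accuracy, batch size, and how efficiently the sparse gather maps onto the hardware.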