Susav Shrestha

I am a PhD candidate in Computer Engineering at Texas A&M University, advised by Dr. Narasimha Reddy. My work bridges the gap between algorithm design and system-level efficiency, with a focus on accelerating large-scale inference.

In particular, I aim to make machine learning systems more efficient and accessible through sparsity, hardware-aware design, and distributed inference.


Research Interests

🚀

Efficient & Sparse LLM Inference

Optimizing large language models for high-throughput deployment

⚡

HPC and Distributed Systems

Building scalable, distributed architectures for efficient multi-GPU and multi-node LLM serving

Experience

2025

Research Intern · NVIDIA

Santa Clara, CA · May - Aug 2025

2024

Research Intern · NVIDIA

Austin, TX · May - Aug 2024

2022

Research Intern · Samsung Semiconductor

San Jose, CA · May - Aug 2022

Recent Updates

2025

📄 Polar Sparsity accepted at NeurIPS 2025