Curriculum Vitae

Education

Texas A&M University

College Station, TX
Aug 2021 – Feb 2026

Doctor of Philosophy in Computer Engineering

Thesis: Efficient LLM Inference: Sparsity, Parallelism, and Hardware-Aware System Design

Advisor: Dr. Narasimha Reddy

University of Texas Arlington

Arlington, TX
Aug 2017 – May 2021

Bachelor of Science in Electrical Engineering with Honors

Minor in Computer Science

Honors Thesis: A novel remote sensing system for in-situ measurement of subsurface soil properties

Professional Experience

NVIDIA

Senior AI and HPC Engineer
Feb 2026 – Present
  • Working on MoE inference efficiency at scale, optimizing throughput and latency for agentic workloads.

NVIDIA

Research Intern
May 2025 – Aug 2025

Santa Clara, CA

  • Designed a novel model-parallelism architecture for distributed multi-node, multi-GPU LLM inference
  • Developed the parallelism architecture and integrated it into vLLM

NVIDIA

Research Intern
May 2024 – Aug 2024

Austin, TX

  • Led research to accelerate LLM inference via activation and contextual sparsity
  • Built sparsely activated OPT and LLaMA models by training activation routers for MLP and Attention layers
  • Developed custom sparse GPU kernels achieving 1.5–3× speedup in MLP layers and up to 2.5× in Attention
  • Delivered end-to-end decoding speedups up to 2.2× across diverse batch sizes and sequence lengths

Samsung Semiconductor Inc.

Research Intern
May 2022 – Aug 2022

San Jose, CA

  • Reduced CPU workload by 4× and accelerated neural inference by 64% through data pipeline and model execution optimization
  • Filed 2 patent applications for efficient neural information retrieval

Texas A&M University

Graduate Research Assistant
Aug 2021 – Feb 2026

College Station, TX

  • Led a project that resulted in a 23% speedup in embedding processing for large-scale information retrieval
  • Developed scalable multi-vector retrieval systems for large language models
  • Published research at top-tier venues including NeurIPS and ISMM

Patents

System and Method for Embeddings Retrieval

US Patent Application US20240330193A1 • Samsung Electronics Co., Ltd.

System and Method for Processing Embeddings

US Patent Application US20240330290A1 • Samsung Electronics Co., Ltd.

Technical Skills

Systems & Performance

CUDA C/C++, Parallel Computing, GPU Optimization, OpenMP, MPI

Machine Learning

PyTorch, Deep Learning, LLM Inference, Model Optimization, Scikit-learn

Development & Tools

Python, Linux, Git, Docker, AWS, Azure

Specialized

Information Retrieval, Distributed Systems, HPC, Sparsity, Memory Systems

Honors & Awards

Innovation Day Award 2021
Dean's List 2018–2021
Academic Excellence Award 2016
Cambridge Learners Award 2015