Curriculum Vitae

🎓 Education

Texas A&M University

📍 College Station, TX
Aug 2021 - Est. May 2025

Doctor of Philosophy in Computer Engineering

Thesis: Hardware Efficient ML System Design in Knowledge Retrieval

Advisor: Dr. Narasimha Annapareddy Reddy

University of Texas Arlington

📍 Arlington, TX
Aug 2017 - May 2021

Bachelor of Science in Electrical Engineering with Honors

Minor in Computer Science

Honors Thesis: A novel remote sensing system for in-situ measurement of subsurface soil properties

💼 Professional Experience

NVIDIA

Research Intern
May 2025 - Aug 2025

📍 Santa Clara, CA

  • Designed a novel model-parallelism architecture for distributed multi-node, multi-GPU LLM inference
  • Implemented the architecture and integrated it into vLLM

NVIDIA

Research Intern
May 2024 - Aug 2024

📍 Austin, TX

  • Led research to accelerate LLM inference via activation and contextual sparsity
  • Built sparsely activated OPT and LLaMA models by training activation routers for MLP and Attention layers
  • Developed custom sparse GPU kernels achieving 1.5–3× speedup in MLP layers and up to 2.5× in Attention
  • Delivered end-to-end decoding speedups up to 2.2× across diverse batch sizes and sequence lengths

Samsung Semiconductor Inc.

Research Intern
May 2022 - Aug 2022

📍 San Jose, CA

  • Accelerated neural inference by 64% and reduced CPU workload through data-pipeline and model-execution optimization
  • Filed two patent applications on efficient neural information retrieval

Texas A&M University

Graduate Research Assistant
Aug 2021 - Present

📍 College Station, TX

  • Led a project that resulted in a 23% speedup in embedding processing for large-scale information retrieval
  • Developed scalable multi-vector retrieval systems for large language models
  • Published research at top-tier venues including NeurIPS and ISMM

🔬 Patents

System and Method for Embeddings Retrieval

US Patent Application US20240330193A1 • Samsung Electronics Co., Ltd.

System and Method for Processing Embeddings

US Patent Application US20240330290A1 • Samsung Electronics Co., Ltd.

Technical Skills

Systems & Performance

CUDA • C/C++ • Parallel Computing • GPU Optimization • OpenMP • MPI

Machine Learning

PyTorch • Deep Learning • LLM Inference • Model Optimization • Scikit-learn

Development & Tools

Python • Linux • Git • Docker • AWS • Azure

Specialized

Information Retrieval • Distributed Systems • HPC • Sparsity • Memory Systems

🏆 Honors & Awards

  • Innovation Day Award, 2021
  • Dean's List, 2018-2021
  • Academic Excellence Award, 2016
  • Cambridge Learners Award, 2015