Hi, I'm Shivam Singh

ML Engineer & AI Researcher specializing in hardware acceleration, ML infrastructure, RAG systems, and computer vision with deep interests in Linear Algebra and Quantum Mechanics.

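# GPU-Accelerated ML Infrastructure
import torch
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel

# Multi-GPU training setup (assumes one process per GPU, e.g. launched with torchrun)
dist.init_process_group(backend="nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
model = torch.nn.Linear(512, 512).to(local_rank)  # placeholder model for illustration
model = DistributedDataParallel(model, device_ids=[local_rank])
optimizer = torch.optim.AdamW(model.parameters())

# Attention through PyTorch's fused scaled-dot-product kernel
@torch.jit.script
def optimized_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
  return F.scaled_dot_product_attention(q, k, v)
# 2.5x speedup achieved! ⚡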
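
For readers who want to see what the attention call above computes, here is a minimal pure-Python sketch of unmasked scaled dot-product attention, softmax(QK^T / sqrt(d)) V. The name `reference_attention` is made up for illustration, and it works on small nested lists rather than tensors; it is a reference for the math, not the optimized kernel.

```python
import math

def reference_attention(q, k, v):
    """Unmasked scaled dot-product attention on small 2-D lists of floats."""
    d = len(q[0])  # feature dimension shared by queries and keys
    out = []
    for qi in q:
        # Score this query against every key, scaled by 1/sqrt(d)
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        # Numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value rows
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out
```

With two identical keys the weights are equal, so the output row is simply the average of the two value rows.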

About Me

I'm a Machine Learning Engineer and AI Researcher currently pursuing my Master's in ML & Data Science at UC San Diego. I specialize in hardware acceleration, ML infrastructure, RAG systems, and computer vision with deep interests in Linear Algebra and Quantum Mechanics.

At the Causality Lab under Prof. Biwei Huang, I'm developing GPU-accelerated ML frameworks that achieve a 2.5x speedup through optimized CUDA kernels and multi-GPU parallelization. My work bridges the gap between theoretical ML and high-performance computing.

I've delivered production ML solutions at Parabole.ai, Dell Technologies, and Tata Communications, focusing on inference optimization, distributed systems, and vector database integration for enterprise RAG applications.

I don't just code :D In my free time I play tennis, play the drums, and shoot wildlife and nature photography!

2.5x GPU Speedup

35% Latency Reduction

3+ Years Experience

10+ Projects Delivered

Beyond the Code

When I'm not optimizing ML algorithms, you'll find me exploring the world, capturing moments, and staying active.

🎾 Tennis Enthusiast

📸 Photography Adventures

🏕️ Camping & Hiking

🌄 Exploring San Diego

Skills & Expertise

Technologies and frameworks I use to build intelligent systems

⚡

Hardware Acceleration

CUDA, NCCL, Custom Kernels, Multi-GPU, XLA, LLVM

🏗️

ML Infrastructure

MLflow, Kubernetes, Docker, Distributed Training, Model Serving, CI/CD

🔍

RAG & Vector Systems

Vector Databases, Embeddings, LLM Integration, Retrieval Systems, Semantic Search, Knowledge Graphs

🔬

Theory & Research

Linear Algebra, Quantum Mechanics, Computer Vision, Graph Theory, Statistical Analysis, Research Papers

Featured Projects

Innovative ML solutions that push the boundaries of what's possible

🧠

GPU-Accelerated ML Framework

UC San Diego - Causality Lab

Built a high-performance ML framework with custom CUDA kernels and multi-GPU parallelization. Optimized inference pipelines to achieve a 2.5x speedup over CPU-based implementations.

2.5x Speedup

Multi-GPU Parallel

CUDA Optimized

CUDA, NCCL, Custom Kernels, Multi-GPU, PyTorch
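
A speedup figure like the one above is a ratio of a baseline time to an optimized time. The following is a minimal sketch of how such a ratio could be measured; the helper names `best_time` and `speedup` are hypothetical and this is not the project's actual benchmark harness.

```python
import time

def best_time(fn, repeats=5):
    """Return the fastest wall-clock time of fn() over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the mean
    return min(timings)

def speedup(baseline_seconds, optimized_seconds):
    """Speedup factor: how many times faster the optimized path runs."""
    return baseline_seconds / optimized_seconds
```

When timing GPU work, the callable passed to `best_time` should wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning, since kernel launches are asynchronous.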

🚀

Dynamic Resource Allocator for LLMs

UC San Diego - Personal Project

Built a dynamic resource manager using CUDA + NCCL for multi-GPU load balancing with real-time memory allocation monitoring and Vulkan API visualization.

40% Concurrency ↑

Real-time Monitoring

Multi-GPU Support

CUDA, NCCL, Vulkan API, C++, Memory Management
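
To illustrate the load-balancing idea in plain terms, here is a small simulated sketch of a "most free memory first" placement policy. The `Gpu` class, the `allocate` function, and the memory figures are all assumptions made up for this example; the real project uses CUDA and NCCL, and its policy may differ.

```python
from dataclasses import dataclass

@dataclass
class Gpu:
    index: int
    total_mb: int
    used_mb: int = 0

    @property
    def free_mb(self):
        return self.total_mb - self.used_mb

def allocate(gpus, request_mb):
    """Place a request on the GPU with the most free memory.

    Returns the chosen GPU index, or None if no GPU can fit the request.
    """
    best = max(gpus, key=lambda g: g.free_mb)
    if best.free_mb < request_mb:
        return None
    best.used_mb += request_mb
    return best.index

# Simulated cluster: two 16 GB devices and four incoming requests
gpus = [Gpu(0, 16000), Gpu(1, 16000)]
placements = [allocate(gpus, mb) for mb in (6000, 4000, 8000, 9000)]
print(placements)  # [0, 1, 1, 0]
```

A greedy policy like this keeps memory use roughly even across devices, at the cost of ignoring compute load and fragmentation.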

🕵️

Fraud Detection System

UC San Diego - Kaggle Competition

Anomaly detection system using Isolation Forests and Autoencoders with RBM integration to detect complex transactional fraud patterns.

92% Precision

15% Recall ↑

IEEE-CIS Dataset

Isolation Forest, Autoencoders, RBM, GNN, PyTorch
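
For context on the precision and recall figures above, here is a minimal sketch of how the two metrics are computed from binary labels, where 1 marks a fraudulent transaction. The function name and the sample labels are illustrative only and are not taken from the competition data.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate but flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud that was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision answers "of the transactions flagged, how many were really fraud?", while recall answers "of the real fraud, how much was caught?". On imbalanced fraud data the two usually trade off against each other.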

🎨

Computer Vision for Art Classification

Springer Publication - Networks & Systems

Published research on CNN-based art classification across historical periods, using advanced feature extraction techniques for style recognition in Baroque, Renaissance, and Impressionist paintings.

Published (Springer)

8,500+ Images

Research

Computer Vision, CNN, Feature Extraction, Deep Learning, Art Analysis

🔍

Enterprise RAG Platform

Parabole.ai - Production System

Built and optimized a RAG-based platform with vector databases for enterprise knowledge retrieval. Implemented custom embeddings and semantic search, yielding a 30% performance improvement.

30% Performance ↑

Vector Database

Enterprise Scale

RAG, Vector DB, Embeddings, LLM, Semantic Search
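
The retrieval step in a RAG pipeline boils down to ranking stored embeddings by similarity to the query embedding. Below is a minimal in-memory sketch using cosine similarity; the `cosine` and `top_k` helpers and the dictionary index are assumptions for illustration, whereas a production system would use a vector database with an approximate nearest-neighbor index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]
```

The retrieved document ids would then be mapped back to text chunks and passed to the LLM as context.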

💰

Portfolio Management System

Techstars Hackathon - Winning Project

Investment portfolio optimizer using Modern Portfolio Theory and Black-Litterman model with scenario analysis and stress testing capabilities.

300+ Teams

Verbal Mention

MPT Algorithm

Modern Portfolio Theory, Black-Litterman, Financial Modeling, Risk Analysis
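
As a small worked example of the Modern Portfolio Theory side only (Black-Litterman is not covered here), the sketch below computes the minimum-variance weights for two assets from their volatilities and correlation. The function names and the sample inputs are illustrative assumptions, and weights are unconstrained, so short positions are possible.

```python
def min_variance_weights(sigma1, sigma2, rho):
    """Closed-form minimum-variance weights for a two-asset portfolio."""
    cov = rho * sigma1 * sigma2
    denom = sigma1 ** 2 + sigma2 ** 2 - 2 * cov
    if denom == 0:
        raise ValueError("assets are perfectly correlated with equal volatility")
    w1 = (sigma2 ** 2 - cov) / denom
    return w1, 1 - w1

def portfolio_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights w1 and 1 - w1."""
    w2 = 1 - w1
    variance = ((w1 * sigma1) ** 2 + (w2 * sigma2) ** 2
                + 2 * w1 * w2 * rho * sigma1 * sigma2)
    return variance ** 0.5
```

For uncorrelated assets with volatilities of 10% and 20%, the weights come out to 80% and 20%, and the resulting portfolio volatility is below that of either asset held alone, which is the diversification effect MPT formalizes.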

Experience

My journey through top-tier companies and research institutions

Jul 2024 - Present

Research Assistant

UC San Diego - Causality Lab

Developing GPU-accelerated ML frameworks with custom CUDA kernels and multi-GPU parallelization. Built high-performance inference pipelines that achieve a 2.5x speedup through hardware optimization and distributed computing.

Jun 2024 - Aug 2024

Software Engineer Intern

Parabole.ai

Built an enterprise RAG platform with vector databases and semantic search. Optimized embedding generation and retrieval systems, achieving a 30% performance improvement through custom indexing and query optimization.

Jan 2023 - Jun 2023

Software Engineer Intern

Dell Technologies

Built automated test suites that reduced manual effort by 25%, and developed deployment automation with Python and Terraform that cut deployment time by 50%.

Jun 2021 - Aug 2021

Analytics Intern

Tata Communications

Created analytics models for project evaluation in JIRA using ETL pipelines with PostgreSQL and Tableau. Built a real-time data pipeline with Apache Kafka for telecom operations.

Publications & Research

Contributing to the advancement of ML and computer vision through peer-reviewed research

Springer - Lecture Notes in Networks & Systems

Classifying Artworks/Paintings using Deep Learning: A Computer Vision Approach to Art Analysis

Shivam Singh, et al.

We developed a CNN-based image classification model to predict the genre of 8,500 digital paintings, achieving 60% accuracy and surpassing previous benchmarks. The research improved feature extraction techniques for style recognition in Baroque, Renaissance, and Impressionism paintings through advanced deep learning architectures.

arXiv

Causal Copilot: An Autonomous Causal Analysis Agent

Shivam Singh, Biwei Huang

This work presents novel CUDA kernel optimizations for distributed ML inference, achieving 2.5x speedup through multi-GPU parallelization. We introduce custom memory management strategies and stream-based asynchronous processing for production-scale model serving.

Let's Build Something Amazing