Built RAG service using FAISS and cross-encoder rerankers; containerized PyTorch models with Triton, ONNX, and TensorRT; autoscaled Kubernetes to reduce monthly compute spend by 23%. Implemented MLflow and Feast for tracking and feature management, versioning 500GB of features and automating 60% of hyperparameter sweeps, cutting time-to-deploy from 14 days to 5. Designed Airflow and Spark ETL pipelines processing 1.2TB daily with data contracts, reducing schema breakages and improving SLA adherence. Established Prometheus and Evidently dashboards, lowering model incidents and improving MTTD/MTTR. Partnered with security to implement privacy checks for NLP models, achieving 60% PII redaction accuracy and passing internal audits on first pass.

Jr. AI/ML Engineer at IBM

July 1, 2023 - October 15, 2025

Designed Spark and Airflow pipelines for recommendations (preparation, candidate generation, ranking, cold-start via embeddings), improving CTR and reducing P95 latency. Redesigned XGBoost fraud detection with calibrated probabilities, increasing AUC from 0.84 to 0.91 and reducing false positives. Tuned search relevance with a lightweight ranker, achieving sub-120ms P95. Led data quality initiatives with Great Expectations, enabling faster remediation and lowering MTTR; built governance dashboards and Java Spring Boot APIs with RBAC and audit trails for 180+ users.

AI/ML Engineer at Deloitte

August 1, 2024 - November 6, 2025

Built retrieval-augmented generation (RAG) service with FAISS and cross-encoder rerankers, improving top-3 recall and reducing average handle time across multiple teams. Containerized PyTorch models with Triton, ONNX, and TensorRT, achieving lower p95 latency while autoscaling Kubernetes to reduce monthly compute spend. Implemented MLflow and Feast for end-to-end tracking and feature management, versioning features and accelerating hyperparameter sweeps. Designed Airflow and Spark ETL pipelines processing 1.2 TB daily with data contracts, improving data quality and SLA adherence. Established Prometheus and Evidently dashboards with alerts, reducing model incidents and MTTR, and collaborated with security to implement privacy checks for NLP models resulting in higher data privacy compliance.

Jr. AI/ML Engineer at IBM

July 1, 2023 - July 1, 2023

Designed Spark and Airflow-based recommendation pipeline (candidate generation, ranking, and cold-start via embeddings), delivering improved CTR and lower latency. Redesigned XGBoost fraud detection with calibrated probabilities, improving AUC and reducing false positives. Tuned search relevance with lightweight rankers, achieving better NDCG@10 while maintaining sub-120ms P95. Established data quality program (Great Expectations) to detect drift and anomalies, reducing incidents. Built React dashboards and Java Spring Boot APIs for governance with RBAC and audit trails; delivered Java microservices with Kafka and Redis for telemetry, optimizing queries and throughput.

AI/ML Engineer at Deloitte

August 1, 2024 - November 12, 2025

Built RAG service using FAISS and cross-encoder rerankers, improving top-3 recall by 15% and reducing average handle time by 28% across seven teams. Containerized PyTorch models with Triton, ONNX, and TensorRT, cutting P95 latency from 220ms to 135ms; autoscaled Kubernetes to reduce monthly compute spend by 23%. Implemented MLflow and Feast for tracking and feature management, versioning 500GB of features and automating 60% of hyperparameter sweeps, reducing time-to-deploy from 14 days to 5. Designed Airflow and Spark ETL pipelines processing 1.2TB daily with Great Expectations data contracts, reducing schema breakages by 60% and improving SLA adherence to 70%. Established Prometheus and Evidently dashboards and alerts, lowering model incidents from seven to two and reducing mean time to detect by 41% and mean time to recovery by 36%. Partnered with security to implement privacy checks for NLP models, achieving 60% PII redaction accuracy and passing internal audit on first pass.

Jr. AI/ML Engineer at IBM

July 1, 2023 - July 1, 2023

Designed Spark and Airflow-based recommendation pipeline covering preparation, candidate generation, ranking, and cold-start via embeddings, improving CTR by 11% and reducing P95 latency to 150ms. Redesigned XGBoost fraud detection with calibrated probabilities and investigator queues, lifting AUC from 0.84 to 0.91, reducing false positives by 32%, and accelerating dispute resolution by 29%. Tuned search relevance with a lightweight ranker, increasing NDCG@10 by 9% while maintaining sub-120ms P95. Established Great Expectations data quality program, reducing incidents from seven to two per month and lowering MTTR by 41%. Built React dashboards and Java Spring Boot APIs for governance with RBAC; served 180+ users and optimized PostgreSQL queries. Delivered Java microservices with Kafka and Redis caching for telemetry, processing 1.8M events/day and reducing query CPU by 35%.