Sneha Reddy G

Experience Level

Expert

Language

Bashkir
Advanced
Javanese
Intermediate
English
Fluent

Work Experience

AI/ML Engineer at Deloitte
August 1, 2024 - Present
Built a retrieval-augmented generation (RAG) service using FAISS and cross-encoder rerankers, improving top-3 recall by 15% and reducing average handle time by 28% across seven teams. Containerized PyTorch models with Triton, ONNX, and TensorRT, cutting P95 latency from 220ms to 135ms; autoscaled Kubernetes to reduce monthly compute spend by 23%. Implemented MLflow and Feast for end-to-end tracking and feature management, versioning 500GB of features and automating 60% of hyperparameter sweeps, which cut time-to-deploy from 14 days to 5. Designed Airflow and Spark ETL pipelines processing 1.2TB daily with Great Expectations data contracts, reducing schema breakages by 60% and improving SLA adherence to 70%. Established Prometheus and Evidently dashboards and alerts, lowering model incidents from seven to two and reducing mean time to detect (MTTD) by 41% and mean time to recover (MTTR) by 36%. Partnered with security to implement privacy checks for NLP models, achieving 60% PII redaction accuracy and passing the internal audit on the first pass.
Jr. AI/ML Engineer at IBM
July 1, 2023 - October 15, 2025
Designed a Spark and Airflow-based recommendation pipeline covering preparation, candidate generation, ranking, and cold-start via embeddings, improving CTR by 11% and reducing P95 latency to 150ms. Redesigned XGBoost fraud detection with calibrated probabilities and investigator queues, lifting AUC from 0.84 to 0.91 and reducing false positives by 32%; accelerated dispute resolution by 29%. Tuned search relevance with a lightweight ranker, increasing NDCG@10 by 9% while maintaining sub-120ms P95 latency. Established a Great Expectations data quality program to detect drift and anomalies, reducing incidents from seven to two per month and lowering MTTR by 41%. Built React dashboards and Java Spring Boot APIs for governance with RBAC and audit trails, serving 180+ users, and optimized PostgreSQL queries. Delivered Java microservices with Kafka and Redis caching for telemetry, processing 1.8M events/day and reducing query CPU by 35%.

Education

Master of Science in Computer Science at University of North Texas
January 11, 2030 - May 1, 2025
Bachelor of Technology in Electronics & Communication Engineering at Gokaraju Rangaraju Institute of Engineering and Technology
January 11, 2030 - September 1, 2020

Qualifications

AWS Certified Machine Learning – Specialty
January 11, 2030 - October 15, 2025
Google Cloud Professional Machine Learning Engineer
January 11, 2030 - October 15, 2025
Certified Kubernetes Administrator (CKA)
January 11, 2030 - October 15, 2025
NVIDIA DLI — Optimizing Inference with TensorRT
January 11, 2030 - October 15, 2025

Industry Experience

Software & Internet, Professional Services, Computers & Electronics