Hi, I'm Siddartha Sodum, a passionate Machine Learning and AI Engineer with over three years of experience building scalable AI/ML systems. I specialize in LLM-powered applications, Retrieval-Augmented Generation pipelines, and semantic search solutions using state-of-the-art models like GPT-4, T5, and BERT. I enjoy deploying real-time, low-latency inference services and implementing robust MLOps workflows to ensure reliability and observability. I thrive on solving complex problems through reinforcement learning, vector search technology, and prompt engineering to create intelligent Q&A systems and analytics dashboards. Whether it's improving model precision, optimizing deployment pipelines, or creating actionable insights for stakeholders, I bring a technical yet approachable mindset to every project.

Siddartha Sodum

Available to hire

Experience Level

Expert

Work Experience

AI Engineer at MetLife, USA
September 1, 2024 - Present
Led development of a GPT-4-powered lapse risk prediction engine using LangChain, RAG, FAISS, and Pinecone, reducing underwriting time by 52% and increasing conversion rates by 17%. Integrated Arize AI, SHAP, and LIME into real-time monitoring for 15 NLP and tabular models, improving reliability and resolving drift issues. Deployed inference services that hold latency under 400 ms at 3K+ QPS using vLLM, CUDA graphs, and Triton Server on Docker, AKS, and NGINX. Built hybrid semantic search pipelines with technologies such as ChromaDB and Azure Cognitive Search, achieving sub-second retrieval with improved precision. Implemented RLHF with multi-armed bandits and Thompson Sampling to improve the relevance of user interactions and user satisfaction. Overhauled MLOps pipelines to provide full audit traceability and safe rollback. Delivered executive dashboards that gave stakeholders real-time AI insights for managing policy lapses.
Machine Learning Engineer at Pyro Holdings Pvt. Ltd, India
November 30, 2022 - July 24, 2025
Designed cloud-native RAG pipelines for intelligent Q&A, significantly reducing support ticket volumes and raising first-touch resolution rates. Developed LLM evaluation frameworks for extensive prompt testing, bias detection, and robustness improvement, which increased compliance scores. Launched containerized inference services for BERT, DistilBERT, and LLaMA2 models with high throughput and uptime. Applied distributed anomaly detection to large-scale cloud logs, achieving high precision in early warning systems. Architected reinforcement learning-based resource allocators that reduced SLA violations by 60%. Automated CI/CD workflows for ML APIs and microservices, achieving deployment parity and faster rollback. Integrated Weaviate for hybrid semantic search, improving search precision and relevance for internal knowledge management.

Education

Master of Science at University of Oklahoma, Norman, Oklahoma, USA
January 1, 2024 - December 31, 2024
B. Tech at Sathyabama University, Chennai, India
January 1, 2021 - May 31, 2021

Qualifications

Machine Learning with Python (V2)
January 11, 2030 - July 24, 2025

Industry Experience

Financial Services, Software & Internet, Government, Agriculture & Mining, Media & Entertainment