Sachin Sravan Kumar

Available to hire

I am an AI/ML Engineer with 5.6+ years of experience delivering production-grade machine learning and GenAI systems in regulated finance and healthcare. I specialize in RAG-based LLM platforms, high-throughput model serving, and cost-efficient RLHF pipelines using Azure OpenAI, vLLM, Kubernetes, and MLflow, delivering up to 24x throughput gains and 75% cost reduction.

I have a strong background in deep learning, NLP, and multimodal modeling, with hands-on ownership of end-to-end ML pipelines—from data engineering through deployment and governance-ready production. I thrive in cross-functional teams and bring practical experience adapting foundation models to regulated domains and building scalable, reliable MLOps workflows.

Work Experience

AI/ML Engineer at BlackRock, FL
March 1, 2025 - Present
Implemented domain adaptation of foundation models for regulated financial use cases, integrating SEC, FINRA, and ESMA content via RAG, LangChain, and instruction fine-tuning to meet production-quality standards. Engineered high-throughput LLM serving using Azure OpenAI for managed model inference and vLLM for self-hosted open-source models with streaming, FlashAttention, and quantization, achieving 10× inference scaling and 24× throughput gains while cutting memory footprint by 60%. Developed and deployed RLHF pipelines using DPO and PPO combined with LoRA and QLoRA, reducing GPU hours and training cost by 75% while preserving model alignment. Enhanced the RAG pipeline by integrating FAISS and Pinecone for vector retrieval, improving retrieval relevance by 22% and reducing response latency by 30% for knowledge-based queries. Built data aggregation and preprocessing pipelines handling 8M+ structured and unstructured records using Python, Spark, Kafka, and PostgreSQL, improving traini
Machine Learning Research Engineer at GE Healthcare, Remote
April 1, 2024 - March 1, 2025
Designed and deployed 11 longitudinal survival-prediction models using LSTM architectures on a 4M-record clinical dataset, achieving sub-0.01 MSE through temporal feature engineering, sequence normalization, and bias–variance optimization across patient cohorts. Built DiabCompSepsAI, a Random Forest–based postoperative complication prediction system leveraging multivariate clinical signals, reaching 94%+ accuracy through feature selection, class-imbalance handling, and hyperparameter tuning. Developed multimodal risk-stratification pipelines combining imaging data, demographics, and structured clinical metadata to classify polyp recurrence into low, moderate, and high-risk cohorts, enabling personalized post-procedure monitoring strategies. Achieved state-of-the-art performance (92.2% accuracy) on capsule endoscopy lesion detection by implementing Vision Transformer pipelines with domain-specific augmentations and self-supervised pretraining to improve generalization on limited lab
Machine Learning at Virtual Infotech Solution
January 1, 2019 - July 1, 2022
Designed and productionized supervised and deep learning models using PyTorch and TensorFlow, applying Bayesian hyperparameter optimization and k-fold cross-validation to cut experimentation cycles by 30% while maintaining stable out-of-sample performance. Built LSTM-based forecasting models and synthetic data augmentation pipelines to mitigate data sparsity, improving prediction stability and reducing error volatility by 40% in downstream analytical reporting. Re-architected NLP pipelines using spaCy and Hugging Face Transformers, fine-tuning embeddings for entity extraction and text classification, driving a 35% lift in model accuracy across production workloads. Engineered distributed data processing and feature pipelines with Spark, Kafka, and Hadoop to support real-time and batch ML workflows, reducing end-to-end data latency by 50% and enabling near real-time inference use cases. Developed low-latency ML inference services using FastAPI and gRPC, implementing async request handli

Education

Master of Science in Computer Science at Florida International University
May 1, 2024
Bachelor of Technology in Computer Science & Engineering at Jawaharlal Nehru Technological University Kakinada
May 1, 2019

Qualifications

AI in Healthcare
February 17, 2026
Biostatistics in Public Health
February 17, 2026
Epidemiology in Public Health Practice
February 17, 2026
Responsible Conduct of Research (RCR)
February 17, 2026
Biomedical Human Research Course
February 17, 2026
Good Clinical Practice (GCP)
February 17, 2026

Industry Experience

Financial Services, Healthcare, Professional Services, Software & Internet