Mani Varma

AI Engineer, AI Developer, Data Scientist, +3





Available to hire

I am an AI/ML Engineer with over 4 years of experience delivering scalable, low-latency AI solutions across cloud and edge platforms. I specialize in architecting real-time AI systems that significantly boost user engagement, accelerate decision-making, and reduce operational bottlenecks. I have strong skills in deploying and fine-tuning large language models (LLMs), RAG pipelines, and multi-agent AI frameworks to deliver measurable improvements in accuracy, speed, and adoption.

I am highly adept at MLOps, microservice development, and on-device AI, with a focus on privacy, compliance, and production-grade reliability in high-impact environments. I enjoy solving complex problems and building efficient AI infrastructure that drives business value in fast-paced cloud and enterprise settings.

Skills

Experience Level: Expert (12 skills)

Work Experience

AI/ML Engineer at McKinsey & Company

September 1, 2024 - Present

Designed an AI-powered document processing platform using LLMs, a RAG architecture, and ML pipelines backed by vector databases and AWS services, enabling real-time inference on over 100K documents with 40% faster retrieval. Delivered RAG pipelines with vector databases (Qdrant, Weaviate) that cut multi-modal retrieval latency by 40%, improving customer experience in high-volume environments. Developed API-driven AI/ML microservices using FastAPI and Flask, containerized with Docker and orchestrated with Kubernetes for deployment across Azure and GCP, reducing deployment times by 30%. Implemented MLOps pipelines with MLflow, Weights & Biases, and AWS SageMaker, improving model governance, enabling continuous experimentation, and increasing model accuracy by 18%. Built on-device AI solutions for Android using TensorFlow Lite, MediaPipe, and PyTorch Mobile, achieving low-latency inference, with federated learning to ensure privacy compliance. Modernized enterprise infrastructure through Autosys migration, vulnerability management, and Java upgrades, improving release cycles by 40%.

ML Engineer at Avenir Technologies

December 31, 2022 - August 26, 2025

Automated data preprocessing pipelines in Python and PySpark, reducing data preparation time by 50% and improving feature quality for downstream models. Built predictive machine learning models with TensorFlow and Scikit-learn, increasing accuracy by 35% and driving higher customer conversions. Architected and deployed ML services using Docker, Kubernetes, and AWS/Azure, enabling real-time predictions and GPU-optimized vector search with Pinecone, FAISS, and Qdrant. Engineered secure cloud-native ML infrastructure with Terraform, integrated with real-time ingestion pipelines using Kafka, Databricks, and GCP BigQuery, improving throughput by 45%. Introduced model validation and automated testing pipelines using MLflow and pytest, ensuring 99% reproducibility and faster compliance checks.
