

Chaitanya Battula

AI Engineer, AI Developer, Data Scientist






Available to hire

I am an AI/ML Engineer with 5+ years of experience designing, developing, and deploying large-scale AI/ML solutions across enterprise and cloud environments. I specialize in GPU-accelerated microservices, LLMs, multilingual ASR, NLP, RAG pipelines, and semantic search using PyTorch, Hugging Face Transformers, ONNX, TensorRT, and LoRA/adapters. I enjoy building production-ready MLOps workflows with Kubernetes, Docker, Terraform, ArgoCD, and GitHub Actions. I mentor teams, focus on responsible AI, explainability, and bias mitigation, and collaborate across stakeholders to deliver scalable AI solutions.

Skills

Experience Level: Expert

Work Experience

AI/ML Engineer at NVIDIA

June 1, 2024 - Present

Designed and deployed GPU-accelerated microservices using Python, Triton Inference Server, TensorRT-LLM, CUDA, and FastAPI to serve LLM, speech, and vision models in production across enterprise and cloud environments handling >10,000 requests/sec. Optimized PyTorch and Transformers models with ONNX + TensorRT (FP16/INT8), reducing inference latency by 3× and GPU memory usage by 45%. Fine-tuned LLMs with LoRA adapters and implemented RAG pipelines with LangChain, FAISS, and Weaviate, improving knowledge retrieval accuracy by ~40% and reducing response time by 35%. Automated end-to-end MLOps workflows using Kubernetes, Docker, Terraform, ArgoCD, GitHub Actions, and CI/CD pipelines, standardizing training, scaling, and rollback across multi-cloud GPU clusters. Implemented model versioning, regression validation, and containerized rollouts, ensuring safe deployment of multiple LLMs, speech, and vision models across GPU clusters. Conducted MLPerf v4.0 benchmarks on H100/H200 GPUs.

AI/ML Engineer at Microsoft India

November 1, 2023 - October 8, 2025

Designed and developed end-to-end computer vision pipelines for real-time driver monitoring using Python, PyTorch, Hugging Face Transformers, and transfer learning with CNN/Transformer backbones, achieving >98% accuracy in behavior detection (seatbelt, gaze, mirror checks). Built and maintained large-scale video data pipelines with Python, Apache Spark, Airflow, and Hive/Presto, enabling automated preprocessing, augmentation, and feature extraction on 50,000+ hours of driving footage. Implemented distributed training workflows on Azure GPU clusters with Docker, Kubernetes, and TorchElastic, reducing training time by 35% while maintaining state-of-the-art performance across diverse driving conditions. Optimized production-ready vision models using ONNX, TensorRT, and Azure ML, achieving sub-100ms latency for scalable, real-time inference on smartphones and edge devices across 10k+ driving tests. Applied experiment tracking, hyperparameter tuning, and ML lifecycle management with MLflow,

AI/ML Engineer at NVIDIA

June 1, 2024 - Present

Designed and deployed GPU-accelerated microservices using Python, Triton Inference Server, TensorRT-LLM, CUDA, and FastAPI; enabled scalable LLM, speech, and vision model production across enterprise and cloud environments. Optimized PyTorch and Transformers models with ONNX + TensorRT (FP16/INT8), reducing inference latency by 3× and GPU memory usage by 45%. Fine-tuned LLMs with LoRA/adapters and implemented RAG pipelines with LangChain, FAISS, and Weaviate. Automated end-to-end MLOps workflows with Kubernetes, Docker, Terraform, ArgoCD, GitHub Actions, and CI/CD pipelines. Conducted MLPerf v4.0 benchmarks on H100/H200 GPUs. Extended AI services into Omniverse and Isaac Sim pipelines, generating synthetic datasets to accelerate robotics validation. Integrated observability with AWS CloudWatch, Prometheus, Grafana, MLflow, and Kafka; collaborated on secure multi-cloud deployments.

AI/ML Engineer at Microsoft

November 1, 2023 - October 23, 2025

Designed and developed end-to-end computer vision pipelines for real-time driver monitoring; built large-scale video data pipelines with Python, Apache Spark, Airflow, and Hive/Presto; implemented distributed training on Azure GPU clusters; optimized production-ready vision models with ONNX, TensorRT, and Azure ML. Applied experiment tracking and model management with MLflow, Weights & Biases, and Optuna; developed real-time search and retrieval for video snippets using FAISS with HNSW indexing and feature stores. Implemented explainable AI and responsible ML practices; enhanced embeddings with vision transformers and multimodal fusion; conducted large-scale A/B testing to improve fairness and adoption. Implemented MLOps practices for continuous deployment and monitoring on Azure DevOps.