

Anil Barla

AI Engineer, Data Scientist, Full Stack Developer, +2






Available to hire

Hi, I’m Anil Barla, an AI/ML Engineer with 5 years of experience designing, building, and deploying production‑grade machine learning systems across cloud and hybrid environments. I specialize in Python-based ML services, transformer architectures, and retrieval-augmented generation (RAG) pipelines, with hands-on experience in model fine-tuning, optimization (LoRA/QLoRA), and GPU-accelerated inference.

I enjoy collaborating with cross-functional teams to deliver scalable, reliable, and responsible AI solutions for real-world business applications. I bring strong MLOps capabilities, experience with containerized deployments, and cloud experience with AWS and Azure, plus a track record of benchmarking, evaluation, and customer-facing demonstrations.

Skills

Experience Level: Expert

Work Experience

AI/ML Engineer at NVIDIA

January 1, 2025 - Present

Contributed to GPU-accelerated Python inference services for transformer-based models, built with FastAPI. Developed RAG prototypes with LangChain and LlamaIndex, evaluated vector retrieval approaches with FAISS (GPU), and supported Pinecone-based external PoCs with clear tooling separation. Applied LoRA/QLoRA parameter-efficient fine-tuning and supported data preprocessing and evaluation with Pandas and NumPy. Explored ONNX export and TensorRT benchmarking to optimize latency and throughput. Containerized services with Docker, deployed them on Kubernetes, tracked experiments with MLflow, and supported AWS-based deployment validation for customer demos. Contributed to discussions on LLM behavior and grounding quality in RAG systems.

Machine Learning Engineer at Microsoft

June 1, 2020 - December 1, 2023

Developed end-to-end ML pipelines for enterprise applications using Python, PyTorch, and TensorFlow. Built supervised classification and regression models with feature engineering, and preprocessed data and performed EDA to create stable training pipelines. Integrated ML models into production with Azure Machine Learning, conducted hyperparameter tuning and A/B testing, and deployed batch inference on cloud compute. Communicated results to stakeholders via notebooks and visualizations, and worked in Agile/Scrum teams to deliver incremental ML features.