Hi, I’m Anil Barla, an AI/ML Engineer with 5 years of experience designing, building, and deploying production-grade machine learning systems across cloud and hybrid environments. I specialize in Python-based ML services, transformer architectures, and retrieval-augmented generation (RAG) pipelines, with hands-on experience in model fine-tuning, optimization (LoRA/QLoRA), and GPU-accelerated inference. I enjoy collaborating with cross-functional teams to deliver scalable, reliable, and responsible AI solutions for real-world business applications. I bring strong MLOps capabilities, experience with containerized deployments on AWS and Azure, and a track record of benchmarking, evaluation, and customer-facing demonstrations.

Anil Barla

Available to hire

Experience Level

Expert

Work Experience

AI/ML Engineer at NVIDIA
January 1, 2025 - Present
Contributed to GPU-accelerated Python inference services using FastAPI for transformer-based models. Developed RAG prototypes with LangChain and LlamaIndex. Evaluated vector retrieval approaches with FAISS (GPU) and supported Pinecone-based external PoCs with clear tooling separation. Applied LoRA/QLoRA parameter-efficient fine-tuning. Supported data preprocessing and evaluation using Pandas and NumPy. Explored ONNX export and TensorRT benchmarking to optimize latency and throughput. Containerized services with Docker and deployed them on Kubernetes. Tracked experiments with MLflow and supported AWS-based deployment validation for customer demos. Contributed to discussions of LLM behavior and grounding quality in RAG systems.

Machine Learning Engineer at Microsoft
June 1, 2020 - December 1, 2023
Developed end-to-end ML pipelines using Python, PyTorch, and TensorFlow for enterprise applications. Built supervised models (classification and regression) with feature engineering. Preprocessed data and performed EDA to create stable training pipelines. Integrated ML models into production with Azure Machine Learning. Conducted hyperparameter tuning and A/B testing. Deployed batch inference on cloud compute. Communicated results to stakeholders via notebooks and visualizations. Worked in Agile/Scrum teams to deliver incremental ML features.

Education

Master of Science in Computer Science at Montclair State University
January 11, 2030 - March 5, 2026

Industry Experience

Software & Internet, Professional Services