

Raghav Konduri

AI Engineer, Data Scientist, Full Stack Developer, +2






Available to hire

I am an AI/ML engineer with 3.5 years of experience designing and deploying large-scale Generative AI, RAG systems, and Conversational AI. I enjoy turning complex data into practical, scalable solutions that help enterprises automate knowledge work and improve decision making.
I design end-to-end ML pipelines, fine-tune transformer models with LoRA and PEFT techniques, optimize real-time inference, and collaborate with safety and governance teams to ensure compliant, high-performance AI in production. In previous roles, I contributed to organization-wide adoption of AI tooling and delivered measurable improvements in accuracy and latency.

Skills

Experience Level: Expert (8 skills), Intermediate (1 skill)

Work Experience

AI/ML Engineer at Scale AI

January 1, 2025 - Present

Led the design and deployment of a Retrieval-Augmented Generation (RAG) platform that lets enterprise users ask natural-language questions and receive context-aware answers from internal documentation, knowledge bases, and policy repositories. Built end-to-end pipelines for document ingestion, text embedding, and semantic retrieval using FAISS and Scale Atlas, supporting millions of documents and real-time query responses. Fine-tuned LLaMA-2, Falcon, and BART models using parameter-efficient methods (LoRA and QLoRA via PEFT) on distributed GPU clusters, reducing training cycles by ~60% and improving recall for enterprise queries. Integrated Scale Nucleus and GenAI APIs to collect human feedback and build continuous improvement loops for prompt evaluation, response scoring, and annotation-based fine-tuning. Optimized inference speed with TorchScript, ONNX Runtime, FP16 mixed precision, and INT8 quantization, enabling low-latency responses under heavy user load. Collaborated with internal safety and compliance teams to

Machine Learning Engineer at Accenture

February 1, 2021 - July 1, 2023

Helped develop AI reference kits to accelerate internal developer adoption, enabling rapid prototyping of ML pipelines. Supported deployment of AI-driven personalization and recommendation engines, improving end-user engagement metrics across media streaming and AdTech modules. Analyzed real-time transaction and media data pipelines to optimize ML inference latency, reducing system processing time and improving scalability for client-facing applications. Developed and containerized ML pipelines using Python, TensorFlow, PyTorch, and scikit-learn, integrating models into AWS-based microservices for production deployment. Supported real-time data ingestion and processing with Apache Kafka, Spark Streaming, and FFmpeg for AI personalization layers. Assisted in implementing cloud-based model serving and orchestration on AWS SageMaker, EC2, S3, and Lambda, ensuring high availability and low latency.