

Sai Kumar Nune

AI Engineer, Data Scientist, Graphic Designer, +2






Available to hire

Hi, I’m Sai Kumar Nune, an AI/ML Engineer based in New Haven, CT, with over five years of experience building and deploying LLM-driven systems, multi-agent RAG architectures, and semantic search applications for healthcare, finance, and telecom. I enjoy turning complex business goals into scalable AI workflows that automate knowledge retrieval, reduce latency, and improve accuracy through careful prompt design and responsible AI practices.

I specialize in LangChain and LangGraph pipelines, memory-aware agents, retrieval pipelines with FAISS, ChromaDB, Pinecone, and SQL databases; I build FastAPI-based AI microservices, implement robust MLOps with Docker and Kubernetes, and fine-tune models using Hugging Face Transformers. I thrive in cross-functional Agile teams and love delivering tangible impact through data-driven insights and thoughtful UX for AI-powered applications.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Language

English

Fluent

Work Experience

AI Engineer at Responsive

July 1, 2024 - November 27, 2025

Designed and deployed enterprise-grade LangGraph-powered GPT-4 chatbots for internal automation, improving service response time by 35% through multi-agent orchestration and intelligent workflow management. Built memory-aware agents and multi-tool integrations using LangChain and the ReAct agent framework, orchestrating multi-step reasoning workflows for real-time, personalized, context-driven user interactions. Engineered retrieval-augmented generation (RAG) pipelines with LangChain, FAISS, ChromaDB, Pinecone, and SQL databases to enhance enterprise knowledge access and deliver context-aware, factually accurate responses. Built token-optimized chunking pipelines for PDFs and CSVs with intelligent data preprocessing. Designed graph-based Q&A agents using Neo4j, implementing semantic search and improving real-time contextual retrieval for complex query workflows. Developed and optimized prompt engineering strategies for GPT-4 and LLaMA2, enabling domain-specific automation and improved

GenAI / ML Engineer at Wipro

March 1, 2024 - March 1, 2024

Designed and deployed Retrieval-Augmented Generation (RAG) systems using LangChain, FAISS, and ChromaDB, integrating LLMs (GPT-4, LLaMA2, T5) to support healthcare claims automation, medical code extraction, and intelligent document summarization. Reduced manual triage by 35% through optimized RAG workflows, improved metadata retrieval pipelines, and semantic search capabilities for claim fraud detection and evidence retrieval. Fine-tuned LLMs for summarization and document intelligence pipelines using prompt engineering, few-shot learning, and transfer learning. Built NLP pipelines using AWS SageMaker, BERT, T5, SBERT, spaCy, and NLTK for clinical code recommendations, NER, textual entailment, and chatbot responses. Reduced inference latency by 20% by optimizing deployments on AWS Lambda, EC2, and S3 in collaboration with DevOps. Replicated features of an internal AI chatbot, building workflows for timesheet entry and approvals with 50% faster turnaround. Engin

Associate AI Engineer at Zensar Technologies

November 1, 2021 - November 1, 2021

Created classification, regression, and anomaly detection models with XGBoost and scikit-learn, and deployed transformer-based NER and classification models on AWS and Azure. Built automated data pipelines with Apache Airflow and Spark for large-scale datasets. Applied NLP with spaCy, NLTK, and BERT for text analytics, NER, sentiment analysis, and summarization. Built executive dashboards with Power BI, Tableau, and Excel; implemented CI/CD pipelines; and designed recommender systems.