Builds and ships agentic, retrieval-augmented multi-agent LLM systems that integrate LangGraph/LangChain orchestration, vector search (Pinecone/FAISS), and robust validation and fallback mechanisms to enable accurate, low-latency QA and automation for finance and enterprise use cases. Delivers production-grade AI services with strong MLOps and systems engineering: FastAPI microservices, Dockerized deployments, ONNX-optimized inference, Kafka streaming, AWS S3 and MongoDB storage, and targeted latency optimizations that reduced production response times by up to 2.2×. Applies deep learning across computer vision, reinforcement learning, and multimodal models — from YOLO-based detection and OCR pipelines (license plates, solar defects) to TD3-based autonomous navigation and multimodal emotion recognition combining DistilBERT, DeepFace, and PANNs. Leads small teams and cross-functional projects to translate research-quality models into stakeholder-facing products; holds a B.Tech in Computer Science with Data Science specialization and Cybersecurity honors (CGPA 8.3/10), and consistently demonstrates measurable impact (e.g., 97% mAP, 92% emotion accuracy, 70% reduction in manual BOQ extraction).

Sahil Garg

Available to hire

Experience Level

Expert (9 listed)
Intermediate (2 listed)

Language

English
Fluent

Work Experience

AI Engineer Intern at Munshot Private Limited
August 1, 2025 - Present
Designed a scalable agentic AI framework using Gemini LLMs, 3+ tools, and a Pinecone vector DB for RAG-based QA. Led a team of 2 interns to optimize LLM performance, reducing processing time from 30 s to 26 s through prompt engineering, agent pre-compilation, and LangChain multi-agent workflows. Applied Pydantic validation and fallback strategies across web-scraping tools such as Cloudscraper, HTTPX, and Firecrawl.
AI/ML Engineer Intern at Point9 AI Private Limited
February 1, 2025 - July 1, 2025
Developed a predictive maintenance ML pipeline on multivariate time-series data, combining a PyTorch LSTM autoencoder, Isolation Forest for unsupervised anomaly detection, and XGBoost risk modeling, with feature engineering and model evaluation. Built a solar defect detection system using YOLOv11 and EfficientNet-B0 with transfer learning and NMS optimization. Led work on a Gemini LLM-powered financial statement generation agent using OpenPyXL, LangGraph workflows, generator-validator loops, Docker deployment, and UDF refinement from human feedback. Architected FAISS-based BOQ extraction with AWS S3 and MongoDB storage, and engineered a health insurance claim agent with LangGraph ReAct.

Education

Bachelor of Technology in Computer Science and Engineering at Ajay Kumar Garg Engineering College
January 1, 2022 - January 1, 2026

Industry Experience

Software & Internet, Professional Services, Financial Services