Available to hire
I’m Sai Mouleeswar Reddy, an AI/ML Engineer specializing in LLM quantization, compression, and inference optimization. I enjoy building deployment-ready AI pipelines and exploring scalable MLOps to bring cutting-edge models into production.
In my current role at TCS, working for the client AMD, I focus on post-training quantization, benchmarking, and exporting optimized models, collaborating across AI, hardware, and runtime teams to deliver efficient and robust LLM solutions.
Work Experience
AI/ML Engineer – LLM Quantization & Optimization at Tata Consultancy Services Ltd.
November 1, 2023 - Present
Performed post-training LLM quantization using AWQ, GPTQ, SmoothQuant, and rotation-based methods. Worked with LLMs including Qwen2.5, Qwen1.5, DeepSeek, LLaMA, Meta LLaMA, Mistral 7B, Phi-3, and Phi-4. Configured quantization parameters including weight-only 4-bit per-group schemes, group sizes, sequence lengths, and calibration datasets to optimize inference efficiency and accuracy. Ran validation and benchmarking pipelines, measuring perplexity, throughput, and model stability. Exported optimized models in Hugging Face format and runtime-compatible formats for deployment. Collaborated with AI developers, hardware teams, and runtime engineers to deliver deployment-ready, optimized LLM variants.
Education
Bachelor of Technology at VNR Vignana Jyothi Institute of Engineering and Technology
August 1, 2019 - July 1, 2023
MPC at Deeksha Junior College
June 1, 2017 - March 1, 2019
Qualifications
Industry Experience
Software & Internet