Available to hire
I am an AI/ML Engineer with a passion for turning complex data into practical AI solutions. I design, fine-tune, and deploy large language models and transformer-based systems for real-time summarization, transcription, semantic search, and conversational AI.
I thrive on building end-to-end MLOps pipelines and optimizing for low latency and scalable deployments on cloud and GPU clusters. My work emphasizes retrieval-augmented generation, multilingual NLP, model interpretability, bias detection, and responsible AI practices.
Skills
Language
Hindi: Advanced
Tamil: Advanced
Telugu: Advanced
Work Experience
AI/ML Engineer at Meta
August 1, 2024 - Present
Led development and deployment of LLaMA-2 and internal transcription models for real-time summarization, transcription, and Q&A over structured and unstructured data using Python, PyTorch, Hugging Face Transformers, LangChain, and internal validation frameworks. Fine-tuned BART and LLaMA with LoRA/QLoRA/PEFT on GPU clusters, reducing training time by 60% and memory usage by 65%. Built end-to-end MLOps pipelines automating data ingestion, preprocessing, model training, evaluation, and deployment, boosting release speed by 40%. Implemented retrieval-augmented generation pipelines with FAISS and internal vector search for semantic search over live document streams, increasing query accuracy by 35%. Deployed containerized microservices with internal orchestration platforms, ensuring 24/7 uptime. Exported PyTorch models to TorchScript and applied FP16/INT8 quantization, achieving up to 3× faster responses.
AI/ML Engineer at Amazon
July 1, 2023 - October 15, 2025
Led development of a multilingual Alexa voice assistant using BERT, RoBERTa, and PyTorch Lightning, supporting Hindi, Tamil, and Telugu and improving Q&A accuracy by 28% across large regional datasets. Fine-tuned transformer models with LoRA/PEFT/DeepSpeed on AWS SageMaker and EC2 GPU clusters, reducing training time by 35%. Designed scalable semantic retrieval pipelines using FAISS with HNSW indexing on AWS EC2 for real-time recommendations, driving a 22% uplift in CTR. Built real-time ML workflows with Apache Spark on AWS EMR, Airflow (MWAA), and TorchElastic, reducing end-to-end latency by 40%. Developed feature stores across 10M+ users, deployed low-latency inference services via TorchServe/ONNX Runtime on AWS ECS/Lambda, and contributed to Indic NLP improvements.
Education
Master's degree in Applied Data Science at Clarkson University
January 1, 2023 - January 1, 2025
Qualifications
Industry Experience
Software & Internet, Professional Services, Media & Entertainment