I am an AI/ML Engineer with a passion for turning complex data into practical AI solutions. I design, fine-tune, and deploy large language models and transformer-based systems for real-time summarization, transcription, semantic search, and conversational AI. I thrive on building end-to-end MLOps pipelines and optimizing them for low latency and scalable deployment on cloud and GPU clusters. My work emphasizes retrieval-augmented generation, multilingual NLP, model interpretability, bias detection, and responsible AI practices.

Shashanka Oruganti

Available to hire

Experience Level

Expert

Language

Hindi
Advanced
Tamil
Advanced
Telugu
Advanced

Work Experience

AI/ML Engineer at Meta
August 1, 2024 - Present
Led development and deployment of LLaMA-2 and internal transcription models for real-time summarization, transcription, and Q&A over structured and unstructured data using Python, PyTorch, Hugging Face Transformers, LangChain, and internal validation frameworks. Fine-tuned BART and LLaMA with LoRA/QLoRA/PEFT on GPU clusters, reducing training time by 60% and memory usage by 65%. Built end-to-end MLOps pipelines automating data ingestion, preprocessing, model training, evaluation, and deployment, boosting release speed by 40%. Implemented retrieval-augmented generation pipelines with FAISS and internal vector search for semantic search over live document streams, increasing query accuracy by 35%. Deployed containerized microservices with internal orchestration platforms, ensuring 24/7 uptime. Exported PyTorch models to TorchScript and applied FP16/INT8 quantization, achieving up to 3× faster responses.
AI/ML Engineer at Amazon
July 1, 2023 - October 15, 2025
Led development of a multilingual Alexa voice assistant using BERT, RoBERTa, and PyTorch Lightning, supporting Hindi, Tamil, and Telugu and improving Q&A accuracy by 28% across large regional datasets. Fine-tuned transformer models with LoRA/PEFT/DeepSpeed on AWS SageMaker and EC2 GPU clusters, reducing training time by 35%. Designed scalable semantic retrieval pipelines using FAISS with HNSW indexing on AWS EC2 for real-time recommendations, driving a 22% uplift in CTR. Built real-time ML workflows with Apache Spark on AWS EMR, Airflow (MWAA), and TorchElastic, reducing end-to-end latency by 40%. Developed feature stores covering 10M+ users, deployed low-latency inference services via TorchServe/ONNX Runtime on AWS ECS/Lambda, and contributed to Indic NLP improvements.

Education

Master's degree in applied data science at Clarkson University
January 1, 2023 - January 1, 2025

Industry Experience

Software & Internet, Professional Services, Media & Entertainment