

Shashanka Oruganti

AI Engineer, AI Developer, Data Scientist






Available to hire

I am an AI/ML Engineer with a passion for turning complex data into practical AI solutions. I design, fine-tune, and deploy large language models and transformer-based systems for real-time summarization, transcription, semantic search, and conversational AI.

I thrive in building end-to-end MLOps pipelines, optimizing for low latency and scalable deployments on cloud and GPU clusters. My work emphasizes retrieval-augmented generation, multilingual NLP, model interpretability, bias detection, and responsible AI practices.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Language

Hindi: Advanced

Tamil: Advanced

Telugu: Advanced

Work Experience

AI/ML Engineer at Meta

August 1, 2024 - Present

Led development and deployment of LLaMA-2 and internal transcription models for real-time summarization, transcription, and Q&A over structured and unstructured data using Python, PyTorch, Hugging Face Transformers, LangChain, and internal validation frameworks. Fine-tuned BART and LLaMA with LoRA/QLoRA/PEFT on GPU clusters, reducing training time by 60% and memory usage by 65%. Built end-to-end MLOps pipelines automating data ingestion, preprocessing, model training, evaluation, and deployment, boosting release speed by 40%. Implemented retrieval-augmented generation pipelines with FAISS and internal vector search for semantic search over live document streams, increasing query accuracy by 35%. Deployed containerized microservices with internal orchestration platforms, ensuring 24/7 uptime. Exported PyTorch models to TorchScript and applied FP16/INT8 quantization, achieving up to 3× faster responses.

AI/ML Engineer at Amazon

July 1, 2023 - October 15, 2025

Led development of a multilingual Alexa voice assistant using BERT, RoBERTa, and PyTorch Lightning, supporting Hindi, Tamil, and Telugu and improving Q&A accuracy by 28% across large regional datasets. Fine-tuned transformer models with LoRA/PEFT/DeepSpeed on AWS SageMaker and EC2 GPU clusters, reducing training time by 35%. Designed scalable semantic retrieval pipelines using FAISS with HNSW indexing on AWS EC2 for real-time recommendations, driving a 22% uplift in CTR. Built real-time ML workflows with Apache Spark on AWS EMR, Airflow (MWAA), and TorchElastic, reducing end-to-end latency by 40%. Developed feature stores covering 10M+ users, deployed low-latency inference services via TorchServe/ONNX Runtime on AWS ECS/Lambda, and contributed to Indic NLP improvements.