Sriram Degala

Data Scientist, AI Engineer, Developer





Available to hire

Hello! I’m Sriram Degala, an AI/ML Engineer based in Dallas, USA. I have 4+ years of experience delivering production-grade AI solutions in healthcare and finance, with deep expertise in NLP, Generative AI, predictive analytics, and scalable ML systems.

I enjoy building end-to-end ML pipelines, LLMs, RAG, prompt engineering, and deploying models to accelerate time-to-market. I’ve led multi-modal AI initiatives, built explainable dashboards for researchers, and driven continuous retraining and safe rollbacks in cloud environments.

Skills

Experience Level: Expert (for each of the 11 skills)

Work Experience

AI/ML Engineer at MD Anderson Cancer Center

August 1, 2024 - Present

Led development of predictive cancer risk scoring pipelines using ML/DL models and Hugging Face Transformers on multi-GPU NVIDIA A100 hardware, optimizing training workflows to boost early-stage tumor detection accuracy by 27%. Built a multi-modal triage agent with GPT-4, LangGraph, MCP, prompt engineering, and RAG to fuse structured and unstructured oncology data, accelerating patient triage and reducing query latency by 34%. Orchestrated LangChain multi-agent RAG with Pinecone for real-time adverse drug reaction (ADR) detection and gene–drug discovery, improving retrieval quality as measured by Recall@K and reducing hallucinations. Deployed scalable inference using Kubernetes and Kubeflow with CI/CD and versioned releases, enabling safe rollbacks. Optimized LLM inference on multi-GPU clusters with TensorRT/ONNX and FP16/INT8 precision, cutting latency by 35% and validating the results with KS tests. Automated longitudinal training pipelines on AWS EC2 for continuous ingestion of oncology EHRs, scheduled retraining and automated model refreshes wit

AI/ML Engineer at Capgemini

May 1, 2020 - July 1, 2023

Engineered real-time trading signal pipelines in PyTorch, optimizing model architectures and data flows to cut false signals by 31% in volatile intraday markets. Implemented rare-event forecasting with XGBoost and Scikit-learn, including advanced resampling to stabilize precision on imbalanced market datasets. Built and scaled 10+ trading inference services using Docker, Kubernetes, and Azure ML, reducing deployment time by 60% and achieving ~46% uptime in production. Developed CI/CD and experiment tracking with MLflow, GitHub Actions, and FastAPI for reproducible deployments and drift reduction. Created monitoring and explainability dashboards with Plotly/SHAP to surface feature contributions and model health for quantitative research.