Sriram Degala

Available to hire

Hello! I’m Sriram Degala, an AI/ML Engineer based in Dallas, USA. I have 4+ years of experience delivering production-grade AI solutions in healthcare and finance, with deep expertise in NLP, Generative AI, predictive analytics, and scalable ML systems.

I enjoy building end-to-end ML pipelines, working with LLMs, RAG, and prompt engineering, and deploying models to accelerate time-to-market. I’ve led multi-modal AI initiatives, built explainable dashboards for researchers, and driven continuous retraining and safe rollbacks in cloud environments.

Work Experience

AI/ML Engineer at MD Anderson Cancer Center
August 1, 2024 - Present
Led development of predictive cancer risk scoring pipelines using ML/DL models and Hugging Face Transformers on multiple NVIDIA A100 GPUs, optimizing training workflows to boost early-stage tumor detection accuracy by 27%. Built a multi-modal triage agent with GPT-4, LangGraph, MCP, prompt engineering, and RAG to fuse structured and unstructured oncology data, accelerating patient triage while reducing query latency by 34%. Orchestrated LangChain multi-agent RAG with Pinecone for real-time adverse drug reaction (ADR) detection and gene–drug discovery, improving retrieval quality as measured by Recall@K and reducing hallucinations. Deployed scalable inference using Kubernetes and Kubeflow with CI/CD and versioned releases, enabling safe rollbacks. Optimized LLM inference with TensorRT/ONNX and FP16/INT8 precision for multi-GPU clusters, cutting latency by 35% with KS-test validation. Automated longitudinal training pipelines on AWS EC2 for continuous ingestion of oncology EHRs, scheduled retraining and automated model refreshes wit
AI/ML Engineer at Capgemini
May 1, 2020 - July 1, 2023
Engineered real-time trading signal pipelines in PyTorch, optimizing model architectures and data flows to cut false signals by 31% in volatile intraday markets. Implemented rare-event forecasting with XGBoost and Scikit-learn, including advanced resampling to stabilize precision on imbalanced market datasets. Built and scaled 10+ trading inference services using Docker, Kubernetes, and Azure ML, reducing deployment time by 60% and achieving ~46% uptime in production. Developed CI/CD and experiment tracking with MLflow, GitHub Actions, and FastAPI for reproducible deployments and drift reduction. Created monitoring and explainability dashboards with Plotly/SHAP to surface feature contributions and model health for quantitative research.

Education

Master of Science in Machine Learning at Stevens Institute of Technology
Bachelor of Technology in Computer Science and Engineering at Jaypee Institute of Information Technology

Industry Experience

Healthcare, Financial Services, Professional Services