Sai Subhang Boorlagadda

Available to hire

I am an AI/ML engineer passionate about turning data into business impact. With 3+ years of experience in banking and healthcare, I’ve built fraud-detection pipelines, credit-risk scoring models, and retrieval systems using Python, PySpark, Scikit-learn, XGBoost, and LightGBM. I enjoy translating complex data into actionable insights that improve accuracy and reduce losses.

Beyond modeling, I deploy solutions end-to-end with FastAPI, Docker, and AWS SageMaker, and I’ve automated CI/CD with GitHub Actions. I’ve worked with large-scale claims, EMR, and regulatory data while maintaining HIPAA/GDPR compliance and collaborating with data governance teams to protect PII.

Experience Level

Expert

Language

English
Fluent

Work Experience

AI/ML Engineer at Citi Group
September 1, 2024 - November 6, 2025
Developed fraud detection pipelines on transaction datasets (1B+ records) using Python, PySpark, and XGBoost; reduced false positives by 14% and saved $3.2M annually in manual investigation hours. Fine-tuned LLaMA-2 and Falcon models with Hugging Face PEFT on regulatory filings and compliance data to auto-summarize suspicious activity reports; integrated through FastAPI and Docker into internal compliance. Built credit-risk scoring models with LightGBM and SHAP for explainability, improving underwriting decision accuracy by 11% and ensuring transparency for regulatory audits. Orchestrated end-to-end ML workflows in AWS SageMaker with automated pipelines through GitHub Actions CI/CD, decreasing release cycles from 3 weeks to 5 days. Designed a RAG-based retrieval system using LangChain, FAISS, and Hugging Face Transformers to query 4TB of archived KYC, AML, and regulatory documents; reduced compliance query turnaround from hours to minutes. Collaborated with data governance team to enforce
Data Scientist at Sage Softtech
July 1, 2021 - July 1, 2023
Performed exploratory data analysis (EDA) on multi-source claims and EMR datasets using Pandas, NumPy, and Matplotlib, surfacing billing irregularities that recovered $1.4 million in overpayments. Developed gradient boosting and logistic regression models with Scikit-learn and XGBoost to predict 30-day readmissions, achieving an AUC of 0.82 and supporting hospital resource allocation decisions. Modeled ARIMA-based forecasting on time-stamped prescription and admission data to anticipate seasonal demand shifts, raising pharmacy supply planning accuracy by 15%. Implemented spaCy-based NLP pipelines to process unstructured provider notes, classifying diagnosis categories with 11% higher accuracy and trimming manual chart review workload. Packaged and deployed predictive models through AWS SageMaker with CI/CD, reducing deployment timelines from weeks to days while maintaining regulatory audit requirements.
Junior Data Scientist at Sage Softtech
June 1, 2021 - June 1, 2021
Extracted patient admission, discharge, and claims records from large SQL databases supporting downstream reporting for over 2 million members. Created Power BI dashboards with DAX to track treatment outcomes, prescription delays, and hospital occupancy, enabling faster responses to recurring capacity issues. Applied Python (Scikit-learn) to design and validate A/B tests on patient care programs, improving adherence by 9%.

Education

Master of Computer Science, Specialisation in Data Science at North Carolina State University
January 11, 2030 - May 1, 2025
Bachelor of Technology in Computer Science, Specialisation in AI/ML at SRM University, India
January 11, 2030 - May 1, 2023

Qualifications

OCI – Generative AI
January 11, 2030 - November 6, 2025
Python for Data Analysis BootCamp
January 11, 2030 - November 6, 2025
Azure Databricks & Spark for Data Engineers
January 11, 2030 - November 6, 2025

Industry Experience

Financial Services, Healthcare, Software & Internet