Sushmitha Katta

I am a results-driven Data Scientist with over 4 years of experience delivering scalable AI and machine learning solutions across the healthcare, banking, and insurance industries. I specialize in designing end-to-end ML workflows, including data ingestion, preprocessing, feature engineering, model development, and deployment, using tools such as Python, TensorFlow, and MLflow. I am passionate about leveraging AI to solve high-impact business problems while ensuring scalability, security, and ethical model governance. My expertise spans building time-series forecasting models, NLP pipelines, and generative AI applications integrated with major cloud platforms. I have a demonstrated ability to translate complex data into actionable insights that reduce risk, improve decision-making, and drive regulatory-compliant outcomes under HIPAA, IRDAI, and other financial governance standards. I thrive in agile, cross-functional teams and am committed to continuous learning and innovation in AI and data science.

Available to hire

Experience Level

Expert: 9 skills
Intermediate: 2 skills

Work Experience

Associate Data Scientist at ChristianaCare
October 1, 2024 - Present
Designed forecasting models using XGBoost and LightGBM to predict patient eligibility in oncology trials, improving recruitment efficiency by 30% and reducing patient attrition by 18%. Developed transformer-based NLP pipelines with BERT and spaCy to extract adverse drug events from over 50,000 clinical notes, increasing annotation precision to 85% for FDA-regulated documentation. Built Power BI dashboards linked to BigQuery delivering real-time clinical trial insights, reducing reporting latency from 5 days to under 2 hours. Constructed robust ETL pipelines enabling ingestion of 500GB+ multi-source data daily with 99.9% uptime. Applied survival analysis and clustering to support protocol adjustments improving clinical outcomes. Deployed ML models using FastAPI and Docker on Azure Kubernetes Service, reducing deployment cycles by 85%. Led 15+ cross-functional projects under Agile-Scrum, increasing research delivery velocity by 27%. Established HIPAA-compliant validation frameworks maint
Data Analyst at Axis Bank
January 31, 2021 - August 21, 2025
Transformed Excel-based fraud monitoring into Power BI dashboards that track over 1 million transactions in real time, reducing fraud detection time by 40%. Implemented customer behavior segmentation using PCA and K-Means, increasing ROI by 25%. Developed borrower scoring models, improving loan approval accuracy by 20% and reducing risk. Automated data pipelines, accelerating compliance reporting and reducing manual ETL dependency. Created Python-SQL scripts for RBI reporting with built-in validations, cutting preparation time by 30%. Conducted A/B testing to refine offer strategies, increasing pre-approved loan conversions by 15%. Delivered executive Power BI dashboards with role-based access to track SLA adherence, NPA trends, and disbursement in real time.
Data Analyst at Max Life Insurance
July 31, 2019 - August 21, 2025
Developed policy lapse prediction models using Random Forest, SVM, and XGBoost, facilitating retention initiatives that preserved $1.8M in revenue. Optimized Tableau dashboards, improving report load speeds by 50% and enhancing KPI access. Applied NLP using NLTK and TF-IDF to analyze over 25,000 feedback records, identifying bottlenecks and leading to claims process upgrades. Integrated CRM, underwriting, and claims data into PostgreSQL, improving underwriting cycle time by 30%. Built regression-based pricing and survival models that guided the issuance of $50M+ in new policies. Improved data pipeline speed, cutting overnight job times by 40%. Managed 20+ Agile projects with 88% on-time sprint delivery. Embedded IRDAI-compliant data encryption, ensuring zero audit failures.

Education

Master’s in Information Systems and Technology at Wilmington University
August 1, 2021 - May 31, 2023

Qualifications

IBM AI Practitioner
Generative AI with Large Language Models – Coursera
Machine Learning Specialization by Andrew Ng – Coursera
Applied Data Science with Python – University of Michigan, Coursera
Data Engineering for Everyone – DataCamp
Data Science and AI Foundations Tools, Methods, and Best Practices – LinkedIn Learning

Industry Experience

Healthcare, Financial Services