I am an AI/ML Engineer with 4+ years of experience building scalable machine learning pipelines and distributed data systems in healthcare and enterprise environments. I am proficient in Python, Spark, SQL, and AWS, with hands-on experience in feature engineering, model training, evaluation, and batch deployment using MLflow. I have a proven track record of improving model performance, data quality, and pipeline efficiency, collaborating with clinical, actuarial, and product stakeholders to translate business needs into measurable ML problems aligned with KPIs such as risk score accuracy and cost variance. I enjoy translating complex data into actionable insights and delivering reliable, scalable data solutions.

Yashwanth Reddy

Available to hire

Experience Level

Expert

Work Experience

AI/ML Engineer - Data Engineering at Humana
April 1, 2025 - Present
Designed Spark-based data pipelines on AWS processing 8–12M healthcare claims and eligibility records weekly, enabling structured feature datasets for risk stratification and cost prediction models.
Engineered PySpark feature transformation workflows with schema validation and automated data quality checks, reducing missing/inconsistent records by 28% and improving stability of downstream clinical and financial ML models.
Built and evaluated classification and regression models using Scikit-learn and XGBoost to support use cases such as readmission risk scoring and utilization forecasting, improving baseline ROC-AUC and error metrics by 14–18% through feature selection and hyperparameter tuning.
Implemented MLflow for experiment tracking and model versioning, ensuring reproducible training cycles and controlled deployment across analytics and actuarial teams.
Developed batch inference pipelines integrated with Amazon Redshift to deliver model outputs into reporting layers used by o

Data Engineer / Machine Learning Engineer at Mphasis
December 1, 2020 - December 1, 2022
Built scalable ETL pipelines using Python, SQL, and Apache Spark processing 5–7M financial and operational records daily to support analytics dashboards and predictive modeling initiatives.
Designed data validation and reconciliation frameworks that reduced missing and inconsistent records by 35%, improving reliability of training datasets and executive reporting.
Prepared structured feature datasets and supported model development using Scikit-learn for forecasting and classification use cases, contributing to 10–15% performance improvement over baseline models.
Implemented dimensional data modeling (star schema) to optimize query performance, improving dashboard and ML data retrieval time by 27%.
Tuned AWS S3 and Redshift workloads through partitioning and query optimization, reducing average pipeline execution time by 22% while maintaining data accuracy.
Contributed to Agile sprint cycles through code reviews, documentation, and iterative data pipeline enhancements aligned with

Data Analyst / Junior Data Engineer at Intex Technologies
April 1, 2019 - November 1, 2020
Analyzed sales and operational datasets using SQL and Python to support KPI reporting for cross-functional teams, improving data transparency for planning and inventory forecasting.
Automated data cleaning and transformation workflows, reducing manual reporting effort by 30% and improving consistency across recurring reports.
Conducted exploratory data analysis to identify seasonal trends, anomalies, and performance gaps, supporting data-driven decisions across sales and operations.
Assisted in preparing structured datasets for predictive modeling initiatives, improving data readiness for downstream analytics and machine learning efforts.
Developed Tableau and Power BI dashboards that reduced ad-hoc reporting requests by 25% and improved stakeholder access to real-time insights.

Education

Master of Science in Computer/Information Technology Services Administration and Management at Avila University, USA
Until April 1, 2025
Bachelor of Technology in Electronics and Communication Engineering at JNTUH, Hyderabad, Telangana
Until November 1, 2020

Industry Experience

Healthcare, Software & Internet, Professional Services