Khushboo Gupta

Available to hire

I am a Lead Data Scientist with over a decade of experience transforming raw data into actionable products, prototyping data pipelines, and promoting data governance across academia and industry. I enjoy collaborating with researchers and stakeholders to deliver data strategies that align with organizational goals and drive tangible impact.

My work spans designing ML prototypes, validating models, and building end-to-end data platforms using Python, SQL, Tableau, AWS, and Azure. I thrive in cross-functional teams and love turning complex data challenges into accessible solutions that researchers and business partners can act on.

Language

English (Fluent)

Work Experience

Research Associate at National University of Singapore
January 1, 2024 - December 31, 2025
Designed a prototype that frames the prediction of passengers' seating discomfort scores on long-haul flights as a machine learning problem, using in-chair movement-based features. Transformed raw NUS data into standalone data products for researchers and staff, reducing prediction error by 20%. Conducted hypothesis testing and MANOVA to identify effective features and drove operational improvements in research applications. Explored interventions applied at regular intervals to mitigate sitting discomfort, and compared regression slopes of discomfort feedback before and after interventions. Prepared research findings for industry presentations and academic journal submissions.
Educator at Testbook
July 1, 2023 - December 31, 2023
Data Science faculty teaching Statistics, Machine Learning, Python, Tableau, SQL, and Power BI to remote learners.
Senior Quantitative Analytics Specialist at Wells Fargo
May 1, 2021 - June 30, 2023
Validated ML models for financial services partners. Conducted data integrity checks, data reconciliation tests, surrogate modelling, and model risk ranking. Developed an ML model that analyzes PII data to assign fraud risk scores (0–100) and generate actionable reason codes for email addresses across digital credit card, consumer deposit, and retail POS portfolios, enabling proactive fraud identification. Evaluated and improved model performance using KS, PSI, ROC, and AUC metrics. Built Gradient Boosting challenger models for high-risk transactions. Performed descriptive statistics, EDA, and rapid prototyping to generate actionable business insights.
Senior Data Scientist at Ernst & Young
May 1, 2018 - April 30, 2021
Led the architecture and design of data science and machine learning solutions for clients in the FMCG and Transportation sectors. Managed a team of 5 data scientists. Improved time series forecast accuracy with ARIMA and Prophet in Dataiku. Performed clustering using topic modelling and NLP (Hugging Face). Deployed object detection models (TensorFlow) in Azure Kubernetes, integrating results into Power BI. Designed computer vision solutions for safety and attire compliance using CNNs and end-to-end MLOps pipelines. Automated barcode recognition with Pyzbar, reducing penalties and increasing profitability. Built license plate detection solutions for trucks using Power Apps and Raspberry Pi.
Data Scientist at Concentrix Catalyst Technologies Private Limited (Formerly ProKarma Softech Pvt Ltd)
May 1, 2016 - May 31, 2018
Used Bayesian networks to identify causal variables and predict velocity dips across routes. Applied statistical methods (Z-test, t-test, Central Limit Theorem) to evaluate Trip Optimizer effectiveness on large railroad datasets. Built multiple regression models in Python to predict C-rate. Developed data ingestion, tagging, and cleaning processes using SQL and advanced statistical methods. Automated workflows with Azure ML pipelines. Improved model accuracy and feature selection using Random Forests and Neural Networks for fuel prediction and savings analysis.
Data Scientist at Karvy Analytics Limited
September 1, 2015 - May 31, 2016
Designed recommendation engines using Python and Gradient Boosting for an e-commerce platform, improving product relevance by 25% and reducing false positives in fraud detection by 40%. Automated data preprocessing pipelines with SQL and Azure ML, enhancing operational efficiency for telecom domain datasets.
Data Scientist at Cognizant Technology Solutions
September 1, 2014 - July 31, 2015
Designed the prototype of a rule-based fraud detection engine that estimates the propensity of fraudulent claims in health insurance in order to flag and prioritize suspicious cases. Predicted telecom churn using logistic regression for binary classification.

Education

Master of Science - Applied Statistics & Informatics at Indian Institute of Technology, Bombay, Mumbai
July 1, 2012 - May 1, 2014
Bachelor of Science (B.S.) - Statistics, Mathematics & Computer Science at Banasthali Vidyapith University, Jaipur
June 1, 2009 - May 1, 2012

Qualifications

Introduction to Generative AI Learning Path Specialization
January 1, 2025 - January 7, 2026
Natural Language Processing Specialization
January 1, 2024 - January 7, 2026
Core Designer certificate
January 1, 2020 - January 7, 2026
Deep Learning Specialization
January 1, 2020 - January 7, 2026
Certified Data Science and Machine Learning Essentials
January 1, 2015 - January 7, 2026

Industry Experience

Financial Services, Professional Services, Software & Internet, Transportation & Logistics, Media & Entertainment, Education