Hemanth Tondur

Available to hire

I'm a Senior Data Scientist with 9+ years of hands-on experience delivering AI/ML solutions across regulated and high-volume domains. I translate business needs into scalable data products, covering data pipelines, model development, production deployment, and monitoring. I work closely with product, actuarial, and engineering teams to turn strategic objectives into robust data-driven outcomes.

I focus on governance, reproducibility, and explainability in production systems, and I continue to explore newer approaches such as transformers and retrieval-augmented generation to deliver business value within risk controls.

Experience Level

Expert

Language

English: Fluent
Afar: Advanced
Bashkir: Advanced

Work Experience

Senior Data Scientist – AI/ML at Fifth Third Bank, Cincinnati, OH
December 1, 2023 - Present
Developed multi-objective underwriting risk models using Python and Spark on Databricks to optimize loss ratio and premium yield simultaneously. Architected hybrid batch/stream feature pipelines enabling synchronized training and real-time inference. Implemented graph-based fraud analytics using Neo4j, reducing false positives. Deployed transformer NLP services for claim narrative triage. Applied explainability and fairness techniques to ensure compliance. Migrated legacy ML jobs to Azure Databricks, improving batch processing. Established MLflow Registry and DVC artifact lineage to standardize workflows. Used Ray Tune to accelerate hyperparameter tuning on Kubernetes. Implemented real-time anomaly detection, composite forecasting, and optimized model serving with BentoML and Kubernetes autoscaling. Built retrieval-augmented generation prototypes and a multi-armed bandit experimentation platform. Managed infrastructure as code with Terraform and Helm. Integrated data validation and conti
Machine Learning Engineer at Navi health, Brentwood, TN
December 1, 2023 - July 31, 2025
Implemented credit default scoring pipelines using CatBoost and XGBoost on AWS EMR and BigQuery ML, significantly reducing portfolio risk. Engineered auto-scaling Spark Structured Streaming jobs with Kafka ingestion for near-real-time behavioral features. Built containerized model gateway with canary deployments on GKE using Argo Rollouts. Centralized experiment tracking with MLflow and MinIO. Developed explainability dashboards for compliance analysts. Designed Airflow and DBT ELT architecture, improving job durations. Implemented data quality checks blocking invalid payloads. Optimized hyperparameter tuning with Optuna distributed on Kubernetes. Established GitOps workflow for deployment parity. Transitioned batch scoring to streaming inference, reducing fraud risk notification latency. Integrated SageMaker processing for large SHAP computations. Performed adversarial robustness testing on NLP models. Developed uplift modeling frameworks guiding retention campaigns. Captured lineage
Machine Learning Engineer at First Republic Bank, San Francisco, CA
September 1, 2021 - July 31, 2025
Built demand forecasting systems combining Prophet, ARIMA, and gradient boosting ensembles, stabilizing supply planning. Developed CNN and BiLSTM models for image and sequence classification on Azure Kubernetes Service. Implemented MLflow tracking with S3 backend for reproducibility and rollback. Engineered PySpark feature pipelines optimizing runtime and cost. Created low-latency sentiment analysis services powering customer experience dashboards. Optimized inference using ONNX and TensorRT. Established CI/CD templates with automated vulnerability scans. Integrated Kafka and Spark Streaming for incremental feature updates. Enforced comprehensive data validation gating model training. Authored modular training libraries enhancing team productivity. Developed A/B experimentation frameworks for uplift measurement. Implemented Prometheus and Grafana monitoring for production services. Conducted cost optimization and applied privacy-preserving transforms. Facilitated architecture and code reviews esta
Data Scientist at Impetus Technologies, Hyderabad, India
August 31, 2019 - July 31, 2025
Prototyped classification and regression models informing pricing and retention strategies. Conducted exploratory data analysis with visual profiling. Authored SQL and HiveQL scripts for ETL pipeline ingestion and partitioned Hive tables. Built ETL workflows integrating Sqoop and delta imports using Oozie and Python. Applied clustering to derive customer segments guiding engagement. Created Tableau dashboards for churn and lifetime value KPIs. Applied time series forecasting to anticipate demand. Automated nightly scoring scripts and migrated to Airflow DAGs. Developed model evaluation reports ensuring governance. Optimized Hive queries reducing latency. Documented experiments and data dictionaries for transparency. Maintained Git workflows and code reviews. Integrated Flask APIs for prototype model serving. Wrote data validation tests using Great Expectations. Collaborated on feature engineering heuristics and sampling strategies. Supported pipeline triage improving SLA adherence.
Junior Data Scientist at CSS Corp Pvt Ltd., Chennai, India
April 1, 2017 - July 31, 2025
Cleaned and standardized multi-source operational data for analytics readiness. Modeled relational schemas and optimized SQL Server queries for reports. Developed SSIS ETL packages automating daily ingestion from multiple sources. Authored SSRS executive dashboards enabling KPI transparency. Developed QlikView visual apps for sales and inventory insights. Created reusable stored procedures supporting ad-hoc reporting. Diagnosed ETL anomalies reducing resolution time. Produced financial variance analyses for leadership. Gathered KPI requirements translating into semantic layer designs. Version-controlled SQL and ETL assets. Documented data flow diagrams and business definitions. Trained end users on BI tools. Implemented monitoring and alerting for job health. Generated weekly data quality scorecards driving remediation. Presented actionable insights informing business decisions.
Sr. Data Scientist – AI/ML at Fifth Third Bank
December 1, 2023 - Present
Led multi-objective underwriting risk modeling (Python, scikit-learn, LightGBM) on Databricks to balance loss ratio and premium yield. Managed large-scale hyperparameter tuning with Ray Tune on Kubernetes (EKS) and maintained observable AI services (SLOs/SLIs) via Prometheus/Grafana. Architected hybrid batch/stream feature pipelines (Airflow, Kafka, Feast) for synchronized training and real-time inference. Implemented graph-based fraud analytics with Neo4j GDS and deployed NLP services (Hugging Face, FastAPI, ONNX Runtime) for claim narrative triage, achieving substantial response-time reductions. Applied SHAP and Fairlearn to quantify feature impact and demographic parity for governance. Migrated legacy scoring jobs to Azure Databricks with Delta Lake to reduce batch duration. Established MLflow Registry and DVC for model reproducibility. Implemented real-time anomaly detection with Flink SQL and Kinesis Data Str
Machine Learning Engineer at Navi health
December 1, 2023 - September 24, 2025
Implemented credit default scoring pipelines using CatBoost/XGBoost on AWS EMR and BigQuery ML; built near-real-time behavioral features via Spark Structured Streaming with Kafka. Created containerized model gateway (FastAPI, NGINX, Docker) on GKE with canary releases managed by Argo Rollouts. Centralized experiment/artifact tracking with MLflow + MinIO; introduced SHAP-based explainability dashboards for governance. Migrated ETL/ELT to modular DBT + Airflow pipelines with Great Expectations data contracts. Conducted adversarial robustness testing, uplift modeling, and OpenLineage/Marquez lineage capture for end-to-end observability. Implemented secret management (Vault) and CI/CD (GitLab CI); deployed SageMaker Processing for SHAP calculations; built a governance portal for registry status and drift indicators. Improved prediction latency and self-service analytics via BigQuery BI Engine; used SMOTE/CTGAN for minority augmentation; promoted responsible
Machine Learning Engineer at First Republic Bank
September 1, 2021 - September 24, 2025
Built demand forecasting architecture combining Prophet, ARIMA, and gradient-boosting ensembles to stabilize supply planning. Developed TensorFlow CNN and BiLSTM models for image/sequence classification, deployed via Azure/Kubernetes inference layer. Implemented MLflow tracking with S3 backend and artifact versioning for rapid rollback. Refactored PySpark feature pipelines with window functions to minimize runtime. Created low-latency sentiment analysis service (spaCy, Transformers) powering live dashboards. Enabled CI/CD templates for containerized services and integrated Kafka + Spark Structured Streaming for incremental feature updates. Conducted model monitoring with Prometheus/Grafana; implemented A/B testing and monitoring of drift with Evidently. Delivered synthetic minority oversampling (SMOTE, CTGAN) to address dataset imbalance and led governance initiatives with OpenLineage/Marquez integration.
Machine Learning Engineer at Impetus Technologies
August 1, 2019 - September 24, 2025
Implemented forecasting (Prophet, ARIMA) and ensemble methods for demand planning; built CNN/BiLSTM models for sequence/image tasks. Created scalable PySpark feature pipelines with Kafka + Faust for real-time scoring. Centralized experiment tracking with MLflow + MinIO; established SHAP dashboards for explainability. Replaced monolithic ETL with modular Airflow + DBT ELT workflows; added data quality gates with Great Expectations. Optimized hyperparameter search with Optuna on Kubernetes and implemented GitOps (GitLab CI, Terraform, Helm) for parity across environments. Integrated SageMaker Processing for large-scale SHAP computations and conducted uplift analyses with causal models (Causal Forest, XLearner). Implemented lineage capture (OpenLineage, Marquez) and built governance portals showing drift and registry status.
Data Scientist at CSS Corp Pvt Ltd.
April 1, 2017 - September 24, 2025
Prototyped forecasting and ML-driven pricing/retention hypotheses; built sentiment analysis service (spaCy, Transformers) for real-time dashboards. Developed PySpark feature pipelines with Delta Lake optimization; implemented end-to-end ML lifecycle with MLflow tracking and artifact versioning. Created modular training libraries and serving layers (REST APIs) with FastAPI/Flask; migrated batch scoring to streaming via Kafka + Faust. Implemented data quality controls with Great Expectations and JSON Schema; established CI/CD templates for containerized services and vulnerability scanning. Delivered Tableau dashboards and self-service analytics; facilitated cross-team collaboration and documented runbooks for incident response.
Junior Data Scientist at Impetus Technologies
August 31, 2015 - September 24, 2025
Notable early work included exploratory data analysis, feature engineering, and building baseline models (scikit-learn) supported by ETL processes in Hadoop/HDFS. Delivered SQL-based reporting, implemented ETL with Oozie, and created dashboards for marketing and operations stakeholders. Contributed to model governance with validation reports and notebooks; supported data quality initiatives and collaborated with cross-functional teams to translate business needs into analytical solutions.

Education

Bachelor of Technology (B.Tech) in Information Technology at JNTUH, Hyderabad, Telangana, India
January 1, 2012 - December 31, 2016

Industry Experience

Financial Services, Healthcare, Software & Internet, Professional Services, Computers & Electronics