Shreyansh Kumar

I am a Data Scientist with close to 3 years of experience applying machine learning to real-world business challenges, including credit risk and forecasting. I am proficient in Python, SQL, Scikit-learn, TensorFlow, and cloud platforms such as AWS and GCP. I pride myself on clear communication, statistical rigor, and translating models into actionable business insights. I excel at turning complex analyses into practical, measurable outcomes and enjoy building scalable ML pipelines that drive impact across education, consulting, and analytics projects.

Available to hire

Experience Level

Expert

Language

English
Fluent

Work Experience

AI/ML Engineer at Brex
October 1, 2024 - Present
Developed an AI-powered expense intelligence system offering personalized spend insights, fraud detection, and budget optimization tools for corporate clients. Collaborated with cross-functional teams and data engineers to preprocess transactional data stored on AWS S3, achieving 98% dataset completeness. Built reinforcement learning models in TensorFlow and OpenAI Gym to optimize real-time budget recommendations, and reached an F1-score of 0.87 in fraud detection. Engineered feature pipelines, including user segmentation and temporal spend pattern analysis, improving fraud-detection AUC by 12%. Used AWS SageMaker for scalable training and Bayesian optimization for fine-tuning, leading to a 25% increase in usage of smart spend insights. Validated models for regulatory compliance and deployed them using Docker, Kubernetes, and Flask APIs for integration into web and mobile platforms.
ML Engineer at Barclays
December 31, 2021 - August 26, 2025
Built a credit risk scoring system using ensemble models and Temporal Graph Convolutional Networks (T-GCNs), boosting AUC by 29% and improving high-risk borrower recall by 40%. Developed an end-to-end data pipeline with Azure Data Factory and Apache Spark on Azure Databricks, processing millions of transactions daily with 99.5% reliability. Created custom T-GCNs with PyTorch Geometric and DGL, conducted distributed training on Azure ML, and applied rolling-window aggregations and graph embeddings. Reduced latency by 35% with time-series cross-validation and hyperparameter tuning using Optuna. Deployed models on Azure Kubernetes Service using Docker and integrated CI/CD pipelines. Developed Grafana dashboards for real-time monitoring and model explainability.
Data Scientist at Nexdigm
August 1, 2022 - June 1, 2023
Led the development of a Python-based time-series forecasting tool using ARIMA, SARIMA, SARIMAX, VARMAX, and RNN models, reducing exploratory forecasting time by 70% through automation and streamlined workflows. Automated model selection and hyperparameter tuning (50+ iterations per dataset), cutting project timelines from 3 weeks to 4 days. Improved model accuracy by 12% via iterative tuning and feature engineering, enabling reliable predictions and data-driven stakeholder decision-making. Boosted client engagement by 15% through timely predictive insights.
Lead Data Analyst at Chicago Education Advocacy Cooperative (CHiEAC)
April 1, 2024 - Present
Designed an AI-powered, multi-agent tutoring platform that replaces static study material with adaptive explanations, quizzes, and revision workflows, delivering a 30% improvement in learner outcomes and 35% higher student engagement. Architected a scalable retrieval and reasoning pipeline with recursive content chunking, optimized embeddings, and metadata-aware vector search, boosting answer relevance and reducing topic-level confusion by 42%. Built a feedback-driven prompt tuning and evaluation loop that adapts to repeated learner queries, minimizing teacher intervention while aligning improvements with measurable academic outcomes. Developed an end-to-end intelligent document processing system combining CV enhancements, OCR, and hybrid rule-based and LLM extraction, achieving 100% processing success, 98% extraction accuracy gains, and a 90% reduction in API costs through intelligent method selection.
Data Analyst Intern at TransOrg Analytics
February 1, 2022 - June 1, 2022
Predicted sales using advanced time-series models (ARIMA, Prophet) to guide inventory decisions, resulting in a 12% reduction in overstock and a 9% decrease in stockouts. Enhanced product discovery and user engagement by implementing semantic search powered by embedding models, increasing average session duration by 30% and boosting interaction across key product categories. Increased repeat purchases by 18% by developing dynamic customer segments using clustering (K-Means, DBSCAN) and deploying personalized email campaigns based on user behavior.

Education

Master of Science in Applied Data Analytics at Boston University
September 1, 2023 - January 31, 2025
Bachelor of Technology (B.Tech) in Computer Science at Bennett University
January 1, 2019 - January 1, 2023

Qualifications

AWS Academy Cloud Foundations
Microsoft Certified: Azure AI Fundamentals
Introduction to Data Science
Machine Learning with Python
Microsoft AZ-900
Microsoft AI-900
Oracle Cloud Infrastructure 2021 (Cloud Operations Associate & Foundations Associate)
IBM Developer Skills Network Certifications (Data Science, Machine Learning)

Industry Experience

Financial Services, Professional Services, Software & Internet, Education