Hi, I’m Soumya Baddham, a data scientist with 4+ years of experience in machine learning, NLP, time-series forecasting, and GenAI. I enjoy turning complex data into practical solutions that empower teams and drive business outcomes.

I’ve built production pipelines, NLP models, and MLOps workflows, and I love delivering insightful dashboards that make data accessible to stakeholders. I’m open to new opportunities and to relocating as needed.

Soumya Baddham

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate

Language

English
Fluent

Work Experience

Data Scientist at Morgan Stanley USA
November 1, 2024 - Present
Led the development of large-scale financial data pipelines (PySpark, AWS Glue, and Athena), restructuring ETL flows and enforcing schema validation to improve intraday data availability by 38% and meet regulatory data quality standards. Built a production-ready Retrieval-Augmented Generation (RAG) workflow with LangChain and GPT-4, reducing analyst research time by 55% and increasing the accuracy of investment insights through enhanced retrieval precision. Developed transformer-based NLP models (RoBERTa) and Named Entity Recognition (NER) to extract counterparty, transaction, and risk-related entities from unstructured reports, boosting extraction accuracy from 82% to 94%. Implemented real-time trade-surveillance analytics using AWS Kinesis and TensorFlow, enabling proactive anomaly detection and reducing false alerts by 22%. Strengthened MLOps with model versioning, drift checks, and CI/CD pipelines, reducing release cycles by 40% while ensuring regulatory compliance. Produced intera
Jr. Data Scientist at Zensar Technologies India
January 1, 2020 - December 1, 2022
Built supervised ML models (Decision Trees, Random Forests, SVM, Naive Bayes, XGBoost) to predict customer churn and support-ticket patterns, achieving up to 92% accuracy and enabling early retention interventions. Developed time-series forecasting models using ARIMA and Facebook Prophet to project weekly demand, reducing forecast error by 27%. Led Exploratory Data Analysis (EDA) and hypothesis testing on 10M+ transactional records to identify revenue leakage and process gaps, contributing to an 18% improvement in operational efficiency. Created CNN-based image classification pipelines using Keras and PyTorch to achieve 96% precision on defect-detection tasks for a manufacturing client, reducing manual quality checks by 40%. Automated ETL pipelines in Python (NumPy, Pandas) for weekly operations, reducing processing time from 3 hours to 20 minutes, and designed Power BI dashboards to cut manual reporting time by 50%.

Education

Master's in Computer Science at University of Kansas
January 11, 2030 - December 2, 2025

Industry Experience

Financial Services, Professional Services