Available to hire
Hi, I’m Soumya B Addham, a data scientist with 4+ years of experience in machine learning, NLP, time-series forecasting, and GenAI. I enjoy turning complex data into practical solutions that empower teams and drive business outcomes.

I’ve built production pipelines, NLP models, and MLOps workflows, and I love delivering insightful dashboards that make data accessible across stakeholders. I’m open to new opportunities and relocating as needed.
Skills
Language
English
Fluent
Work Experience
Data Scientist at Morgan Stanley USA
November 1, 2024 - Present
Led the development of large-scale financial data pipelines (PySpark, AWS Glue, and Athena), restructuring ETL flows and enforcing schema validation to improve intraday data availability by 38% and meet regulatory data quality standards. Built a production-ready Retrieval-Augmented Generation (RAG) workflow with LangChain and GPT-4, reducing analyst research time by 55% and increasing the accuracy of investment insights through enhanced retrieval precision. Developed transformer-based NLP models (RoBERTa) and Named Entity Recognition (NER) to extract counterparty, transaction, and risk-related entities from unstructured reports, boosting extraction accuracy from 82% to 94%. Implemented real-time trade-surveillance analytics using AWS Kinesis and TensorFlow, enabling proactive anomaly detection and reducing false alerts by 22%. Strengthened MLOps with model versioning, drift checks, and CI/CD pipelines, reducing release cycles by 40% while ensuring regulatory compliance. Produced intera
Jr. Data Scientist at Zensar Technologies India
January 1, 2020 - December 1, 2022
Built supervised ML models (Decision Trees, Random Forests, SVM, Naive Bayes, XGBoost) to predict customer churn and support-ticket patterns, achieving up to 92% accuracy and enabling early retention interventions. Developed time-series forecasting models using ARIMA and Facebook Prophet to project weekly demand, reducing forecast error by 27%. Led Exploratory Data Analysis (EDA) and hypothesis testing on 10M+ transactional records to identify revenue leakage and process gaps, contributing to an 18% improvement in operational efficiency. Created CNN-based image classification pipelines using Keras and PyTorch to achieve 96% precision on defect-detection tasks for a manufacturing client, reducing manual quality checks by 40%. Automated ETL pipelines in Python (NumPy, Pandas) for weekly operations, reducing processing time from 3 hours to 20 minutes, and designed Power BI dashboards to cut manual reporting time by 50%.
Education
Master's in Computer Science at University of Kansas
January 11, 2030 - December 2, 2025
Qualifications
Industry Experience
Financial Services, Professional Services