Saffa Samreen

Available to hire

I am a data scientist with 3+ years of experience delivering end-to-end ML solutions across finance, retail, and healthcare. I translate large-scale structured and unstructured data into actionable insights using Python, SQL, and R, building predictive models and recommendation systems that inform business strategy.

I have hands-on experience with MLOps and production deployment (Docker, Kubernetes, MLflow, CI/CD), developing real-time analytics, NLP, and Generative AI solutions (Transformers, RAG, LangChain). I collaborate with cross-functional teams to ship reliable models and dashboards.

Experience Level

Expert

Language

English
Advanced

Work Experience

Data Scientist at Goldman Sachs
August 1, 2025 - Present
Led development of a production-scale Retrieval-Augmented Generation platform integrating LLaMA-3, LangChain, FAISS, and PostgreSQL to process 50M+ research documents for grounded outputs; improved analyst turnaround by 38%. Fine-tuned GPT-4 APIs with PEFT and LoRA to automate memo drafting and compliance workflows, reducing documentation overhead by 41%. Built multi-asset risk and stress-testing models with XGBoost and Monte Carlo simulations to enhance downside risk prediction. Implemented real-time market anomaly detection with PyTorch, Kafka, and Spark Structured Streaming, reducing false positives by 26% and accelerating regulatory alerts by 32%. Created document intelligence pipelines with spaCy and ONNX Runtime for 2M+ filings (90% precision). Established MLOps with MLflow, Docker, Kubernetes, and GitHub Actions for 20+ production models with 99.5% uptime. Delivered Tableau dashboards with Snowflake for risk metrics and AI insights, reducing reporting latency by 27%.
Data Scientist at LTIMindTree
April 1, 2023 - July 1, 2024
Architected end-to-end retail analytics for a large e-commerce client, delivering churn prediction and demand forecasting with XGBoost, LightGBM, CatBoost, and scikit-learn; improved customer retention by 22% and forecast accuracy by 18%, contributing to a 12% quarterly revenue uplift. Built NLP pipelines with Transformer models (BERT, GPT-based) analyzing 2M+ reviews and transcripts, achieving 91% F1 on sentiment and intent classification. Developed a Generative AI-powered retail assistant (LangChain + GPT) for contextual product recommendations and automated content creation, boosting engagement by 30% and conversions by 14%. Designed scalable time-series forecasting pipelines (ARIMA, Prophet) with ensemble methods to optimize inventory, reducing stock-outs by 26% and excess inventory costs by 15%. Deployed ML and Generative AI models via Docker and REST APIs with CI/CD, enabling automated monitoring, drift detection, and retraining, and reducing deployment time by 35%.
Junior Data Scientist at Accenture
January 1, 2022 - March 1, 2023
Performed EDA on large-scale healthcare datasets (EHR, claims) using Python and SQL, improving data quality and consistency. Built predictive pipelines (Logistic Regression, Random Forest, XGBoost) for readmission prediction, achieving a 21% accuracy gain and supporting hospital risk stratification. Implemented NLP on physician notes (NLTK, TF-IDF, Word2Vec; LSTM-based models), achieving 87% F1. Developed CNN-based medical image classification models (pneumonia detection) with 92% validation accuracy. Enabled production-ready ML deployment via REST APIs (Flask/FastAPI) with monitoring; ensured HIPAA-aligned governance in AWS cloud environments.

Education

Master of Science in Data Analytics at University of Illinois Springfield
January 11, 2030 - March 27, 2026

Industry Experience

Financial Services, Retail, Healthcare, Software & Internet, Professional Services
