Available to hire
I am a data scientist and AI/ML engineer focusing on Generative AI, LLMs, and enterprise-grade AI solutions. I specialize in building LLM-powered applications, RAG pipelines, chatbots, and knowledge assistants using LangChain, Hugging Face Transformers, spaCy, and PyTorch.
I design end-to-end ML and GenAI workflows, from data ingestion and feature engineering to training and production deployment on AWS, Azure, and GCP. I also implement MLOps, model monitoring, explainability, and governance to ensure reliable, auditable AI in regulated environments.
Skills
Experience Level
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate
Language
English
Fluent
Work Experience
Data Scientist – Generative AI / AI-ML at PNC Bank
November 1, 2024 - November 24, 2025
Designed and deployed LLM-based NLP systems using Hugging Face Transformers, LangChain, and spaCy to power real-time digital-banking chatbots via Python FastAPI microservices on Docker/Kubernetes (AWS EKS). Implemented Retrieval-Augmented Generation (RAG) pipelines with PostgreSQL vector stores and LangChain agents to improve conversational accuracy. Developed recommendation models combining collaborative filtering and reinforcement learning, exposed as REST APIs via FastAPI and orchestrated with Airflow for continuous retraining. Automated LLM/NLP model retraining and versioning with MLflow and Airflow; containerized for reproducible CI/CD on Kubernetes. Created feature-engineering pipelines from transactional/behavioral data using Python and Spark; supported data pipelines on AWS Glue and S3. Built sentiment-analysis and text-classification models (RoBERTa, BERT); integrated prompt orchestration with LangChain for real-time insights. Embedded model explainability (SHAP, LIME) into Fa
Data Scientist – AI/ML at Spencer Health Solutions
October 1, 2024 - October 1, 2024
Designed and deployed LLM-based NLP pipelines using Python, BERT, and PyTorch to extract clinical insights from medical literature and patient notes, supporting evidence-driven decision making. Built predictive models for patient adherence and risk stratification using scikit-learn, XGBoost, and PySpark; integrated with real-time EHR data through Azure Machine Learning. Automated feature engineering workflows in Azure Data Factory and Databricks; applied transfer learning and fine-tuning on transformer models. Implemented end-to-end MLOps pipelines with Airflow, MLflow, Docker, and Kubernetes, managing model retraining, deployment, and version control within Azure environments. Engineered feature stores and time-series models in Azure Synapse Analytics for patient monitoring and outcome prediction. Built FastAPI services to expose model predictions as secure REST endpoints, enabling seamless integration with Spencer’s digital health platforms. Incorporated model interpretability fram
AI/ML Engineer at New York Life Insurance
November 1, 2023 - November 1, 2023
Designed and deployed end-to-end ML pipelines in Python (scikit-learn, XGBoost), integrating feature engineering, model training (Random Forest, XGBoost), and SQL-based ETL workflows (PostgreSQL/SQL Server) to automate risk analytics and loss forecasting for underwriting, claims, and fraud analytics. Built and operationalized predictive models (Random Forest, Gradient Boosting, LightGBM, CatBoost) using Azure Machine Learning, enabling automated risk scoring and policyholder churn analysis. Developed Python ETL scripts connecting SQL Server and PostgreSQL databases to Azure ML pipelines, improving data readiness for classification and regression use cases. Employed MLflow for experiment tracking, hyperparameter tuning, and model versioning; maintained audit trails and model lineage across environments. Created AI-powered document and claim
Data Scientist at Siemens
August 1, 2018 - August 1, 2018
Collected, cleaned, and processed IoT sensor and machine log data using Python, Pandas, and SQL for model training and trend analysis. Built and deployed predictive maintenance models using scikit-learn to forecast equipment failures and optimize maintenance schedules. Applied time-series analysis and anomaly detection to identify early warning signals in sensor data streams. Created data visualization dashboards using Tableau and Matplotlib for monitoring equipment health and plant efficiency. Collaborated with production and maintenance teams to validate model results and integrate outputs into real-time monitoring systems. Performed feature engineering on telemetry data and optimized model accuracy using cross-validation and hyperparameter tuning. Automated data ingestion and preprocessing workflows using Python scripts and shell scheduling.
Data Engineer at ICICI Lombard General Insurance
September 1, 2017 - September 1, 2017
Designed and developed ETL pipelines using Python, PySpark, and SQL to process large-scale claim, policy, and premium datasets. Built and automated data ingestion workflows from Oracle and flat-file sources into Hive and Hadoop (MapR) clusters for analytics and reporting. Implemented data quality checks and validation frameworks ensuring accuracy and reliability across claims and underwriting systems. Developed data models and dashboards in Tableau and Power BI to visualize claim ratios, fraud trends, and policy performance metrics. Optimized Spark jobs and SQL queries for high-volume healthcare datasets, improving load performance by 30%. Collaborated with actuarial and risk teams to support predictive modeling for claim probability and renewal forecasting. Automated daily and monthly batch pipelines using shell scripts and schedulers to streamline insurance data refresh cycles. Ensured compliance with data privacy and governance standards (IRDAI) across all analytical datasets.
Education
Qualifications
Industry Experience
Financial Services, Healthcare, Life Sciences