Dhruv Chaubey

Available to hire

I am a results-driven Data Scientist with five years of expertise in machine learning, NLP, and generative AI across finance, healthcare, and retail sectors. I am proficient in deploying scalable ML models and building pipelines that enable real-time insights and drive impactful decision-making.

I enjoy developing AI-augmented pipelines and integrating generative AI-powered tools to streamline analytics and improve operational efficiency. I am dedicated to leveraging advanced technologies to solve complex business problems and delivering actionable insights that accelerate growth.

Work Experience

Data Science Analyst at Rebecca Everlene Trust Company
July 31, 2025 - August 6, 2025
Enabled self-serve analytics for stakeholders by integrating Generative AI-powered natural language querying into Tableau dashboards, significantly improving reporting turnaround times. Developed AI-augmented autonomous agent pipelines for data ingestion, quality assurance, and real-time reporting, increasing executive decision speed by 25%. Designed RAG pipelines using LangChain and vector databases to enhance contextual search for data needs and policy alignment. Automated ETL workflows using SQL and LLM-based agents to ensure scalable, adaptive data transformation across multiple sources.
Data Scientist at Neurotech R3, Inc
May 31, 2024 - August 6, 2025
Built and deployed a recommendation system on AWS, using SageMaker for model training and Lambda for automated real-time data processing. Streamlined MLOps pipelines by integrating Jenkins CI/CD with AWS SageMaker and Glue, reducing latency by 15% and ensuring production-grade reliability. Developed ML observability dashboards using Prometheus and Grafana to monitor model drift, latency, and feature importance. Configured Lambda functions to trigger models on S3 bucket uploads and save predictions back to S3.
Research Assistant at NJIT
May 31, 2024 - August 6, 2025
Designed BERT models to predict loan outcomes from credit data, focusing on architecture, experimentation, and extracting financial insights. Developed fine-tuned NLP systems using FinBERT and Longformer models to extract entities and relationships from financial texts, validated with statistical testing.
Data Scientist at IBM
May 31, 2022 - August 6, 2025
Defined project scope and timelines, coordinated stakeholders, and deployed solutions for data engineering projects, resulting in measurable business impact. Engineered PySpark pipelines processing 500 GB/day of risk data and integrated Kafka streams with exactly-once semantics. Designed DBSCAN and Isolation Forest anomaly detectors generating real-time labels for financial transactions, reducing fraud by 12%. Applied CUDA kernel fusion to fraud detection CNNs, improving throughput by 35%. Trained churn prediction and revenue estimation models using LightGBM and XGBoost on user data, deployed via AWS SageMaker with hyperparameter tuning. Developed deep learning models (CNN, RNN, LSTM) for credit risk forecasting using structured transaction and time-series data. Collaborated with data engineers and security teams to build ML registries using MLflow for audit-ready AI deployments.

Education

MSDS at New Jersey Institute of Technology, NJ
September 1, 2022 - May 1, 2024
BS at Uttar Pradesh Technical University
August 1, 2013 - June 1, 2017

Industry Experience

Financial Services, Healthcare, Retail, Software & Internet
