I am a Senior Data Scientist with around 9 years of experience in data science, specializing in data mining, cleaning, visualization, and building robust statistical and machine learning models to drive business decisions. I am proficient in Python, R, SAS, and SQL, and have hands-on experience with big data tools and cloud platforms. I enjoy collaborating with cross-functional teams to translate data insights into actionable solutions. Over the years, I have contributed to projects in the financial services, healthcare, and software domains, drawing on a strong foundation in statistical methodologies, machine learning algorithms, and data engineering practices. My passion lies in developing scalable analytical pipelines and deploying AI-driven models to solve complex business challenges while ensuring data quality and compliance.

Pavithra

Available to hire

Language

English
Fluent

Work Experience

Senior Data Scientist at Truist Financial
April 1, 2024 - Present
Responsible for analyzing large datasets to develop custom models and algorithms that drive innovative business solutions. Handled data profiling and data cleansing, including imputation and anomaly detection. Performed market basket analysis and identified statistical trends using advanced statistical and actuarial methods. Executed data migration from SQL Server to Snowflake using Python and SnowSQL. Managed API/Kafka data staging in Snowflake. Applied feature selection and dimensionality reduction techniques to improve model performance. Evaluated models using RMSE, confusion matrices, ROC curves, cross-validation, and A/B testing. Executed exploratory data analysis and developed interactive Tableau dashboards. Led AI model deployment on Azure, ensuring smooth integration and compliance. Conducted sentiment analysis on web comments and created SAS queries for reporting. Collaborated with sales and marketing teams in an Agile environment to prototype and integrate machine learning algorithms using
Senior Data Scientist at Johnson & Johnson
March 31, 2024 - July 22, 2025
Designed and developed data ingestion, aggregation, and integration processes within the Hadoop environment. Developed Sqoop scripts for incremental data loading from relational sources. Provided SQL Server DBA support and developed stored procedures for premium and claims auditing. Built a normalized database in MS Access for new projects. Developed and deployed generative AI applications using Bedrock's pre-trained foundation models and trained machine learning models with SageMaker. Supervised team activities and vendor management in big data analytics. Automated machine learning pipelines using SageMaker Pipelines. Created predictive models for customer segmentation and applied hypothesis testing for product validation. Built classification and regression models with TensorFlow, XGBoost, and LightGBM. Managed generative AI LLMs for risk assessment using PySpark. Conducted A/B testing and statistical validation using R. Performed statistical analyses and reporting.
Data Scientist at Paradigm
July 31, 2021 - July 22, 2025
Cleaned and manipulated complex datasets for analysis and insights using MS SQL Server, R, Tableau, and Excel. Applied machine learning algorithms including decision trees, regression models, SVM, and clustering with scikit-learn. Conducted data preprocessing, feature engineering, and imputation. Processed structured and unstructured data with Apache Spark and Hadoop. Performed statistical analyses such as linear regression and ANOVA. Extracted big data from diverse sources into Hadoop HDFS. Utilized Python libraries for machine learning model development. Applied clustering and dimensionality reduction techniques to identify customer segments. Designed recommendation systems. Created interactive reports and dashboards, and optimized data quality and collection procedures. Employed version control and contributed to data science standards. Worked with multiple data formats and evaluated model performance with various metrics.
Data Scientist at Solugenix
December 31, 2018 - July 22, 2025
Developed data pipelines, converting disparate data sources into structured forms using SQL. Handled large volumes of customer data with 20+ features. Performed exploratory data analysis to reveal correlations, patterns, and trends using Python libraries. Conducted dimensionality reduction with PCA and feature selection. Built scalable ETL pipelines for data automation. Created dashboards and visualizations with Tableau and Power BI. Collaborated with business stakeholders to define KPIs. Trained deep learning models for image classification and NLP tasks. Applied classification models including Logistic Regression, Decision Trees, Random Forest, XGBoost, and SVM. Used SMOTE for class imbalance and performed k-fold cross-validation. Designed end-to-end data analytics and automation systems with R, Tableau, and Power BI. Analyzed customer needs with time-series models.
Senior Data Scientist at Truist Financial
April 1, 2024 - Present
Responsible for analyzing large datasets to develop custom models and algorithms to drive innovative business solutions. Conducted data profiling and anomaly handling, feature selection, and dimensionality reduction. Developed interactive dashboards and created reports using Tableau. Oversaw deployment and validation of AI models in production Azure environments for underwriting solutions. Implemented data migration from SQL Server to Snowflake and staged API data using Python. Conducted sentiment analysis and worked with various ML algorithms, including Spark MLlib. Collaborated with cross-functional teams to meet business objectives in an Agile environment.
Senior Data Scientist at Johnson & Johnson
March 31, 2024 - August 5, 2025
Designed and developed data ingestion and integration in Hadoop. Developed Sqoop scripts for data import/export and incremental loading. Provided SQL Server DBA support and built normalized databases in MS Access. Developed and deployed generative AI applications using Bedrock and machine learning models with SageMaker. Supervised teams and managed vendors for big data analytics. Built ML pipelines and models for customer segmentation using Python, R, and various ML libraries. Conducted A/B testing and statistical data analysis to validate features and product changes. Worked closely with data and product teams to deliver insights and optimized model performance.
Data Scientist at Paradigm
July 31, 2021 - August 5, 2025
Cleaned and manipulated complex datasets for further analysis. Applied various machine learning and statistical algorithms using Python, R, and SQL Server. Processed structured and unstructured data using Apache Spark and Hadoop. Developed customer segmentation using clustering and dimensionality reduction techniques. Created recommendation systems and interactive dashboards. Collaborated with product managers, engineers, and analysts to define data requirements. Ensured data quality and maintained reproducible workflows using Git. Evaluated model performance with various metrics and incorporated metrics to drive business decisions.
Data Engineer at Solugenix
December 31, 2018 - August 5, 2025
Designed and implemented scalable ETL pipelines with PySpark, SQL, and Apache Airflow for healthcare data ingestion and transformation. Automated data ingestion using AWS Glue, Lambda, and S3. Built and maintained dimensional data models in Snowflake and Redshift. Optimized large-scale data queries and created healthcare-focused data marts. Developed distributed batch and real-time data processing pipelines using Spark and Hadoop. Deployed pipelines on AWS and implemented CI/CD pipelines using GitLab CI and Terraform. Ensured data quality and pipeline reliability through monitoring and alerting tools. Partnered across teams to deliver datasets for analytics and compliance reporting.
Senior Data Scientist at Truist Financial
April 1, 2024 - Present
Responsible for analyzing large datasets and developing custom models and algorithms to drive innovative business solutions. Handled data profiling, data cleaning, and market basket analysis, and applied advanced statistical and actuarial methods. Implemented data migration from SQL Server to Snowflake, staged API/Kafka data using Python and SnowSQL, and performed feature selection and dimensionality reduction. Conducted exploratory data analysis, developed interactive Tableau dashboards, deployed AI models in production on Azure, and conducted testing and validation of AI outputs. Used machine learning algorithms with Spark MLlib and Python for various business needs.
Senior Data Scientist at Johnson & Johnson
March 31, 2024 - August 22, 2025
Designed and developed data ingestion, aggregation, and integration in Hadoop environments. Developed Sqoop scripts for incremental loading and performed data analysis using complex SQL queries. Provided DBA support, created normalized databases, and deployed scalable generative AI applications using AWS Bedrock foundation models. Built and trained ML models using SageMaker, TensorFlow, XGBoost, and LightGBM. Led A/B testing and statistical validation, and developed generative AI risk assessment models using PySpark. Provided guidance and supervision to teams and presented reports to leadership.
Data Scientist at Paradigm
July 31, 2021 - August 22, 2025
Cleaned and manipulated complex datasets for analysis and insights. Applied machine learning algorithms including decision trees, regression, SVM, and clustering. Processed structured and unstructured data using Apache Spark and Hadoop. Conducted statistical analyses and built recommendation systems using collaborative filtering. Created interactive reports and dashboards, maintained reproducible workflows, and contributed to data science standards. Evaluated models using various metrics and participated in research on new analytical technologies.
Data Engineer at Solugenix
December 31, 2018 - August 22, 2025
Designed and implemented scalable ETL pipelines with PySpark, SQL, and Apache Airflow for healthcare data analytics. Automated data ingestion using AWS Glue, Lambda, and S3. Developed reusable data transformation components and dimensional data models in Snowflake and Redshift. Optimized large-scale queries and managed healthcare data marts. Processed terabytes of claims and EMR data with Hive and Impala. Deployed end-to-end data pipelines on AWS, implemented CI/CD pipelines, and ensured data quality and pipeline reliability. Collaborated with teams to deliver high-impact datasets and established data governance practices.

Industry Experience

Financial Services, Healthcare, Professional Services, Software & Internet, Consumer Goods