Available to hire
I’m a Data Engineer with over 5 years of experience building data pipelines, cloud platforms, and real-time analytics. I love solving data problems and turning complex datasets into reliable, scalable solutions.
I enjoy working with Python, PySpark, and Databricks, and have hands-on experience across AWS, Azure, and GCP. I’ve designed and optimized ETL workflows with Airflow, DBT, and Terraform, and worked with Snowflake, BigQuery, and SQL for modeling and reporting. I’m familiar with Kafka, Flink, and Kinesis for streaming data, and I focus on creating clean, maintainable data systems that empower analytics and machine learning.
Skills
Experience Level: Expert (22 skills), Intermediate (3 skills)
Language
English
Fluent
Work Experience
Data Engineer at Johnson & Johnson
September 1, 2023 - Present
Designed scalable ETL pipelines using PySpark and Databricks, processing over 5TB of healthcare data daily from medical sources. Implemented real-time ingestion and analytics pipelines leveraging Flink, Kafka, and Kinesis; containerized microservices with Docker and Kubernetes; automated infrastructure provisioning with Terraform and CloudFormation. Ensured data security via IAM/RBAC and encryption. Built modular data pipelines with Airflow, DBT, and data-quality tooling to enable rapid data product delivery and governance across distributed health systems.
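For illustration, a stripped-down version of this kind of PySpark batch job might look like the sketch below; the bucket paths, dataset, and column names are hypothetical stand-ins, not the actual pipeline.

```python
# Hypothetical sketch only: paths, dataset, and columns are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("healthcare_batch_etl").getOrCreate()

# Read a day's worth of raw records landed by an upstream ingestion job.
raw = spark.read.json("s3://example-raw-zone/claims/2024-01-01/")

# Basic cleansing: drop duplicate claims, normalise the event timestamp,
# and keep only rows with a positive amount.
cleaned = (
    raw.dropDuplicates(["claim_id"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .filter(F.col("claim_amount") > 0)
)

# Write curated output partitioned by day for downstream analytics and ML.
(
    cleaned.withColumn("event_date", F.to_date("event_ts"))
           .write.mode("overwrite")
           .partitionBy("event_date")
           .parquet("s3://example-curated-zone/claims/")
)
```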
Data Engineer at Prudential Financial
October 1, 2021 - August 1, 2023
Designed scalable ETL pipelines using Azure Data Factory, Matillion, and Apache Airflow, automating ingestion from RDBMS sources and APIs into Azure Data Lake and BigQuery. Built modular data processing workflows with Databricks notebooks (PySpark) to process insurance and financial data across 10+ business domains, and migrated legacy batch jobs to Spark-based workflows. Migrated on-prem workloads (SQL Server, Oracle, MongoDB) to cloud-native stores (Snowflake, BigQuery), reducing infrastructure costs by roughly 40%. Supported hybrid cloud adoption by integrating existing Azure pipelines with new datasets in GCP and event triggers via Cloud Pub/Sub. Developed data quality rules with Informatica Data Quality, automated validation checks with PySpark, and applied RBAC and data masking to support HIPAA and GDPR compliance. Modeled data using Star and Snowflake schemas in Snowflake and Azure Synapse Analytics for KPI reporting. Enabled data mesh practices with Azure Data Factory, DBT, and Power BI; built dashboards in Power BI and Looker; integrated Kafka and Azure Event Hubs for hybrid streaming; standardized SQL transformations with DBT; and scheduled batch and incremental jobs with Matillion and Airflow.
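As a rough sketch of the kind of PySpark-based validation checks mentioned above (the table location, column names, and rules are assumptions for illustration, not the production rule set):

```python
# Hedged sketch of simple PySpark data quality checks; dataset and rules are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()

# Load the curated table to validate (path is an assumption).
policies = spark.read.parquet("abfss://curated@example.dfs.core.windows.net/policies/")

failures = []

# Rule 1: the primary key must be unique.
dupes = policies.groupBy("policy_id").count().filter(F.col("count") > 1).count()
if dupes > 0:
    failures.append(f"{dupes} duplicate policy_id values")

# Rule 2: mandatory fields must not be null.
for col_name in ["policy_id", "customer_id", "effective_date"]:
    nulls = policies.filter(F.col(col_name).isNull()).count()
    if nulls > 0:
        failures.append(f"{nulls} null values in {col_name}")

# Fail the pipeline step if any rule is violated, so downstream loads do not run.
if failures:
    raise ValueError("Data quality checks failed: " + "; ".join(failures))
```

In practice a check like this would typically run as a gating task in the orchestrator (for example an Airflow task) ahead of the downstream loads.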
Data Engineer at MetLife
August 1, 2020 - August 31, 2021
Built scalable ETL pipelines using PySpark and AWS Glue to ingest and transform healthcare data from EHR systems, eligibility databases, and claim feeds. Implemented batch and streaming data workflows with Kafka, Spark Structured Streaming, and Kinesis Firehose for real-time claims ingestion. Migrated legacy ETL scripts to modern Spark-based pipelines and performed performance tuning. Wrote transformation logic in dbt for analytics layers in Snowflake and Redshift. Created dashboards in Tableau and Power BI, and automated data quality checks with Great Expectations and Python-based validations. Participated in CI/CD with Jenkins, GitHub Actions, and Terraform; managed schema evolution with the AWS Glue Data Catalog and used tagging for lineage visibility. Built basic API integrations with Flask and containerized services with Docker. Monitored workloads with CloudWatch and Grafana, contributed to data governance initiatives (RBAC, IAM, KMS encryption), and supported SageMaker model integration.
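A minimal sketch of the Kafka-to-Spark Structured Streaming pattern used for real-time claims ingestion follows; the broker address, topic name, schema, and output paths are illustrative assumptions rather than the actual deployment.

```python
# Hypothetical sketch of streaming claims ingestion; brokers, topic, schema, and paths are assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("claims_streaming").getOrCreate()

# Expected shape of a claim event payload (illustrative).
claim_schema = StructType([
    StructField("claim_id", StringType()),
    StructField("member_id", StringType()),
    StructField("claim_amount", DoubleType()),
    StructField("submitted_at", StringType()),
])

# Consume raw claim events from a Kafka topic.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "claims-events")
         .load()
)

# Kafka delivers the payload as bytes in the `value` column; parse it as JSON.
claims = (
    events.select(F.from_json(F.col("value").cast("string"), claim_schema).alias("claim"))
          .select("claim.*")
          .withColumn("submitted_at", F.to_timestamp("submitted_at"))
)

# Append parsed claims to a curated location, with checkpointing for fault tolerance.
query = (
    claims.writeStream.format("parquet")
          .option("path", "s3://example-curated-zone/claims_stream/")
          .option("checkpointLocation", "s3://example-curated-zone/_checkpoints/claims_stream/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```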
Education
Master of Science: Business Analytics at University of New Haven
Qualifications
Azure Data Engineer Associate
AWS Certified Data Engineer - Associate
Industry Experience
Healthcare, Life Sciences, Financial Services, Software & Internet, Professional Services, Education