Jim Marczyk

Experience Level

Expert

Work Experience

Senior Data Engineer - AWS FinOps Specialist at Boston Scientific
March 1, 2024 - December 1, 2025
Led ROI-focused cloud cost optimization for Global Supply Chain and MES teams. Rightsized EC2 and RDS Oracle instances using Compute Optimizer, TCS, and DBCSI, reducing monthly costs by over 60%. Automated stopping and starting EC2 and RDS instances around shifts using Python, Lambda, and EventBridge. Secured over $1M in RDS cost savings through 3-year partial-upfront Reserved Instances. Used PySpark to compute spend per site and to group assets by regex-matched naming conventions.
Senior Data Engineer at Ontada/McKesson
October 1, 2023 - December 1, 2023
Built, tested, and orchestrated PySpark ETL pipelines using Databricks Workflows. Implemented data cleaning at the start of each pipeline, logic checks after transformation, and QA at the end. Invoked ML libraries within the ETL pipelines to classify the sentiment of medical record attachments. Used GitHub Actions to run unit tests in CI/CD.
Senior Data Engineer at Signify Health
February 1, 2023 - June 1, 2023
Implemented data freshness checks before ETL execution; compared star vs snowflake schema for Redshift; archived older tables with time-based naming; troubleshot Tableau production using Performance Recorder to validate scenarios.
Senior Data Engineer at Zillow Group
December 1, 2022 - June 1, 2023
Maintained production ETL pipelines with Airflow; resolved Workday API data-source configuration issues; troubleshot Tableau production with Performance Recorder to validate scenarios.
Data Engineer at UnitedHealth Group/Optum
March 1, 2020 - May 1, 2022
Prototyped Airflow-based ETL orchestration; built an Avro-encoded Kafka producer/consumer for ML data integration with the field-support system; developed a Python data-cleaning tool to detect and repair data anomalies.
Data Engineer at AT&T/DirecTV
August 1, 2019 - February 1, 2020
Built Cloudera Hadoop clusters with YARN and PySpark on Docker and VMware; evaluated AWS/GCP/Azure vs Snowflake for data warehousing; AWS/Redshift was selected for oscillatory functionality and future opportunities.
Data Engineer at Episource
August 1, 2018 - May 1, 2019
Utilized PySpark RDDs with Lambda-based processing; ran PySpark jobs on AWS EMR with logs stored in S3.
Junior Data Engineer at Hart
December 1, 2015 - April 1, 2017
Automated software testing with a CI/CD pipeline; ported a 12-page SQL Server stored procedure to an Apache stack on AWS, reducing runtime from one week to roughly two hours.

Education

Bachelor of Science, Electrical Engineering (BS EE) at Illinois Institute of Technology
Master of Business Administration (MBA) at National University, Costa Mesa, CA

Qualifications

Udemy - PySpark
Coursera - Machine Learning
Udemy / Coursera - FinOps
HackerRank / Coderbyte - Python, SQL

Industry Experience

Healthcare, Real Estate & Construction, Financial Services