Preethi Reddy Thumma

Available to hire

I’m a detail-oriented data engineer with 5+ years of experience designing, building, and optimizing data pipelines and ETL workflows across financial and healthcare environments. I specialize in Python, PySpark, and AWS to deliver scalable data solutions.

I thrive in collaborative teams, embrace data validation and security, and continuously look for ways to reduce latency and cost. I’ve worked with Truist Bank, UnitedHealth Group, and HDFC Bank to improve reliability and governance.

Experience Level

Expert

Work Experience

Data Engineer at Truist Bank
August 1, 2023 - Present
Designed and maintained end-to-end ETL pipelines using AWS Glue, Airflow, and Python, improving data availability by 30%. Implemented real-time streaming with Kafka and AWS Kinesis, reducing reporting delays by 40%. Migrated a SQL-based data warehouse to Snowflake and AWS Redshift, cutting infrastructure costs by 20%. Enhanced validation frameworks with dbt test, Great Expectations, and pytest. Automated deployments using Terraform, AWS CDK, and CloudFormation. Managed IAM roles and KMS encryption policies. Set up cross-cloud data replication with Azure Data Factory. Developed KPI dashboards in Tableau.
Data Engineer at UnitedHealth Group (UHG)
April 1, 2020 - July 31, 2022
Built distributed PySpark pipelines on AWS EMR and Glue to transform large healthcare datasets. Migrated legacy batch workflows to real-time ingestion using Kafka and Kinesis, reducing latency by 40%. Deployed dbt models and data marts in Snowflake for claims and provider data analysis. Automated pipeline validation using pytest and dbt test, ensuring consistent schema compliance. Provisioned infrastructure with Terraform and CloudFormation, cutting setup time by 50%. Monitored ETL performance using Datadog and CloudWatch, cutting data failure rates by 25%. Configured API integrations with AWS API Gateway and boto3 for dynamic data exchange.
Data Engineer at HDFC Bank
June 1, 2019 - March 1, 2020
Developed SQL- and Python-based ETL pipelines for regulatory and financial reporting. Optimized data loading processes with AWS Glue jobs, improving performance by 30%. Migrated datasets from Oracle to AWS RDS, maintaining referential integrity and schema alignment. Built automated pytest-based validation scripts to ensure completeness and accuracy of migrated data. Designed Tableau dashboards for regional performance reporting and executive analytics.

Education

Master of Science at Saint Louis University

Qualifications

AWS Certified Data Analytics – Specialty – In Progress
Microsoft Azure Fundamentals (AZ-900)
Oracle Cloud Infrastructure

Industry Experience

Software & Internet, Healthcare, Financial Services