Senior Data Engineer with 5+ years of experience specializing in data engineering, ETL pipelines, and cloud-based solutions. Proficient in Snowflake, Apache Spark, Python, and data modeling techniques.

Deepak Kumar

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate
Intermediate

Language

Bashkir
Intermediate
Javanese
Advanced

Work Experience

Senior Data Engineer at Illumina
January 1, 2025 - November 25, 2025
Migrating from HVR-based replication to Spark-driven ETL pipelines that write Apache Iceberg tables on Amazon S3 with a Polaris catalog, using dbt and Snowflake SQL to run business transformations directly on the Iceberg tables. Achieved a 35% gross cost reduction in onboarding data from SQL Server to Snowflake by optimizing pipeline architecture and storage.
Data Engineer (Senior Consultant) at EY
January 1, 2025 - January 1, 2025
Developed a Python-based DAG orchestration framework that executes complex tasks with runtime DAG restructuring and uses a custom JSON encoder/decoder for efficient state management. Achieved $200k in annual cost savings by improving task execution efficiency. Engineered a centralized financial fact table with SCD Type 2 for NCCI reporting, optimizing data processing and historical tracking. Reduced transaction discrepancies by 95% and improved regulatory reporting efficiency by 30%, minimizing compliance risks. Led the development of REST endpoints (data as a service) using FastAPI, enabling users to query comprehensive claims data spanning DB2 and Snowflake.
Data Engineer at Tata Consultancy Services
July 1, 2022 - July 1, 2022
Created a Python-based, cloud-agnostic ELT framework for Snowflake, integrating data from 12 source systems (S3, SQL Server, Profisee) into modeled Data Vault tables. Reduced data ingestion time by 40% by implementing a connection pool that minimized parallel database connections. Migrated the on-prem Python batch ELT framework to an AWS serverless stack, including AWS Batch, S3, CloudWatch, and Fargate. Cut operational costs by 85% and improved deployment efficiency using AWS CloudFormation templates for IaC. Added checkpoints, restartability, data quality checks, and automated notifications to improve data integrity between the RDV and the staging area.

Education

B.Tech in Computer Science at Institute of Engineering & Management
January 11, 2030 - July 1, 2020

Industry Experience

Software & Internet, Professional Services, Life Sciences, Healthcare, Financial Services