Hi, I'm Yash Malhotra, a data engineer with 4.5+ years of experience building and supporting scalable data pipelines and data warehouse platforms within modern data mesh environments. I specialize in SQL-driven transformations, dbt-based analytics engineering, and orchestration using Airflow, delivering reliable, analytics-ready datasets under strict SLAs. I have proven experience in data migrations, pipeline optimization, and troubleshooting complex production issues across batch and streaming systems. I enjoy collaborating with product, analytics, and architecture teams in agile, distributed environments to translate requirements into scalable, fault-tolerant data solutions.

Yash Malhotra

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate
Intermediate
Intermediate

Language

English
Fluent

Work Experience

Senior Data Engineer at Accenture Solutions Pvt. Ltd
June 1, 2021 - Present
Owned and redesigned end-to-end ETL pipelines processing 40+ million cross-border transactions daily using Python, Spark, and BigQuery, delivering a 50% annual cost reduction and a 60% faster data turnaround. Led data migration of multiple retail domains into BigQuery, building scalable ELT pipelines with stored-procedure-based transformations, SCD Type 1 & 2, and Airflow orchestration, producing analytics-ready datasets consumed by multiple business teams for reporting and strategic decision-making. Also maintained the database layer for an internal customer query management platform, developing triggers, stored procedures, and optimized views to power dashboards and reduce average SLA resolution time by 30%. Architected and owned a real-time Kafka streaming pipeline on GCP, enabling sub-minute data availability with at-least-once delivery semantics, achieving fault tolerance, high throughput, and 30% cloud cost savings.
UG Project Intern at Central Electricity Authority, GOI
June 1, 2019 - July 1, 2019
Performed analytics on emissions from various power plants across India. Used regression models to predict pollutant emission levels, supporting the Thermal Power Efficiency and Climate Change department in preparing annual reports.
Senior Data Engineer at Accenture Solutions Pvt. Ltd
December 1, 2023 - Present
Owned and supported end-to-end ETL pipelines loading high-volume cross-border payments data into analytics platforms, processing 50M+ transactions per day under strict SLAs. Re-architected legacy DB-centric transformations into PySpark and dbt-based transformation layers on GCP Dataproc and BigQuery, optimizing partitioning, joins, and model materializations to achieve 50% cloud cost reduction and 60% end-to-end latency improvement. Designed and owned Kafka real-time streaming pipelines enabling sub-minute data availability, implementing at-least-once processing, replay safety, idempotent downstream writes, and strict offset management to ensure strong data quality and correctness, while reducing infrastructure costs by 40%. Acted as primary production on-call owner, monitoring data freshness, consumer lag, and pipeline health through observability and alerting, leading incident response, RCA, and troubleshooting for schema evolution, data consistency, and governance issues in regulat
Data Engineer at Accenture Solutions Pvt. Ltd
June 1, 2021 - November 1, 2023
Led a large-scale GCP BigQuery migration, building analytics-ready ELT pipelines using dbt-style reusable transformation models, SCD Type 1/2, partitioned and clustered tables, BigQuery routines, and Airflow/Cloud Composer-orchestrated workflows, significantly improving query performance and data reliability. Owned and optimized the database layer for an internal customer query management and reporting platform, developing PL/SQL procedures, triggers, and views, reducing SLA resolution time by 30% and improving operational efficiency. Implemented data governance and quality controls by driving RCA for upstream schema changes, data skew, and late-arriving data, introducing schema versioning, validation checks, and resilient reprocessing to ensure long-term data correctness and lineage consistency. Worked closely with product, analytics, and business stakeholders to translate requirements into scalable, fault-tolerant pipelines supporting self-serve analytics and performance dashboards.

Education

B.Tech at Delhi Technological University
January 1, 2017 - January 1, 2021
12th (CBSE) at Bosco Public School
(start date not listed) - January 1, 2017
Bachelor of Technology at Delhi Technological University
January 1, 2017 - January 1, 2021

Qualifications

Google Cloud Certified Professional Data Engineer
(start date not listed) - January 7, 2026
Leading with Analytics - Northwestern University
(start date not listed) - January 7, 2026
Google Cloud Certified Professional Data Engineer - Google
(start date not listed) - January 28, 2026

Industry Experience

Healthcare, Financial Services, Retail, Professional Services, Software & Internet