Prabhakar Jagtap

Available to hire

I’m a Senior Data Engineer with 14+ years of experience delivering big data, Hadoop, Spark, and cloud-based data platforms. I design, build, and operate end-to-end data pipelines and analytics solutions across Banking, Insurance, and Life Sciences, translating complex data into reliable, analytics-ready datasets.

I specialize in modern data stacks (Snowflake, Airflow, PySpark, AWS, Fivetran, dbt) and excel at data quality, observability, governance, and scalable cloud-native architectures. I have led legacy-to-cloud migrations, optimized performance and cost, and collaborated closely with BI and analytics teams to enable data-driven decision making.


Experience Level

Expert

Language

English
Fluent

Work Experience

Sr. Data Engineer at I Cube IT Systems Ltd.
April 1, 2023 - Present
Designed, built, and maintained scalable ELT pipelines using Snowflake, Apache Airflow, PySpark, AWS, Fivetran, and dbt, supporting both real-time and batch analytics workloads. Implemented data quality, observability, and reliability frameworks, proactively monitoring pipeline health, data freshness, schema changes, and anomalies. Automated data ingestion and integration pipelines using Fivetran, reducing manual intervention and accelerating data availability for analytics and reporting. Developed and optimized dbt transformation models, enabling analytics-ready datasets, improved data lineage, and standardized business logic across teams. Optimized Snowflake data warehouse performance and cost efficiency through partitioning strategies, clustering, query tuning, and workload isolation. Built and maintained PySpark-based distributed processing pipelines to handle large-scale datasets efficiently within cloud environments. Collaborated with BI, analytics, and architecture teams to enable data-driven decision making.
Data Engineer at TCS, CIBC, Toronto
July 1, 2019 - March 1, 2023
Designed and optimized end-to-end ETL pipelines using Azure Databricks, Apache Spark, and Delta Lake; loaded data into Azure Data Lake Gen2; implemented Snowpipe for automated data ingestion and utilized Snowflake Clone and Time Travel features. Built and supported large-scale ETL pipelines in Azure Data Factory for reliable data movement. Configured Linked Services, Datasets, and Integration Runtimes to connect multiple data sources and load data into data stores. Implemented security and governance through Databricks Unity Catalog and integrated Azure Key Vault for secrets management. Collaborated with DevOps teams to implement automated CI/CD and test-driven deployment pipelines using Azure DevOps. Partnered with business stakeholders to translate mapping documents into accurate source-to-target transformations.
Data Engineer at TCS, RBC, Canada
June 1, 2016 - June 1, 2019
Extensively used Spark datasets, Scala, Shell scripting, and Hive for data processing and analytics. Designed an optimal storage architecture balancing storage cost and computing performance. Developed a strong understanding of insurance claims, administration systems, and IFRS 17 requirements, and implemented them within the data lake. Utilized PySpark, Anaconda Python, and Pandas for large-scale data transformation in the Hortonworks Data Lake. Designed and implemented efficient database solutions using Azure Blob Storage for secure data storage and retrieval. Prepared comprehensive documentation and analytical reports for stakeholders. Deployed Azure Data Factory to build and orchestrate data pipelines for loading data into SQL databases.
Big Data Developer at Syntel, India
June 1, 2011 - May 1, 2016
Utilized data stored in the data lake to generate quarterly data extracts for external vendors. Selected appropriate storage formats and developed ad-hoc queries and Hive routines for analytical purposes. Designed and developed new databases and data schemas for high-profile, customer-facing portals with a strong focus on data integrity and query performance using Azure Databricks. Participated in data-masking workshops to identify sensitive data elements and implemented masking logic to ensure data security and compliance. Created Hive tables, developed transformation logic, and loaded processed data into target databases. Implemented secure data movement across UNIX zones to ensure segregation of data between teams. Developed Sqoop scripts to load data from source systems into target databases.

Education

B.E. Electronics at Mumbai University

Qualifications

AZ-900 – Azure Fundamentals
DP-203 – Azure Data Engineer Associate
Agile Scrum Master
1Z0-007 – Introduction to Oracle 9i SQL
Prince2 Project Management

Industry Experience

Financial Services, Life Sciences, Professional Services, Software & Internet, Other