Prabhakar Jagtap

Available to hire

I’m a Senior Data Engineer with 14+ years of experience delivering big data, Hadoop, Spark, and cloud-based data platforms. I design, build, and operate end-to-end data pipelines and analytics solutions across Banking, Insurance, and Life Sciences, translating complex data into reliable, analytics-ready datasets.

I specialize in modern data stacks (Snowflake, Airflow, PySpark, AWS, Fivetran, dbt) and excel at data quality, observability, governance, and scalable cloud-native architectures. I have led legacy-to-cloud migrations, optimized performance and cost, and collaborated closely with BI and analytics teams to enable data-driven decision making.


Experience Level

Expert

Language

English
Fluent

Work Experience

Sr. Data Engineer at I Cube IT Systems Ltd.
April 1, 2023 - Present
Designed, built, and maintained scalable ELT pipelines using Snowflake, Apache Airflow, PySpark, AWS, Fivetran, and dbt, supporting both real-time and batch analytics workloads. Implemented data quality, observability, and reliability frameworks, proactively monitoring pipeline health, data freshness, schema changes, and anomalies. Automated data ingestion and integration pipelines using Fivetran, reducing manual intervention and accelerating data availability for analytics and reporting. Developed and optimized dbt transformation models, enabling analytics-ready datasets, improved data lineage, and standardized business logic across teams. Optimized Snowflake data warehouse performance and cost efficiency through partitioning strategies, clustering, query tuning, and workload isolation. Built and maintained PySpark-based distributed processing pipelines to handle large-scale datasets efficiently within cloud environments. Collaborated with BI, analytics, and architecture teams to enable data-driven decision making.
Data Engineer at TCS, CIBC, Toronto
July 1, 2019 - March 1, 2023
Designed and optimized end-to-end ETL pipelines using Azure Databricks, Apache Spark, and Delta Lake; loaded data into Azure Data Lake Gen2; implemented Snowpipe for automated data ingestion and utilized Snowflake Clone and Time Travel features. Built and supported large-scale ETL pipelines in Azure Data Factory for reliable data movement. Configured Linked Services, Datasets, and Integration Runtimes to connect multiple data sources and load data into data stores. Implemented security and governance through Databricks Unity Catalog and integrated Azure Key Vault for secrets management. Collaborated with DevOps teams to implement automated CI/CD and test-driven deployment pipelines using Azure DevOps. Partnered with business stakeholders to translate mapping documents into accurate source-to-target transformations.
Data Engineer at TCS, RBC, Canada
June 1, 2016 - June 1, 2019
Extensively used Spark datasets, Scala, Shell scripting, and Hive for data processing and analytics. Designed an optimal storage architecture balancing storage cost and computing performance. Developed a strong understanding of insurance claims, administration systems, and IFRS 17 requirements, and implemented them within the data lake. Utilized PySpark, Anaconda Python, and Pandas for large-scale data transformation in the Hortonworks Data Lake. Designed and implemented efficient database solutions using Azure Blob Storage for secure data storage and retrieval. Prepared comprehensive documentation and analytical reports for stakeholders. Deployed Azure Data Factory to build and orchestrate data pipelines for loading data into SQL databases.
Big Data Developer at Syntel, India
June 1, 2011 - May 1, 2016
Utilized data stored in the data lake to generate quarterly data extracts for external vendors. Selected appropriate storage formats and developed ad-hoc queries and Hive routines for analytical purposes. Designed and developed new databases and data schemas for high-profile, customer-facing portals with a strong focus on data integrity and query performance using Azure Databricks. Participated in data-masking workshops to identify sensitive data elements and implemented masking logic to ensure data security and compliance. Created Hive tables, developed transformation logic, and loaded processed data into target databases. Implemented secure data movement across UNIX zones to ensure segregation of data between teams. Developed Sqoop scripts to load data from source systems into target databases.

Education

B.E. Electronics at Mumbai University

Qualifications

AZ-900 – Azure Fundamentals
DP-203 – Azure Data Engineer Associate
Agile Scrum Master
1Z0-007 – Introduction to Oracle 9i SQL
Prince2 Project Management

Industry Experience

Financial Services, Life Sciences, Professional Services, Software & Internet, Other