Pinku Pradhan

Available to hire

I am a results-oriented data engineer with over 11 years of experience designing, building, and optimizing enterprise-grade data solutions across cloud and big data environments. I specialize in Azure Data Factory, Azure Databricks, Synapse Analytics, and Spark, with hands-on experience in Python, SQL, and large-scale ETL frameworks, delivering scalable architectures and performance optimization.

I have led cross-functional teams, managed end-to-end data pipelines, and partnered with stakeholders in banking, finance, and enterprise domains to enable data-driven decision making. I am experienced in migrating legacy systems, implementing data governance, and collaborating with data scientists on generative AI use cases, including Retrieval-Augmented Generation (RAG) chat solutions.

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate

Language

English
Fluent

Work Experience

Lead Data Engineer at Infosys Ltd.
February 1, 2022 - October 31, 2025
Led the implementation of scalable data pipelines using Azure Data Factory, Azure Databricks, and Synapse Analytics, integrating structured and unstructured data from diverse sources. Improved data processing efficiency by 30% by integrating Oracle GoldenGate (OGG) 23ai and orchestrating ingestion workflows via Databricks, optimizing both real-time and batch processing. Collaborated with the data science team on a Generative AI chat use case, contributing to data pipeline design and integration for scalable, context-aware AI responses. Led performance tuning initiatives that delivered a 3x increase in query execution speed, enhancing overall system responsiveness and throughput. Directed and mentored a cross-functional team of data engineers and developers, ensuring alignment with Agile methodologies, coding standards, and delivery timelines. Partnered with business stakeholders to translate complex requirements into technical specifications, driving data-driven decision-making. Established and enforced data governance, security, and compliance best practices, and executed a strategic storage optimization initiative that reduced data storage costs by 30%.
Senior Big Data Engineer at Infosys Ltd., Bhubaneswar
February 1, 2022 - February 1, 2022
Developed ETL pipelines using Oracle Data Integrator (ODI 12c) to integrate data from heterogeneous sources into a centralized Oracle Exadata data warehouse. Supported and optimized data ingestion from diverse sources, leveraging Oracle GoldenGate (OGG), Apache Kafka, UC4, and Apache NiFi, into an HDFS-based data lake. Engineered a scalable data movement framework from raw to foundation layers in HDFS, using ODI for ETL orchestration and Hive for distributed querying. Implemented a hybrid data architecture combining an HDFS-based data lake for advanced analytics with Oracle Exadata for enterprise reporting, supporting both data scientists and business users. Enhanced batch processing performance by 25% through Spark-ODI integration, significantly reducing latency and improving throughput. Led code reviews, enforced coding standards, and ensured high-quality deliverables. Optimized SQL queries, ODI mappings, and procedures to improve system efficiency and maintainability. Achieved a 30% reduction in storage costs on Oracle Exadata by applying compression and partitioning. Led compute optimization for ODI batch jobs, reducing execution time by 40%.
ETL Developer at Tata Consultancy Services
August 1, 2017 - August 1, 2017
Migrated complex legacy SAS workflows into high-performance Ab Initio graphs, improving code maintainability and processing efficiency for a major banking client. Designed and executed comprehensive unit and system test cases, incorporating client-driven enhancements and ensuring defect-free delivery aligned with business requirements. Managed code migration across SIT and UAT environments, including deployment via Autosys scheduler, and resolved production issues within defined SLA timelines, ensuring uninterrupted data flow. Delivered ad hoc data solutions for business stakeholders, conducted offshore deliverable reviews, and ensured quality assurance and timely deployment of Ab Initio artifacts across environments.

Education

Bachelor of Technology in Computer Science & Engineering at College of Engineering and Technology, Bhubaneswar, India
January 1, 2010 - January 1, 2014
Diploma in Computer Science & Engineering at UCP Engineering School, Berhampur, India
January 1, 2009 - January 1, 2011

Qualifications

Google Cloud Certified Generative AI Leader
Microsoft Certified: Azure Data Fundamentals

Industry Experience

Financial Services, Professional Services, Software & Internet