Available to hire
Hi, I’m Avinash Jakka, a results-driven Data Engineer with 11+ years of experience designing, developing, and deploying scalable data solutions across AWS, Azure, and GCP. I specialize in building robust ELT/ETL pipelines, modern data lake and warehousing architectures, and data-driven ML workflows in agile environments.
I collaborate with product owners and analysts to translate requirements into scalable data platforms, optimize performance and cost, and mentor junior engineers to raise engineering standards. When I’m not building data pipelines, I enjoy exploring new cloud-native patterns and contributing to open source projects.
Language
English
Fluent
Work Experience
Azure Data Engineer at Bank of Montreal
August 1, 2025 - October 31, 2025
Designed and implemented ETL/ELT pipelines for real-time and batch data integration using Azure Data Factory, Azure Functions, Kubernetes, and Docker. Led the design of enterprise-scale data ecosystems across Snowflake, Hadoop, Oracle, and PostgreSQL, ensuring governance and architecture alignment. Performed data ingestion, transformation, and analysis with Azure Data Lake, Azure SQL DB, and Synapse Analytics. Built real-time streaming solutions using Kafka, Azure Event Hubs, and PySpark Structured Streaming. Automated data extraction from HBase/Hadoop sources and optimized pipelines with Databricks. Implemented data governance, lineage, and quality frameworks, and established CI/CD with Azure DevOps, Jenkins, Docker, and Terraform. Led a cloud migration initiative moving 20TB+ data with minimal downtime.
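The windowed aggregations that a Kafka/Event Hubs + PySpark Structured Streaming job computes can be illustrated without a Spark cluster; the sketch below shows tumbling-window counting in plain Python, with made-up event names and an illustrative 60-second window:

```python
from collections import defaultdict
from datetime import datetime, timedelta

def tumbling_window_counts(events, window_seconds=60):
    """Group (timestamp, key) events into fixed, non-overlapping
    windows and count occurrences per key -- the same shape of
    result a streaming engine produces for a windowed aggregate."""
    counts = defaultdict(int)
    for ts, key in events:
        # Floor the event time to the start of its window.
        window_start = ts - timedelta(seconds=ts.timestamp() % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)

# Illustrative events: two logins in the 12:00 window, one in 12:01.
events = [
    (datetime(2025, 1, 1, 12, 0, 10), "login"),
    (datetime(2025, 1, 1, 12, 0, 50), "login"),
    (datetime(2025, 1, 1, 12, 1, 5), "login"),
]
result = tumbling_window_counts(events, window_seconds=60)
```

In a real Structured Streaming job the equivalent is a `groupBy(window(...), col(...)).count()` over an unbounded stream; the snippet only mirrors the windowing arithmetic.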
Azure Data Engineer at HSBC (via Wipro)
June 1, 2025 - June 1, 2025
Architected modern data ingestion frameworks using Azure Data Factory, Databricks, and Event Hubs; migrated from legacy ETL systems to real-time analytics. Implemented Delta Lake-based data lake for ACID transactions and incremental updates. Automated migrations with ADF triggers and Python-based orchestration. Built real-time data pipelines connecting Databricks to Synapse for high-throughput reporting; deployed ML workflows in Databricks with MLflow. Standardized CI/CD and IaC with Azure DevOps and Terraform; implemented data lineage and governance with Purview.
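The incremental updates mentioned above rest on Delta Lake's MERGE (upsert) semantics: matched rows are updated, unmatched rows inserted. A minimal pure-Python sketch of that idea, with illustrative record shapes:

```python
def upsert(target, updates, key="id"):
    """Merge `updates` into `target` on `key`: matching rows are
    overwritten, new rows appended -- the core semantics of a
    Delta Lake MERGE INTO statement."""
    by_key = {row[key]: row for row in target}
    for row in updates:
        by_key[row[key]] = row  # update if present, insert if not
    return list(by_key.values())

# Illustrative data: id 2 is updated, id 3 is newly inserted.
target = [{"id": 1, "balance": 100}, {"id": 2, "balance": 250}]
updates = [{"id": 2, "balance": 300}, {"id": 3, "balance": 50}]
merged = upsert(target, updates)
```

In Delta Lake proper this is `MERGE INTO target USING updates ON ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT ...`, executed transactionally; the sketch only mirrors the row-level outcome.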
GCP Data Engineer at Nectar Lifesciences Limited
August 1, 2020 - August 1, 2020
Designed and deployed scalable ETL/ELT pipelines on Google Cloud Platform (BigQuery, Cloud Storage, Dataflow, Dataproc). Processed 5TB+ daily, reducing transformation time by 40%. Built ingestion flows from Pub/Sub, APIs, Oracle, and Teradata; orchestrated with Airflow/Cloud Composer. Delivered analytics-ready datasets and integrated with Data Studio/Tableau/Looker for reporting. Enabled MLOps workflows with Vertex AI and Dataproc ML for model deployment and feature engineering.
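At its core, orchestrating such a pipeline with Airflow/Cloud Composer means running tasks in dependency order. A minimal stdlib sketch of that ordering, with hypothetical task names loosely matching the sources above:

```python
from graphlib import TopologicalSorter

# Illustrative DAG: ingest from Pub/Sub and Oracle, transform in
# Dataflow, load to BigQuery, then refresh Looker reporting.
deps = {
    "transform": {"ingest_pubsub", "ingest_oracle"},
    "load_bigquery": {"transform"},
    "publish_looker": {"load_bigquery"},
}
# static_order() yields tasks so every dependency runs first.
order = list(TopologicalSorter(deps).static_order())
```

An actual Airflow DAG expresses the same graph with operators and `>>` dependencies, and the scheduler handles retries and parallelism on top of this ordering.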
Senior Data Engineer at Global Logic Technologies / Patanjali Ayurved
April 1, 2014 - August 28, 2017
Designed batch and near real-time data pipelines using Hadoop, Spark, Hive, and Kafka; ingested data from on-prem sources into Redshift/BigQuery and cloud-native ecosystems. Coordinated with Oozie, Airflow, and Cron-based scheduling; delivered analytics-ready datasets for BI platforms. Collaborated with data scientists on preprocessing, feature engineering, and model integration; performed cluster tuning and optimization for reliability and performance during hybrid cloud migrations.
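Near real-time ingestion from on-prem sources into Redshift/BigQuery typically uses watermark-based incremental extraction: pull only rows changed since the last run, then advance the watermark. A hedged sketch with made-up rows and integer timestamps:

```python
def incremental_extract(rows, last_watermark):
    """Return rows newer than the stored watermark plus the new
    high-water mark -- the usual pattern for incremental loads
    from an on-prem source into a cloud warehouse."""
    new_rows = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max(
        (r["updated_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark

# Illustrative source table snapshot; watermark 200 was saved
# after the previous run, so only ids 2 and 3 are extracted.
source = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 205},
    {"id": 3, "updated_at": 310},
]
rows, wm = incremental_extract(source, last_watermark=200)
```

In production the watermark is persisted (e.g., in a metadata table) and the query pushes the `updated_at > watermark` filter down to the source database rather than scanning in memory.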
Azure Data Engineer at Bank of Montreal
August 1, 2025 - November 7, 2025
Designed and developed real-time and batch data integration pipelines using Azure Data Factory, Azure Functions, Kubernetes, and Docker to ensure scalable processing and reliability. Led architecture discussions for enterprise data ecosystems across Snowflake, Hadoop, Oracle, and PostgreSQL. Implemented ingestion/transformation of large datasets via Azure Data Lake, Azure SQL DB, Synapse Analytics, Hive, and Pig. Built streaming solutions using Kafka, Azure Event Hubs, and Apache Spark (PySpark/Structured Streaming) for low-latency analytics. Automated data extraction from HBase and Hadoop with Python/PySpark and Azure Databricks. Implemented Redis caching for high-speed analytics. Utilized Informatica for data quality and orchestrated CI/CD with Azure DevOps, Jenkins, Docker, and Terraform. Drove data migration from on-prem Hadoop to cloud (Dataproc/BigQuery) with 20TB transfer, achieving zero data loss. Optimized Snowflake and Synapse performance and cost. Collaborated with stakeholders.
Azure Data Engineer at HSBC
June 1, 2025 - June 1, 2025
Designed and developed ETL pipelines with AWS Glue, Lambda, S3, Sqoop, and Flume, integrating data into HDFS and AWS S3 with real-time validation and transformation. Built/optimized Hive external tables and Redshift data warehouses; achieved ~30% reduction in processing time through schema optimization. Developed Spark (Scala/PySpark) pipelines for large-scale transformations and analytics. Automated data migration/orchestration using Airflow, Python, and cloud services across AWS/GCP, ensuring data consistency, reduced downtime, and seamless cutover. Implemented data quality frameworks with Informatica and Python-based validation. Established CI/CD pipelines with AWS CodePipeline, Terraform, and Docker. Tuned Elasticsearch clusters for log analytics. Integrated ML workflows with AWS (SageMaker) for feature engineering and model deployment. Advocated Agile/DevOps practices and data governance across teams; improved data lineage, metadata, and quality monitoring. Optimized SQL performance.
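Python-based validation of the kind mentioned above can be as simple as declarative per-field rules applied before loading to the warehouse. A minimal sketch, with hypothetical field names and rules:

```python
def validate(records, rules):
    """Apply per-field predicate rules to each record and collect
    failures as (row_index, field, message) -- the shape of a
    lightweight pre-load data-quality check."""
    failures = []
    for i, rec in enumerate(records):
        for field, (predicate, message) in rules.items():
            if not predicate(rec.get(field)):
                failures.append((i, field, message))
    return failures

# Illustrative rules: account_id must be a non-empty string,
# amount must be a non-negative number.
rules = {
    "account_id": (lambda v: isinstance(v, str) and v != "",
                   "missing account_id"),
    "amount": (lambda v: isinstance(v, (int, float)) and v >= 0,
               "negative or non-numeric amount"),
}
records = [
    {"account_id": "A1", "amount": 10.0},
    {"account_id": "", "amount": -5},
]
bad = validate(records, rules)
```

A production framework would add severity levels, quarantine of failing rows, and metrics emitted to the monitoring stack, but the rule-driven core stays the same.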
Senior Data Engineer at Global Logic Technologies / Patanjali Ayurved
August 1, 2017 - August 1, 2017
Designed batch and near real-time data pipelines using Spark, Hadoop (HDFS/MapReduce), Hive, and Kafka to process high-volume data. Built ETL pipelines from on-prem SQL Server/Oracle into Redshift, BigQuery, and HDFS. Orchestrated Oozie, Airflow, and Cron for reliability and scheduling. Developed analytics-ready datasets for warehouses/BI tools (BigQuery, Tableau, Power BI). Collaborated with data scientists on ML preprocessing, feature engineering, and model integration. Implemented cluster monitoring/performance tuning; supported early cloud migration initiatives blending on-premise Hadoop with cloud services.
Education
Bachelor of Engineering in Electronics and Communication Engineering at JNTU Hyderabad, India
July 11, 2008 - July 15, 2011
Qualifications
Industry Experience
Financial Services, Software & Internet, Professional Services, Other