Available to hire
I am a data engineer with 4+ years of experience designing, building, and optimizing scalable data pipelines with a strong focus on reliability, performance, and maintainability across diverse industries.
I am proficient in data ingestion, ETL/ELT workflows, and architecting resilient data solutions using frameworks like the Medallion Architecture. I am skilled in Databricks, PySpark, Delta Lake, and building Spark-based pipelines on AWS EMR with Airflow. I also have experience with CI/CD automation, OLTP/OLAP data modeling, and Power BI dashboards, and I enjoy contributing to AI-driven initiatives.
Skills
Work Experience
Data Engineer at Cloud Enterprise Business Solutions (CEBS)
January 1, 2025 - January 1, 2025
Migrated MATAS (D365 F&O) to Synapse Link in 2 months. Built scalable ADF pipelines ingesting ~1 TB/day into ADLS Gen2, reducing latency by 40%, and optimized existing ADF pipelines, cutting runtime by 73% (from 30 minutes to 8 minutes). Built CI/CD in Azure DevOps to deploy 50+ ADF pipelines. Developed and maintained a Medallion architecture with optimized PySpark in Databricks, leveraging Auto Loader, Unity Catalog, and Delta Live Tables for real-time and batch processing, and orchestrated 100+ workflows using Databricks Asset Bundles deployed via CI/CD.
Big Data Engineer at NOWASY S LTD
December 1, 2023 - December 1, 2023
Developed 15 PySpark ingestion pipelines on AWS EMR, ingesting 10–15 TB daily into an S3 data lake. Orchestrated workflows via Airflow. Developed and deployed a PySpark-based data quality (DQ) framework, reducing data quality errors by 99%. Optimized Spark code and AWS EMR configurations, improving performance and resource utilization by 50%. Implemented an automated backfill mechanism for batch pipelines to reduce data loss.
Big Data Engineer at The Entertainer, Lahore
February 1, 2023 - February 1, 2023
Optimized Azure Synapse DWH to support cross-functional teams, reducing ad-hoc query time by 5% and dashboard reporting latency by 20%. Defined a standardized data modeling approach (Kimball) for the DWH; the approach now serves as a blueprint for 10+ data engineers. Built and monitored 50+ ELT pipelines in Azure Data Factory, ingesting data into fact and dimension tables with watermarking for incremental loads. Developed PySpark pipelines on Databricks to process and transform 50M+ web/app logs per day, loading them into the DWH to improve recommendation system accuracy by 15%.
Data Engineer at Afiniti
May 1, 2022 - May 1, 2022
Designed and implemented data pipelines for port, vehicle, and broadband data serving both the US and UK markets, processing 10M+ records daily. Engineered and optimized processes using multiprocessing and multithreading, improving performance by 30% on 100K+ tasks. Built a web scraping engine to gather data, reducing data acquisition time by 40%.
Education
Master of Science in Big Data and Data Science at Northumbria University, London
January 1, 2025 - January 1, 2026
Bachelor of Science in Computer Science at COMSATS University Islamabad, Pakistan
January 1, 2017 - January 1, 2021
Qualifications
Data Engineering Nanodegree
January 11, 2030 - November 7, 2025
Microsoft Azure Databricks for Data Engineering
January 11, 2030 - November 7, 2025
Introduction to Big Data with Spark and Hadoop (Coursera | IBM)
January 11, 2030 - November 7, 2025
ETL and Data Pipelines with Shell, Airflow and Kafka (Coursera | IBM)
January 11, 2030 - November 7, 2025
Introduction to Bash Shell Scripting
January 11, 2030 - November 7, 2025
Apache Spark Essential Training: Big Data Engineering (LinkedIn)
January 11, 2030 - November 7, 2025
Advanced Python (LinkedIn)
January 11, 2030 - November 7, 2025
Advanced SQL for Query Tuning and Performance Optimization (LinkedIn)
January 11, 2030 - November 7, 2025
Industry Experience
Software & Internet, Professional Services, Media & Entertainment