I am a data engineer with 3 years of experience in designing and implementing scalable data solutions. I specialize in batch and real-time data pipelines using Spark, Hadoop, and Kafka, with cloud implementation skills on AWS and Azure. I optimize ETL processes, migrate legacy systems to modern data platforms, and enable data-driven decision-making through robust data warehousing and visualization solutions. I enjoy collaborating with cross-functional teams to deliver high-performance data infrastructure that meets business objectives. I am passionate about building maintainable data architectures and driving measurable improvements in efficiency and insight.

Mohammed Arbaaz Khan

Available to hire

Experience Level

Expert

Language

English
Fluent

Work Experience

Data Engineer at Avita Solutions
August 1, 2024 - Present
Designed and optimized AWS Redshift data warehouses using star schema and snowflake modeling, improving query performance for financial reporting by 35% and reducing costs through WLM queue and sort key optimizations. Led the end-to-end migration of 15+ TB of financial data from Oracle/PostgreSQL to Azure Data Lake using PySpark and Delta Lake, implementing SCD Type 2 for historical tracking and reducing storage costs by 40% with ZSTD compression. Developed batch processing pipelines on Hadoop (HDFS, Hive) to transform and analyze daily transactions, enabling 15% faster monthly closings through automated reconciliation reports. Built real-time data pipelines using Kafka and Spark Streaming to process 2M+ transactions per day for fraud detection, reducing response time to under 5 minutes. Automated legacy SSIS workflows by converting them to PySpark jobs with Airflow orchestration, reducing manual errors by 60% and achieving 99.9% pipeline uptime. Created Tableau dashboards querying on
Data Engineer at CMC LTD, India
August 1, 2021 - April 1, 2023
Designed and implemented a high-performance AWS Redshift data warehouse with optimized Star Schema and Data Vault architectures for financial data analysis, improving reporting speed by 35%. Led the migration of legacy Oracle and PostgreSQL databases to Azure Data Lake using IBM DataStage, ensuring compliance with SOX and other financial data standards. Processed and analyzed batch financial data using Hadoop (HDFS, Hive) to create accurate financial forecasts, contributing to a 15% increase in decision-making efficiency for senior leadership. Automated SSIS workflows to consolidate and transform financial data from multiple legacy systems, reducing manual intervention by 50% and improving data pipeline reliability. Optimized PySpark jobs for data transformation in financial modeling, accelerating ETL processing times by 40% and enabling real-time reporting for critical financial insights. Developed Tableau dashboards to visualize customer sentiment trends, enabling stakeholders to identify actionable insights and driving a 15%

Education

Postgraduate Diploma in Business Analyst at Conestoga College, Kitchener
May 1, 2024 - December 1, 2024
Postgraduate Diploma in Big Data at Conestoga College, Kitchener
May 1, 2023 - December 1, 2023
Bachelor of Technology at Lords Institute of Engineering & Technology, Hyderabad, India
July 1, 2018 - August 1, 2022

Qualifications

Big Data, Power BI Essential Training - LinkedIn
January 11, 2030 - January 12, 2026
DevOps Professional Certificate - PagerDuty and LinkedIn
January 11, 2030 - January 12, 2026

Industry Experience

Software & Internet, Financial Services, Professional Services, Media & Entertainment, Education