I am a data engineer with 5+ years of hands-on experience building scalable data pipelines using Spark, Hadoop, and cloud platforms. I design and implement end-to-end data architectures, from data ingestion to analytics-ready datasets, including lakehouses and streaming solutions for real-time insights. I collaborate with product and analytics teams to optimize performance, ensure data governance and security, reduce costs, and empower self-service analytics through governed datasets and dashboards.

Irshad Ahmed

Available to hire

Language

English
Fluent

Work Experience

Data Engineer at Citi Bank
May 1, 2022 - November 4, 2025
Designed and maintained robust data pipelines using Azure Data Factory, Databricks, and PySpark for healthcare and financial data; led monolith-to-microservices database transformation; implemented data governance, observability, and data anonymization; built CI/CD pipelines and containerized workflows; established data models and dashboards for stakeholders.
Data Engineer at HSBC
May 1, 2022 - May 1, 2022
Built scalable ETL pipelines using AWS Glue, Redshift, and Spark; modernized the Teradata data warehouse by migrating it to Amazon Redshift; implemented data masking and anonymization; developed modular data marts with dbt and Redshift; integrated Kafka with Spark Streaming for real-time ingestion; migrated workloads to Snowflake; established CI/CD integration and governance.
Data Engineer at Citi Bank
May 1, 2022 - November 21, 2025
Designed and maintained robust data pipelines using Azure Data Factory, Databricks, and PySpark to ingest, transform, and orchestrate healthcare and financial data. Engineered synthetic data generation and data anonymization to enable analytics on sensitive datasets in compliance with enterprise security standards. Collaborated with data scientists, analysts, and architects to build data products supporting predictive models and operational reporting. Led monolith-to-microservices database transformations, decomposing centralized data stores into domain-aligned schemas to enable independent service ownership. Developed CI/CD pipelines (Azure DevOps, Jenkins) and containerized workflows with Docker and Kubernetes. Implemented automated alerting and retry logic in Airflow for production reliability. Created domain-driven data models (star/snowflake) and modular marts using dbt and Azure Synapse. Established data governance for cataloging, lineage, retention, and access control to comply
Sr. Data Engineer at Citi Bank
May 1, 2022 - Present
Migrated data from SQL Server to Amazon Redshift with Spark transformations; built end-to-end data pipelines using Hadoop and Apache Spark for large-scale processing; developed serverless ETL with AWS Lambda; designed Lakehouse architecture using Delta Lake on Databricks with ACID transactions and schema evolution; created Hive UDFs; managed Hadoop components; built Airflow DAGs for orchestrating ETL; implemented streaming with Spark Streaming using AWS Kinesis; integrated S3, Delta Lake, and Redshift Spectrum for hybrid queries; implemented data quality checks and data observability dashboards; led data engineering initiatives and mentored junior engineers.
Data Engineer at HSBC
November 1, 2019 - May 1, 2022
Designed and delivered Hadoop-based big data analytics solutions; built and maintained enterprise data solutions using Azure Data Factory, ADLS Gen2, Azure SQL DB, and Azure Synapse Analytics; developed PySpark/Spark SQL-based ETL pipelines; implemented CI/CD for ADF and Databricks; built streaming pipelines with Kafka/Event Hubs; implemented near-real-time ingestion; performed data mining and visualization; designed data lake architectures and NoSQL HBase tables; configured Databricks clusters and Oozie workflows; built BI solutions with Power BI and Analysis Services.

Qualifications

Microsoft Certified: Azure Developer Associate (AZ-204)
January 11, 2030 - November 4, 2025

Industry Experience

Financial Services, Healthcare, Professional Services, Computers & Electronics, Software & Internet