Hello, I’m Saqeb Farooq—a Principal Data Engineer and Solutions Architect with 8+ years of experience designing, building, and optimizing enterprise-scale data ecosystems across Azure, AWS, and GCP. I leverage modern frameworks like Databricks, Snowflake, Synapse, and Kafka to drive analytical insights and operational efficiency. I’m a results-oriented partner for DataOps, governance, and cost optimization, with a track record of implementing GDPR/HIPAA-compliant data programs and leading Spark performance tuning to reduce cloud spend by up to 50%. I’m passionate about mentoring teams, shaping data strategies, and delivering scalable, reliable data platforms that empower business decisions.

Saqeb Farooq

Available to hire

Experience Level

Expert

Language

English
Fluent

Work Experience

Senior Data Engineer at Confiz
August 1, 2024 - November 21, 2025
Architected and implemented high-volume enterprise ELT pipelines using Azure Synapse, Databricks (Unity Catalog), and Azure Data Factory, achieving a 40% reduction in data latency for critical reporting. Spearheaded real-time streaming data ingestion with Kafka and Azure Stream Analytics for mission-critical financial and healthcare workloads. Championed data governance and observability by integrating Microsoft Purview, enabling complete data lineage and GDPR/HIPAA compliance. Designed and established robust Data Lakehouse architectures on Delta Lake, unifying batch and streaming processing and reducing data duplication. Mentored junior engineers and authored advanced Azure Data Engineering learning materials. Implemented Spark performance tuning and cluster optimization, delivering a 50% cost reduction on Azure compute resources and improved job reliability.
Data Solutions Architect at Capgemini
July 1, 2024 - July 1, 2024
Led architectural design and implementation of scalable Data Lake solutions and ETL pipelines using Hadoop, Hive, and Apache Spark for central processing of multi-structured datasets over 50TB. Automated ingestion and transformation workflows with Apache NiFi and Airflow, embracing DataOps best practices to reduce manual intervention and errors by 30%. Conceptualized hybrid cloud data warehousing combining AWS Redshift and Azure Synapse Analytics to support integrated, cross-platform analytics. Established a comprehensive Data Quality-as-Code framework using Python and Great Expectations, ensuring data integrity and validation across ingestion and staging layers. Collaborated with BI/Analytics teams to optimize data consumption, enabling executive dashboards in Power BI and Tableau.
Data Analytics Specialist at Accenture
May 1, 2022 - May 1, 2022
Engineered sophisticated dimensional data models and ingestion pipelines specifically supporting investment risk and portfolio management analytical platforms. Automated data acquisition from heterogeneous financial systems using Python, REST APIs, and SQL, ensuring faster, reliable data availability for daily market analysis. Designed and implemented governance and metadata management frameworks to ensure full audit readiness and end-to-end data traceability. Conducted deep-dive performance tuning on ETL processes and reporting database layers, leading to a 35% improvement in complex query response times. Delivered executive-level data visualization solutions in Tableau, translating complex financial datasets into actionable intelligence for key investment decision-makers.
Associate Data Engineer at Tkxel
July 1, 2019 - July 1, 2019
Supported the end-to-end development of ETL workflows using Talend and Python for processing both structured relational data and semi-structured log files. Assisted senior architects in migrating legacy on-premise SQL Server database systems to a cloud-native AWS Redshift environment, significantly improving performance and scalability. Executed extensive data validation, profiling, and cleansing operations to maintain high dataset reliability and quality metrics. Collaborated with development teams to automate ingestion and transformation tasks using custom Shell and Python scripting, establishing early CI/CD practices. Authored detailed technical documentation, including data dictionaries and best-practice guides, to standardize data engineering workflows and facilitate team onboarding.

Qualifications

Bachelor's in Computer Science
January 11, 2030 - November 21, 2025

Industry Experience

Financial Services, Healthcare, Professional Services