I have 4 years of hands-on experience working with data warehouses and data lakes. I've worked extensively with Azure Data Lake, Azure Synapse Analytics, AWS Redshift, and Databricks. My experience includes building ETL pipelines using Apache Airflow, Python, and PySpark to process 50TB+ datasets, implementing data governance with Collibra, and integrating various data sources for real-time analytics. I've supported enterprise-level projects at AAA and Kaleo Pharma, consistently delivering scalable data solutions that reduced processing times and enabled data-driven decision making. While I have 4 years of experience rather than the 7-8 mentioned, my work has been comprehensive and impactful across the full data engineering lifecycle.

Hussain Mohammad

Available to hire

Experience Level

Expert
Intermediate

Language

English
Fluent

Work Experience

Data Engineer at AAA (Contract - Knowac IT)
August 1, 2025 - August 1, 2025
Reduced reporting time by 65% by architecting end-to-end data pipelines that process 200M+ daily records with 98.2% uptime, enabling executives to access critical business metrics in near real-time. Eliminated data silos across 12+ enterprise systems by implementing a Collibra data governance framework, improving lineage tracking by 75% and preventing multiple critical reporting errors that previously cost the company $250K annually. Accelerated large-scale analytics from hours to minutes by optimizing Hive SQL processing for 50TB+ datasets, enabling same-day insights during peak sales periods.
Data Engineer at Kaleo Pharma (Contract - Knowac IT)
November 1, 2023 - November 1, 2023
Built real-time data pipelines processing millions of daily records using Apache Airflow, Python, and AWS Redshift, achieving 98.2% uptime and reducing reporting delays from days to hours. Developed API integration between Salesforce and AWS to automate real-time data extraction, improving data freshness by 40%. Resolved persistent data quality issues via validation frameworks and master data mapping, reducing integration errors by 65%. Designed optimized database schemas and ETL procedures, reducing processing time by 72% and enabling daily runs without extra infrastructure costs. Built end-to-end analytics solutions by transforming raw data into actionable dashboards, boosting cross-functional operational efficiency by 25%. Automated data cleansing workflows with Python, NumPy, and Pandas, processing millions of records daily and improving data accuracy by 85%. Led architectural improvements to data systems, improving query performance by 3x and lowering storage costs.
Jr. Data Scientist at DATA FACTZ
November 1, 2021 - November 1, 2021
Built real-time analytics pipelines using Spark and Python to process 500GB+ daily data, reducing processing time and infrastructure costs. Improved forecasting accuracy by 42% using LSTM neural networks for time-series analysis, contributing to better inventory planning and waste reduction. Migrated job orchestration from Cron to Apache NiFi, improving pipeline reliability. Deployed ML models (Random Forest, XGBoost, SVM) via Flask REST API for customer churn prediction.

Education

Master of Science, Business Analytics at Texas A&M University-Commerce, Texas, United States
Bachelor of Engineering at Deccan College of Engineering & Technology (Osmania University)

Qualifications

Microsoft Certified: Fabric Data Engineer Associate
Microsoft Certified: Azure AI Engineer Associate

Industry Experience

Software & Internet, Professional Services, Media & Entertainment, Other, Computers & Electronics