Hello! I’m Safoura, a data engineer who builds high-performance data pipelines and architectures in cloud environments. I specialize in Python, PySpark, and Azure Databricks to power AI/ML workloads across the tech stack. I thrive in Agile teams, focusing on data quality, reliability, and scalability, and I enjoy mentoring colleagues in best practices and CI/CD automation.

Safoura Janosepah

Available to hire

Work Experience

Software / Data Engineer at Pitt HexAI Research Lab, University of Pittsburgh
June 1, 2023 - November 25, 2025
Designed and deployed batch and streaming pipelines in Databricks using PySpark and Azure Data Factory to support AI/ML workloads. Integrated data quality checks, profiling, and validation steps to ensure reliable datasets for AI models. Leveraged GitHub Actions for CI/CD deployment of pipeline code and used Airflow for orchestration with automated health alerts. Performed advanced data exploration, debugging, and root cause analysis to resolve data quality issues in health datasets.
Senior Data Engineer at CGI Inc. and Scotiabank
April 1, 2023 - April 1, 2023
Developed and optimized high-throughput pipelines using Python, PySpark, and Azure Databricks, improving ingestion efficiency by 20%. Created reusable CI/CD workflows in GitHub Actions for automated testing and deployment. Migrated structured/unstructured data to Azure, implementing Medallion architecture for analytics. Applied data profiling and validation frameworks and built monitoring dashboards with alerts for production pipeline health.
Data/ML Engineer at AIX Inc.
May 1, 2022 - May 1, 2022
Designed scalable ETL pipelines in Databricks and Azure Data Factory for AI-driven analytics. Applied Python/PySpark for batch and streaming ingestion, integrating data quality frameworks. Collaborated with data science teams to troubleshoot and optimize pipelines for ML readiness.
Senior Software / Data Engineer at Mahyar
November 1, 2021 - November 1, 2021
Designed and implemented ETL/ELT workflows for structured and unstructured data using Azure Data Factory, Databricks, and PySpark. Built and maintained data architectures integrating multiple systems via Azure and Databricks. Developed real-time and batch ingestion pipelines with Airflow orchestration and monitoring, and provided CI/CD deployment training for Azure-based environments.
ML Engineer & Data Engineer at Wolseley Inc.
September 1, 2020 - September 1, 2020
Extracted data from Snowflake for transformation; implemented feature engineering and loaded data into Azure Data Lake Storage and Azure SQL Database. Automated pipeline execution in Azure Databricks with Airflow orchestration and custom monitoring. Performed advanced data exploration and performance tuning for prediction models.
Visitor Research Fellow at University of Toronto
September 1, 2019 - September 1, 2019
Collaborated with the IoT team on research and development initiatives. Developed query profiling and APIs using Python, MongoDB, and Elasticsearch. Built multiple SDN controllers with Docker Swarm and Kubernetes.
Senior Data Modeler & Oracle Developer at Payeshgaran Co.
August 1, 2017 - August 1, 2017
Developed and optimized complex queries using PL/SQL to create functions and procedures. Created and modified scripts and database objects (stored procedures, functions, triggers, packages, views) with Oracle for transactional databases. Troubleshot DB objects, packages, triggers, and ERP issues for updates.
Senior Data Engineer & PL/SQL Developer at Samaneh Saze Morvarid
August 1, 2016 - August 1, 2016
Worked with Business Units to develop, approve, and implement new databases for various applications (Accounting, HR, Automation Systems, Student Health Records, etc.). Developed and maintained ERP systems using RDBMS, Oracle Forms, SQL, and PL/SQL. Migrated data between Oracle and SQL Server and built complex database objects (Stored Procedures, Functions, Packages, Triggers).

Education

Certificate of Data Science & Artificial Intelligence at University of Toronto
M.Sc. & B.Sc., Software Computer Engineering at Azad University

Qualifications

Certificate of Data Science & Artificial Intelligence

Industry Experience

Financial Services, Software & Internet, Professional Services
