Dhruv Patel

Available to hire

I am a Data Engineer with over four years of professional experience building data solutions across AWS, Azure, and Google Cloud. I specialize in developing ETL pipelines, data lakes, and data warehouses, and in automating the checks that keep data accurate and reliable. Using technologies such as Apache Spark, Databricks, Airflow, Power BI, and SQL, I optimize data processing and help business teams extract actionable insights.

I have a deep understanding of both legacy and modern data environments, including infrastructure-as-code with Terraform and cloud security through Identity and Access Management (IAM). Known for a collaborative working style, I translate workflows across settings, from academic research to enterprise-scale industry applications, always aiming to deliver scalable, secure, and efficient data solutions.

Work Experience

RA - Data Engineer at University of Toronto
April 30, 2025 - August 26, 2025
- Created and managed ETL pipelines with built-in data validation to ensure accuracy and integrity, and architected ETL frameworks for seamless transfers into data warehouses.
- Designed scalable data lakes in Amazon S3, improving retrieval speed by 30% using intelligent tiering and lifecycle policies.
- Deployed and maintained data pipelines on EC2 with Terraform automation, achieving 99.9% uptime, and implemented fine-grained IAM access controls for data security.
- Built and optimized data warehouses with Snowflake and Amazon Redshift, reducing query times by up to 40%.
- Developed event-driven ETL using AWS Lambda and S3 triggers, minimizing latency and cost.
- Processed datasets of over 5 TB with Apache Spark (PySpark) and integrated scikit-learn models for batch inference.
- Reduced ETL failures by 50% through performance tuning and schema fixes.
- Created interactive Power BI dashboards for near real-time metrics, and collaborated with data scientists to deploy real-time TensorFlow ML models.
- Used Amazon Athena for ad-hoc SQL queries.
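The event-driven ETL pattern above can be sketched as a Lambda handler that reacts to S3 object-created events. This is an illustrative sketch, not the original implementation: the prefix names are hypothetical placeholders, and the boto3 load step is indicated in a comment so the routing logic stands on its own.

```python
import urllib.parse

# Assumed landing-zone and staging prefixes (hypothetical names).
RAW_PREFIX = "raw/"
STAGED_PREFIX = "staged/"

def handler(event, context=None):
    """Map each S3 ObjectCreated record to a staged output key.

    S3 event notifications URL-encode object keys, so decode first.
    """
    staged_keys = []
    for record in event.get("Records", []):
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if not key.startswith(RAW_PREFIX):
            continue  # ignore objects outside the landing zone
        # e.g. raw/orders/2025-08-01.csv -> staged/orders/2025-08-01.csv
        staged_keys.append(STAGED_PREFIX + key[len(RAW_PREFIX):])
        # A real pipeline would transform and load here, e.g. reading the
        # object body with boto3's s3.get_object before writing it out.
    return {"staged": staged_keys}
```

Because the trigger fires per object, this shape only pays for compute when new data lands, which is where the latency and cost savings come from.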
Data Engineer at EllisDon
April 30, 2024 - August 26, 2025
- Built scalable ETL pipelines using Python, Spark, Azure Data Factory, Azure Data Lake, and Azure SQL for efficient data ingestion.
- Maintained and improved Databricks notebooks, reducing cloud infrastructure costs.
- Implemented Apache Airflow to automate data workflows and metadata management, increasing pipeline visibility.
- Introduced Git collaboration standards for the data engineering and analytics teams.
- Governed data lake quality and modeled data with Spark SQL and PySpark to build Lakehouse and warehouse layers.
- Applied DevOps practices, integrating CI/CD pipelines and infrastructure automation for reliable deployments.
- Optimized SQL databases and data warehouses, improving query speed by 60-70%.
- Updated Teradata SQL data warehouse models and ensured seamless data validation across Google BigQuery integrations.
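The data-validation work mentioned in these roles typically amounts to checks that run after a load and block publication on failure. A minimal sketch of that idea, assuming a simple row-count and required-column check (the function and column names are illustrative, not from the original pipelines):

```python
# Hypothetical post-load validation: compare row counts between source and
# target, and check required columns for nulls. Returns failure messages;
# an empty list means the load passes.
def validate_load(source_rows, loaded_rows, required_columns):
    """Return a list of human-readable validation failures."""
    failures = []
    if len(loaded_rows) != len(source_rows):
        failures.append(
            f"row count mismatch: source={len(source_rows)} "
            f"loaded={len(loaded_rows)}"
        )
    for col in required_columns:
        nulls = sum(1 for row in loaded_rows if row.get(col) is None)
        if nulls:
            failures.append(f"column '{col}' has {nulls} null value(s)")
    return failures
```

In an orchestrated workflow (Airflow, for instance), a task would call a check like this and raise when the returned list is non-empty, so bad data never reaches downstream consumers.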
Assistant Manager at Atul Ltd
November 30, 2022 - August 26, 2025
- Designed and optimized schema and dimensional data models on a Hadoop big-data platform.
- Reduced warehouse costs by 15% by optimizing table distribution based on data skew.
- Orchestrated batch data workflows using PySpark, reducing execution times by 20%.
- Developed scalable, clean, and performant data pipelines for high-traffic environments using Kubernetes.
- Utilized SQL and Spark for analytics and data transformations, cutting processing time by 15%.
- Collaborated cross-functionally with R&D teams to understand their workflows and draft data warehouse management solutions supporting drug discovery.
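The skew-based distribution tuning mentioned above can be illustrated with a small sketch: given row counts per distribution-key value, flag keys that hold an outsized share of the data. The 5x threshold is an assumed cutoff for illustration, not a value from the original work.

```python
# Illustrative skew detection: a key is flagged when it holds more than
# `threshold` times its fair share of rows (fair share = 1 / number of
# distinct keys). The 5.0 default is an assumption, not a standard.
def skewed_keys(key_counts, threshold=5.0):
    """Return distribution-key values whose row share is disproportionate."""
    total = sum(key_counts.values())
    if not total:
        return []
    fair_share = 1.0 / len(key_counts)
    return sorted(
        k for k, n in key_counts.items()
        if n / total > threshold * fair_share
    )
```

Keys flagged this way are candidates for salting or a different distribution strategy, which is how uneven table distribution (and the cost it causes) gets addressed in practice.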

Industry Experience

Education, Financial Services, Manufacturing, Software & Internet, Professional Services
