Available to hire
I’m a data engineer with 5+ years of experience delivering cloud-native data platforms across AWS, Azure, and GCP. I design, optimize, and operate high-volume pipelines using orchestration and streaming tools in the vein of Apache NiFi (Airflow, Kafka, custom ETL frameworks) to enable reliable data ingestion and processing at scale.
I excel at Snowflake performance tuning, Linux administration, and infrastructure automation with Terraform and Ansible. I’m proficient in Python and SQL, and I collaborate with teams to deliver secure, scalable data solutions, backed by strong DevOps practices and monitoring with ELK, Grafana, Prometheus, and CloudWatch.
Skills
Experience Level
Expert
Language
English
Fluent
Work Experience
Azure Data Engineer at Goldman Sachs
April 1, 2024 - Present
Developed a reusable ETL framework using Spark and Hive, cutting data migration times by 50% and ingesting over 1 million daily records into Azure Data Lake. Optimized Hive queries to improve performance by 35%, integrated T-SQL into Azure DevOps for CI/CD automation, and migrated 5TB+ of healthcare data to Snowflake with significant query speed improvements. Implemented containerization with Docker and automated deployments through CI/CD pipelines. Monitored server health and system performance with tools such as Nagios, AWS CloudWatch, and the ELK Stack, supporting proactive incident management.
AWS Data Engineer at Aledade
March 31, 2024 - August 26, 2025
Led migration of 5TB+ of healthcare data to Snowflake using Python, improving query speeds by 30%. Transitioned monolithic AWS services to serverless architectures with Lambda and Kinesis, achieving a 25% cost reduction. Developed Spark Streaming apps for processing Kafka data and managed cloud automation and infrastructure with Puppet, Terraform, and containerized deployments. Maintained real-time monitoring dashboards with Kibana and Elasticsearch. Made extensive use of AWS services including S3, EC2, EMR, and Redshift, alongside MongoDB.
GCP Data Engineer at ABB
November 30, 2022 - August 26, 2025
Gathered requirements and designed Hadoop big data projects aligned with business needs. Automated data pipelines on Azure and GCP using Control-M and Airflow. Developed Spark and Databricks applications for customer analytics, and built scalable cloud-native solutions with GCP services such as Dataproc, BigQuery, Dataflow, and Cloud Functions. Decreased data ingestion latency by 40% with streaming pipelines and created business intelligence reports using GCP Data Studio. Managed large datasets with Pandas and Cosmos DB.
Data Engineer at Tata Power
June 30, 2020 - August 26, 2025
Engineered PySpark ETL pipelines processing 2TB daily, increasing throughput by 45%. Migrated datasets from ADLS Gen2 to Databricks via ADF pipelines and developed PySpark notebooks for data analysis. Used AWS Glue, Lambda, S3, Redshift, and Terraform for scalable ETL pipeline automation. Managed Kafka ingestion from REST endpoints and used Spark on Hadoop YARN for large-scale analytics. Built and monitored Airflow DAGs, with Ansible, Docker, Jenkins, and Bamboo for CI/CD orchestration. Optimized Snowflake queries to reduce cost and latency.
Azure Data Engineer at Goldman Sachs
April 1, 2024 - Present
Designed and operated scalable ETL/data flows (Airflow, Kafka, Spark, Hive), ingesting 1M+ daily records into Azure Data Lake with secure data handling (TLS/SSL, RBAC). Optimized Snowflake queries and clustering for 5TB+ healthcare datasets, improving performance by 30%+ while ensuring HIPAA/SOC2 compliance. Automated deployments and infrastructure provisioning with Terraform, Ansible, and Azure DevOps. Strengthened Linux server administration and shell scripting for data pipeline automation. Built CI/CD pipelines integrating GitHub, Docker, and Jenkins for streamlined development workflows.
AWS Data Engineer at Aledade
March 31, 2024 - September 24, 2025
Built real-time data flows with Kafka, Spark Streaming, and AWS Kinesis. Developed serverless architectures (Lambda, S3, EC2, EMR, Redshift) with Terraform and Puppet for automated deployments. Configured Linux-based cloud environments with shell scripts for job management, log collection, and resource monitoring. Automated cloud data migrations and configurations, created near real-time monitoring dashboards in Kibana, and integrated logs into Elasticsearch for visibility. Containerized apps and set up CI/CD with Git, Jenkins, Gradle, and Kubernetes.
GCP Data Engineer at ABB
November 30, 2022 - September 24, 2025
Gathered requirements for Hadoop/Big Data projects and automated end-to-end data pipelines on Azure using Control-M. Developed Spark apps in Databricks (Spark SQL, PySpark) for customer usage analytics. Implemented Linux-based orchestration and Terraform-controlled provisioning. Integrated data from APIs and storage into BigQuery over TLS/SSL. Delivered monitoring with Grafana, Prometheus, and Stackdriver, and translated requirements into compliant, cost-optimized cloud data solutions.
Data Engineer at Tata Power
June 30, 2020 - September 24, 2025
Engineered PySpark ETL pipelines processing over 2TB of data daily, increasing data processing throughput by 45% and enabling timely reporting for operational teams. Migrated large datasets from ADLS Gen2 to Databricks using ADF pipelines, and developed PySpark notebooks for data extraction, transformation, and analysis in Databricks. Worked with AWS services including Glue, Lambda, S3, and Redshift. Used Kafka and Airflow for distributed messaging and automation. Managed infrastructure with Terraform and Ansible, automating provisioning to ensure high availability and scalability. Performed Linux administration tasks to support data workflows. Built monitoring solutions with Splunk and Ansible playbooks to automate incident response and alerts.
Education
Master of Science in Computer Science at The University of Texas at Arlington, TX, USA
January 1, 2023 - December 31, 2024
Qualifications
Industry Experience
Healthcare, Retail, Financial Services, Telecommunications, Software & Internet, Computers & Electronics, Professional Services