

Ruthvik Kumar Yadav

Data Scientist, Web Developer, Programmer, +2






Available to hire

I am a cloud data engineer who designs and builds scalable batch and real-time data pipelines on Google Cloud Platform and other modern tools. I enjoy solving complex data problems, automating workflows, and delivering actionable analytics through dashboards. I thrive in cross-functional teams and love turning data into insights that drive business impact.

In my work, I combine practical engineering with a focus on performance and maintainability, always aiming for robust, scalable solutions and clear communication with data and infra teams.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Intermediate

Work Experience

Cloud Data Engineer at Independent Projects (GCP, Kafka, Spark)

November 1, 2025 - November 1, 2025

Designed and deployed an end-to-end batch ETL pipeline for 2.7M+ NYC Taxi records using GCS → Dataproc (PySpark) → BigQuery. Automated ingestion of raw CSVs into GCS with Python scripts and Airflow DAGs for daily processing. Implemented PySpark jobs to clean, transform, and partition data for analytics. Loaded optimized, partitioned data into BigQuery and created analytical views for revenue trends, trip behaviors, and KPIs. Built an interactive Looker Studio dashboard for insights. Automating ingestion, transformation, and orchestration reduced manual workload by 80%.

Also built a real-time streaming ETL pipeline that ingests taxi trip events via Apache Kafka and processes them with Spark Structured Streaming: configured brokers and topics, parsed JSON events, normalized timestamps, and wrote continuous micro-batch outputs with checkpointing.
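
The JSON parsing and timestamp normalization step can be illustrated with a minimal sketch in plain Python rather than Spark. This is not the project's actual code: the function name `parse_trip_event` and the field names `trip_id`, `pickup_datetime`, and `fare_amount` are assumptions made for illustration, as is the choice to treat timestamps without an offset as UTC.

```python
import json
from datetime import datetime, timezone


def parse_trip_event(raw: str) -> dict:
    """Parse one JSON trip event and normalize its pickup time to UTC.

    Field names are hypothetical; the real event schema is not given here.
    """
    event = json.loads(raw)

    # Accept ISO-8601 strings with or without a UTC offset.
    pickup = datetime.fromisoformat(event["pickup_datetime"])
    if pickup.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        pickup = pickup.replace(tzinfo=timezone.utc)
    pickup_utc = pickup.astimezone(timezone.utc)

    return {
        "trip_id": event["trip_id"],
        "pickup_utc": pickup_utc.isoformat(),
        # A date column like this is a common partition key downstream.
        "pickup_date": pickup_utc.date().isoformat(),
        "fare_amount": float(event["fare_amount"]),
    }
```

In a streaming job the same logic would typically be expressed with Spark's built-in JSON and timestamp functions, so that it runs inside each micro-batch rather than in a Python loop.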