Su dheer Kumar

Data Scientist, Data Analyst, Developer, +6





Available to hire

I am Su dheer Kumar, a Data Engineer with over 5 years of experience designing scalable, cloud-native data platforms across AWS, GCP, and Azure.
I build real-time and batch pipelines using Spark (Databricks), Kafka, and Snowflake, following lakehouse principles. I orchestrate end-to-end data workflows with dbt and Airflow to enable low-latency analytics and reverse ETL. I also collaborate with engineering, analytics, and compliance teams to align with HIPAA, SOC 2, and GDPR standards, and I optimize cloud spend through FinOps practices.

Skills

Experience Level: Expert (17 skills)

Language

English

Fluent

Work Experience

Data Engineer at Clairvoyant

March 1, 2023 - Present

Designed lakehouse-style data platforms using Delta Lake and Snowflake to unify batch and streaming pipelines, reducing insight latency by 50%. Modeled analytics layers with dbt and orchestrated Airflow DAGs, improving code modularity and maintainability by 40%. Implemented real-time ingestion pipelines with Kafka and Spark (Databricks), lowering alert latency for operational dashboards by 70%. Created reverse ETL flows from Snowflake to marketing CRMs, improving lead tracking. Applied FinOps practices (S3 lifecycle, auto-scaling, resource tagging) to reduce cloud spend by 22%. Established SLA tracking and data validation using Great Expectations and Prometheus, decreasing production incidents. Automated Terraform-based provisioning of cloud and Kubernetes resources, accelerating onboarding by 60%.

Data Engineer at CueTech Systems

August 1, 2018 - January 31, 2022

Built scalable ETL and streaming pipelines using Spark, Kafka, and NiFi to support multi-source ingestion into Redshift and BigQuery. Migrated ETL workflows to Delta Lake for schema enforcement and time travel, reducing data reprocessing incidents by 35%. Developed dbt-based models for curated finance and marketing layers consumed by BI teams in Tableau and Power BI. Tuned SQL queries in PostgreSQL, Redshift, and Oracle, achieving up to 40% runtime improvement. Created reverse ETL pipelines from BigQuery to Salesforce and HubSpot, enhancing CRM data accuracy. Implemented data governance using Apache Atlas and Alation for end-to-end lineage. Led automation of ETL deployment with Jenkins and GitLab CI, reducing change failure rate by 30%.