Su dheer Kumar

I am Su dheer Kumar, a Data Engineer with over five years of experience designing scalable, cloud-native data platforms on AWS, GCP, and Azure. I build real-time and batch pipelines with Spark (Databricks), Kafka, and Snowflake, following lakehouse principles, and orchestrate end-to-end workflows with dbt and Airflow to support low-latency analytics and reverse ETL. I work with engineering, analytics, and compliance teams to meet HIPAA, SOC 2, and GDPR requirements, and apply FinOps practices to optimize cloud spend.

Available to hire

Experience Level

Expert

Language

English
Fluent

Work Experience

Data Engineer at Clairvoyant
March 1, 2023 - Present
Designed lakehouse-style data platforms using Delta Lake and Snowflake to unify batch and streaming pipelines, reducing insight latency by 50%. Modeled analytics layers with dbt and orchestrated Airflow DAGs, improving code modularity and maintainability by 40%. Implemented real-time ingestion pipelines with Kafka and Spark (Databricks), lowering alert latency for operational dashboards by 70%. Created reverse ETL flows from Snowflake to marketing CRMs, improving lead tracking. Applied FinOps practices (S3 lifecycle, auto-scaling, resource tagging) to reduce cloud spend by 22%. Established SLA tracking and data validation using Great Expectations and Prometheus, decreasing production incidents. Automated Terraform-based provisioning of cloud and Kubernetes resources, accelerating onboarding by 60%.
Data Engineer at CueTech Systems
August 1, 2018 - January 31, 2022
Built scalable ETL and streaming pipelines using Spark, Kafka, and NiFi to support multi-source ingestion into Redshift and BigQuery. Migrated ETL workflows to Delta Lake for schema enforcement and time-travel, reducing data reprocessing incidents by 35%. Developed dbt-based models for curated finance and marketing layers consumed by BI teams in Tableau and Power BI. Tuned SQL queries in PostgreSQL, Redshift, and Oracle, achieving up to 40% runtime improvement. Created reverse ETL pipelines from BigQuery to Salesforce and HubSpot, enhancing CRM data accuracy. Implemented data governance using Apache Atlas and Alation for end-to-end lineage. Led automation of ETL deployment with Jenkins/GitLab CI, reducing change failure rate by 30%.

Education

Master of Science in Management Information Systems at Hood College
January 11, 2030 - January 7, 2026

Industry Experience

Software & Internet, Professional Services, Media & Entertainment