Sumedha Chaluvadi

Available to hire

I’m Sumedha Chaluvadi, a data engineer with 4+ years of experience building and scaling cloud-based data platforms across healthcare, security, and analytics domains. I design ETL/ELT, streaming, and analytics pipelines using AWS, Azure, and GCP, with a strong focus on Spark, Kafka, Airflow, and dbt. I enjoy turning complex data into reliable, actionable insights and collaborating with product, security, and BI teams to empower data-driven decisions.

I’ve delivered high-volume batch and real-time pipelines, improved data reliability, and accelerated onboarding of new data sources. I value automation, data quality, and enabling self-serve analytics for business users through tools like Tableau and Power BI.

Experience Level

Expert

Language

English
Advanced

Work Experience

Data Engineer I at Takeda Pharmaceuticals
August 1, 2023 - Present
Loaded clinical and manufacturing data from APIs and databases into S3 and Redshift using AWS Glue, Python, and SQL, stabilizing daily analytics delivery and reducing manual reporting by 40%. Replaced legacy batch jobs with metadata-driven pipelines using Airflow and dbt, enabling 35% faster onboarding of new data sources. Implemented Kafka and Spark Structured Streaming to stream manufacturing events at 50K+ records/sec for near real-time plant visibility. Refactored data models into star/snowflake schemas in Redshift and Snowflake, boosting complex query performance by 30%. Implemented data quality checks with Great Expectations, reducing reporting errors by 35%. Improved throughput of large analytics jobs by tuning Spark/Databricks workloads; automated deployments with Jenkins, Git, Docker, and Kubernetes. Enabled self-service reporting with Tableau/Power BI connected to Redshift and Athena, cutting ad-hoc SQL requests by 40%.
Data Analytics Engineer at JumpCloud
February 1, 2020 - July 1, 2022
Implemented automated data ingestion from REST APIs, PostgreSQL, and S3 using Python, SQL, and Airflow, expanding analytics-ready datasets by 50%. Redesigned legacy batch jobs into ELT pipelines in Snowflake with dbt and AWS Glue, reducing data delivery time by 45% and enabling multiple daily metric refreshes. Processed large volumes of identity/security logs with Spark (PySpark/Scala) on Hadoop; improved job stability by tuning partitions and memory. Introduced star/snowflake schemas in Snowflake to accelerate BI queries and dashboards. Centralized orchestration by linking Airflow with AWS Glue and Azure Data Factory, reducing maintenance effort by 30%. Implemented Python/SQL-based validation with Great Expectations, reducing inconsistencies and increasing KPI confidence. Managed schema changes through metadata-driven controls to prevent Tableau/Power BI breakages. Collaborated with product managers, security analysts, and BI teams to model reliable user activity and access events.

Education

Master of Science in Computer Science at Northern Arizona University
August 1, 2022 - December 1, 2023

Qualifications

AWS Certified Solutions Architect - Associate
January 11, 2030 - January 30, 2026
AWS Certified Cloud Practitioner
January 11, 2030 - January 30, 2026
Microsoft Certified: Azure Data Engineer Associate (DP-203)
January 11, 2030 - January 30, 2026
Microsoft Certified: Azure AI Engineer Associate
January 11, 2030 - January 30, 2026
Google Data Analytics Professional Certificate
January 11, 2030 - January 30, 2026
IBM Data Engineering Professional Certificate - Coursera
January 11, 2030 - January 30, 2026

Industry Experience

Healthcare, Manufacturing, Software & Internet, Professional Services