I’m Sathvika Reddy, a Senior Cloud Data Engineer with 10 years of experience building scalable multi‑cloud data platforms across AWS, GCP, and Azure. I love turning complex data into reliable, high‑impact solutions that empower underwriting, risk assessment, and business decisions. From real-time streaming to ML-powered analytics and governance, I craft end-to-end pipelines, containerized deployments, and automated CI/CD to drive faster insights and measurable value for stakeholders.

Sathvika Reddy

Available to hire

Experience Level

Expert

Work Experience

Senior GCP Data Engineer at The Texas Farm Bureau Group
April 1, 2023 - Present
Architected and deployed scalable multi-cloud data platforms on GCP and Azure processing multi-terabyte datasets for enterprise underwriting and risk assessment, supporting 1M+ policy evaluations annually. Engineered real-time streaming pipelines using Kafka and GCP Pub/Sub delivering sub-second underwriting insights, cutting decision latency by 60%. Optimized BigQuery and Snowflake workloads via partitioning, clustering, and materialized views, achieving 35% faster queries and $50K annual cost savings. Developed ML pipelines with MLflow and BigQuery ML to improve underwriting risk scoring by 35%, reducing claim losses by 20%. Containerized 50+ data workloads with Docker and deployed on Kubernetes, automating infrastructure with Terraform to reduce deployment time by 40% and eliminate drift. Implemented CI/CD with Jenkins and GitHub Actions; designed REST APIs and event-driven microservices; orchestrated workflows with Airflow/Cloud Composer; ensured SOC2 compliance.
Senior Data Engineer at M&T Bank
March 1, 2021 - April 1, 2023
Built enterprise-scale multi-cloud data platforms across AWS and GCP to support personalized banking solutions for 10M+ customers. Developed distributed Spark and PySpark pipelines processing 3TB+ daily transaction data with 99.95% data quality. Implemented real-time ingestion pipelines using Kafka, AWS Kinesis, and GCP Pub/Sub handling 500K events/second with sub-100ms latency. Designed and optimized Snowflake, BigQuery, and Redshift warehouses reducing query runtimes by 35% through partitioning and clustering. Engineered ML pipelines with Spark MLlib and Databricks for customer segmentation and churn prediction achieving 92% model accuracy. Built 100+ REST APIs and event-driven microservices using FastAPI; containerized pipelines with Docker and Kubernetes reducing infrastructure costs by 30%. Implemented CI/CD pipelines with Jenkins, GitHub Actions, and Azure DevOps automating 150+ data workflows. Delivered Power BI and Tableau dashboards; automated data acquisition reducing manual
Data Engineer at State of Texas
September 1, 2018 - February 1, 2021
Designed hybrid cloud architectures across AWS and Azure supporting the Texas Public Services Portal, serving 5M+ citizens. Built end-to-end ETL pipelines using AWS Glue, Azure Data Factory, and PySpark processing 2TB+ of government data daily. Implemented Delta Lake on Azure Databricks enabling auditable, versioned datasets for compliance and regulatory reporting. Developed ML pipelines using Azure ML and AWS SageMaker for predictive analytics, improving service delivery efficiency by 25%. Designed REST APIs and event-driven microservices for secure data services with OAuth2/JWT. Containerized ETL workloads using Docker and AKS, achieving 99.8% availability. Automated infrastructure provisioning using Terraform and CloudFormation, reducing manual deployment by 60%. Integrated analytics with Power BI dashboards providing real-time executive reporting.
AWS Data Engineer at Molina Healthcare
June 1, 2015 - August 1, 2018
Designed enterprise data lakes on AWS S3 storing 10PB+ healthcare datasets with lifecycle policies reducing storage costs by 40%. Developed ETL pipelines using AWS Glue, EMR (PySpark), and Lambda processing 1TB+ daily HIPAA-compliant healthcare data. Implemented real-time streaming pipelines using AWS Kinesis processing 200K health events/second for predictive monitoring. Built Delta Lake pipelines enabling versioned, auditable datasets for regulatory compliance and clinical analytics. Engineered ML pipelines using AWS SageMaker and PySpark for predictive health risk scoring achieving 88% accuracy. Designed secure REST APIs with IAM-based authentication. Containerized data processing workloads using Docker and Kubernetes improving resource utilization by 35%. Automated infrastructure provisioning using Terraform and CloudFormation ensuring consistent deployments. Optimized Redshift clusters reducing query execution time by 45% through distribution keys and sort keys. Developed Power BI

Education

Master of Science in Data Science at University of Texas at Dallas
Bachelor of Technology in Electrical & Electronics Engineering at Bhoj Reddy Engineering College for Women

Qualifications

Microsoft Certified: DP-203 Data Engineering on Microsoft Azure
Microsoft Certified: Azure Fundamentals
Google Data Analytics Professional Certification
SQL Certification - University of Colorado Boulder
Cisco Verified: Introduction to Data Science

Industry Experience

Financial Services, Healthcare, Government, Professional Services, Software & Internet