Siva Naga Chaitanya Illa

Available to hire

Hi, I’m Siva Naga Chaitanya Illa, a data engineer with around 5 years of experience designing, building, and optimizing scalable data pipelines across cloud platforms (AWS, Azure, GCP) and on-prem environments. I specialize in end-to-end data architectures using Azure Synapse, PySpark, and Azure Data Factory, delivering analytics-ready datasets for healthcare, finance, and business operations. I’m passionate about data quality, automation, and enabling actionable insights through robust ETL/ELT workflows and collaborative BI solutions.

Beyond delivering reliable data, I focus on performance tuning, governance, and secure data handling for HIPAA-compliant healthcare data and regulated financial data. I enjoy collaborating with analysts, compliance, and product teams to translate requirements into scalable, maintainable solutions, and I frequently implement CI/CD, monitoring, and automated testing to reduce manual effort and accelerate deployments.

Experience Level

Expert (8 skills listed)
Intermediate (4 skills listed)

Work Experience

Senior Data Engineer at McKesson
June 1, 2025 - Present
Designed and implemented end-to-end healthcare data pipelines ingesting EHR, claims, lab, pharmacy, and eligibility data from multiple source systems into cloud data lakes (Azure ADLS / AWS S3). Built scalable PySpark and Spark SQL transformation frameworks to standardize healthcare datasets across patient, provider, encounter, diagnosis, and procedure domains. Integrated HL7 and FHIR APIs, REST services, and batch ingestion, enabling near real-time clinical and operational analytics. Implemented HIPAA compliance standards, including encryption at rest and in transit, RBAC, secure key management, and masking of PHI/PII data elements. Developed incremental and CDC-based ingestion pipelines, reducing data latency and ensuring timely availability of healthcare data for downstream reporting. Implemented Slowly Changing Dimension (SCD Type 2) logic and designed healthcare-optimized data models in Snowflake / Azure Synapse. Built automated ETL/ELT workflows using Azure Data Factory / AWS Glu
Azure/AWS Data Engineer at Zions Bancorporation
April 1, 2024 - May 1, 2025
Designed and developed big data pipelines in Azure Synapse using PySpark for large-scale data processing and analytics. Implemented end-to-end ingestion frameworks with Azure Data Factory, Azure Data Lake, and Cosmos DB as sources. Created Power BI dashboards to visualize processed datasets, enabling data-driven decisions. Performed data validation and quality checks on new data streams, ensuring high accuracy and regulatory compliance. Set up monitoring and alerting for ADF pipelines to proactively identify failures and performance issues. Maintained technical specifications and wiki documentation for new pipeline implementations. Collaborated using Azure DevOps (ADO) for version control, work item tracking, and CI/CD automation. Supported marketing campaign data integration and product telemetry pipelines, ensuring timely data delivery to analytics teams. Created parameterized notebooks in Databricks to process batch and streaming data, reducing manual intervention and increasing fle
Data Engineer at DXC Technology
June 1, 2022 - December 1, 2022
Integrated Azure Data Factory pipelines with diverse data sources, applying transformations for analytics readiness. Enhanced pipeline automation with ADO Git-based CI/CD workflows, reducing deployment times and errors. Built and managed data pipelines on AWS EC2, S3, EMR, Redshift, and Lambda, ensuring a robust, fault-tolerant architecture. Designed Spark-based data processing workflows using RDD transformations, the DataFrame API, and Spark SQL for efficient data handling. Integrated CI/CD pipelines using GitHub, AWS CodePipeline, Jenkins, and Elastic Beanstalk, accelerating deployment cycles and improving developer productivity. Optimized Redshift data warehouses by tuning complex queries, designing sort and distribution keys, and enhancing cluster performance. Automated event-driven data processing using AWS Lambda with DynamoDB streams, enabling near real-time analytics and reporting. Designed and maintained Snowflake data models using WhereScape, supporting datasets for analytics and reporting pipel
Junior Data Engineer at Polestar Analytics
November 1, 2020 - May 1, 2022
Developed PySpark-based ETL pipelines in Azure Synapse, improving reporting dataset performance. Assisted in building Power BI reports for campaign performance and sales analysis. Refactored PySpark workflows to optimize memory usage and data processing efficiency. Implemented data validation pipelines using Great Expectations and custom Python scripts to ensure data accuracy before ingestion. Automated Airflow workflows to ensure timely data availability for reporting teams. Integrated e-commerce datasets from Myntra, Amazon, Flipkart, and Nykaa via REST APIs for unified reporting. Designed data archival strategies using S3 lifecycle policies to optimize storage costs and improve data governance. Delivered processed datasets to BI teams and supported real-time dashboards.

Education

Electronics & Communication Engineering at JNTUK, India
January 11, 2030 - January 29, 2026
Information Technology & Project Management at Indiana Wesleyan University, Indiana
January 11, 2030 - January 29, 2026

Industry Experience

Healthcare, Financial Services, Software & Internet, Professional Services, Other