Siva Naga Chaitanya Illa

Data Scientist, Data Analyst, Full Stack Developer, +2





Available to hire

Hi, I’m Siva Naga Chaitanya Illa, a data engineer with around 5 years of experience designing, building, and optimizing scalable data pipelines across cloud platforms (AWS, Azure, GCP) and on-prem environments. I specialize in end-to-end data architectures using Azure Synapse, PySpark, and Azure Data Factory, delivering analytics-ready datasets for healthcare, finance, and business operations. I’m passionate about data quality, automation, and enabling actionable insights through robust ETL/ELT workflows and collaborative BI solutions.

Beyond delivering reliable data, I focus on performance tuning, governance, and secure data handling for HIPAA-compliant healthcare data and regulated financial data. I enjoy collaborating with analysts, compliance, and product teams to translate requirements into scalable, maintainable solutions, and I frequently implement CI/CD, monitoring, and automated testing to reduce manual effort and accelerate deployments.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Intermediate

Intermediate

Intermediate

Intermediate

Work Experience

Senior Data Engineer at McKesson

June 1, 2025 - Present

Designed and implemented end-to-end healthcare data pipelines ingesting EHR, claims, lab, pharmacy, and eligibility data from multiple source systems into cloud data lakes (Azure ADLS / AWS S3). Built scalable PySpark and Spark SQL transformation frameworks to standardize healthcare datasets across patient, provider, encounter, diagnosis, and procedure domains. Integrated HL7 and FHIR APIs, REST services, and batch ingestion, enabling near real-time clinical and operational analytics. Implemented HIPAA compliance standards, including encryption at rest and in transit, RBAC, secure key management, and masking of PHI/PII data elements. Developed incremental and CDC-based ingestion pipelines, reducing data latency and ensuring timely availability of healthcare data for downstream reporting. Implemented Slowly Changing Dimension (SCD Type 2) logic and designed healthcare-optimized data models in Snowflake / Azure Synapse. Built automated ETL/ELT workflows using Azure Data Factory / AWS Glu

Azure/AWS Data Engineer at Zions Bancorporation

April 1, 2024 - May 1, 2025

Designed and developed big data pipelines in Azure Synapse using PySpark for large-scale data processing and analytics. Implemented end-to-end ingestion frameworks with Azure Data Factory, Azure Data Lake, and Cosmos DB as sources. Created Power BI dashboards to visualize processed datasets, enabling data-driven decisions. Performed data validation and quality checks on new data streams, ensuring high accuracy and regulatory compliance. Set up monitoring and alerting for ADF pipelines to proactively identify failures and performance issues. Maintained technical specifications and wiki documentation for new pipeline implementations. Collaborated using Azure DevOps (ADO) for version control, work item tracking, and CI/CD automation. Supported marketing campaign data integration and product telemetry pipelines, ensuring timely data delivery to analytics teams. Created parameterized notebooks in Databricks to process batch and streaming data, reducing manual intervention and increasing fle

Data Engineer at DXC Technology

June 1, 2022 - December 1, 2022

Integrated Azure Data Factory pipelines with diverse data sources, applying transformations for analytics readiness. Enhanced pipeline automation with ADO Git-based CI/CD workflows, reducing deployment times and errors. Built and managed data pipelines on AWS EC2, S3, EMR, Redshift, and Lambda, ensuring a robust, fault-tolerant architecture. Designed Spark-based data processing workflows using RDD transformations, the DataFrame API, and Spark SQL for efficient data handling. Integrated CI/CD pipelines using GitHub, AWS CodePipeline, Jenkins, and Elastic Beanstalk, accelerating deployment cycles and improving developer productivity. Optimized Redshift data warehouses, tuning complex queries, designing indexes, and enhancing cluster performance. Automated event-driven data processing using AWS Lambda with DynamoDB streams, enabling near real-time analytics and reporting. Designed and maintained Snowflake data models using WhereScape, supporting datasets for analytics and reporting pipel

Junior Data Engineer at Polestar Analytics

November 1, 2020 - May 1, 2022

Developed PySpark-based ETL pipelines in Azure Synapse, improving reporting dataset performance. Assisted in building Power BI reports for campaign performance and sales analysis. Refactored PySpark workflows to optimize memory usage and data processing efficiency. Implemented data validation pipelines using Great Expectations and custom Python scripts to ensure data accuracy before ingestion. Automated Airflow workflows to ensure timely data availability for reporting teams. Integrated e-commerce datasets from Myntra, Amazon, Flipkart, and Nykaa via REST APIs for unified reporting. Designed data archival strategies using S3 lifecycle policies to optimize storage costs and improve data governance. Delivered processed datasets to BI teams and supported real-time dashboards.