NIKHILSAI GALI

Available to hire

Hi, I’m NIKHILSAI GALI, a Senior Azure Data Engineer with 11+ years of experience delivering enterprise-scale data platforms. I specialize in building scalable Lakehouse architectures on Azure Databricks with Delta Lake, Unity Catalog, and the Medallion architecture, optimizing PySpark pipelines, and automating data workflows with Azure Data Factory, Synapse, and Microsoft Fabric. My work spans ingestion, real-time streaming, data governance, and cost optimization across complex multi-source environments.

I thrive in cross-functional teams, translating business needs into reliable data solutions that drive faster, compliant insights. I focus on data quality, security, and governance while delivering scalable, maintainable data architectures and CI/CD automation to accelerate delivery and reduce operational costs.

Experience Level

Expert

Work Experience

Senior Azure Data Engineer at The OCC
April 1, 2022 - November 6, 2025
Architected an enterprise-grade Lakehouse on Azure Databricks with Delta Lake, Unity Catalog, and the Medallion architecture, processing 15TB+ of daily data through PySpark optimization and Microsoft Fabric OneLake integration, and implementing ACID transactions and comprehensive data governance frameworks with 99.9% data accuracy across production workloads. Developed large-scale Azure Data Factory (ADF) pipelines orchestrated with Microsoft Fabric Data Pipelines, managing 200TB+ of daily transactional data ingestion from 50+ heterogeneous sources (SQL Server, Oracle, Teradata, Snowflake, PostgreSQL) using Copy Activity, Mapping Data Flows, Lookup Activity, and Linked Services with automated CDC and batch processing. Engineered real-time streaming solutions with Azure Stream Analytics, Azure Event Hubs with Kafka integration, and Delta Live Tables (DLT), processing 100K+ events per second while maintaining 99.9% uptime through automated failover.
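The bronze-to-silver promotion at the heart of a Medallion lakehouse can be sketched as follows. This is an illustrative stand-in only: in the role above the logic ran as PySpark against Delta tables, whereas this sketch uses plain Python, and the record fields (`id`, `amount`, `updated_at`) are hypothetical.

```python
from datetime import datetime

def promote_to_silver(bronze_rows):
    """Cleanse and deduplicate raw bronze records into a silver set.

    Bronze rows are dicts with 'id', 'amount', and an ISO-8601
    'updated_at'. Keeps the latest version of each id and drops rows
    that fail validation, mirroring the dedupe/cleanse step of a
    bronze -> silver Medallion hop.
    """
    latest = {}
    for row in bronze_rows:
        # Validation: silver requires a key and a parseable timestamp.
        if row.get("id") is None or "updated_at" not in row:
            continue
        try:
            ts = datetime.fromisoformat(row["updated_at"])
            amount = float(row["amount"])
        except (TypeError, ValueError):
            continue  # malformed record; excluded from silver
        current = latest.get(row["id"])
        # Deduplicate: keep only the most recent version per id.
        if current is None or ts > current["_ts"]:
            latest[row["id"]] = {"id": row["id"], "amount": amount,
                                 "updated_at": row["updated_at"], "_ts": ts}
    # Strip the helper timestamp field before emitting silver rows.
    return [{k: v for k, v in r.items() if k != "_ts"}
            for r in latest.values()]
```

In Databricks the same pattern is usually expressed as a windowed dedupe plus a Delta MERGE; the validation-and-latest-wins logic is identical.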
Senior Azure Data Engineer at US Bank
October 1, 2019 - March 1, 2022
Built a comprehensive ETL platform for financial reporting using Azure Data Factory (ADF) Mapping Data Flows and Azure Data Lake Storage Gen2 (ADLS Gen2), applying Python parallel processing with Parquet and JSON file formats to handle 2TB daily loads and achieve a 50% faster monthly close through optimized SQL pipeline flows and performance tuning across structured and semi-structured sources. Developed Azure Databricks data processing workflows with Delta Lake ACID transactions and PySpark optimization, establishing automated data quality monitoring and Azure Synapse Analytics integration using dimensional modeling and star schema design that improved processing reliability by 75% while maintaining comprehensive SQL audit trails and data governance frameworks with Unity Catalog (preview) integration. Configured Azure Data Factory (ADF) Copy Activity and Lookup Activity with Azure Blob Storage integration and Python transformation logic, implementing Azure Key Vault for secure credential management.
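The incremental-load pattern behind pipelines like those above typically hinges on a persisted high-water mark. A minimal sketch, in plain Python in place of ADF/PySpark; the `modified_at` change-tracking column and both function names are hypothetical:

```python
def incremental_extract(source_rows, last_watermark):
    """Select only rows modified since the stored high-water mark.

    source_rows: iterable of dicts carrying a 'modified_at'
    change-tracking value (any comparable type, e.g. ISO-8601 strings).
    Returns (new_rows, new_watermark); the caller persists
    new_watermark so the next run resumes where this one stopped.
    """
    new_rows = [r for r in source_rows if r["modified_at"] > last_watermark]
    # Advance the watermark only if something new arrived.
    new_watermark = max((r["modified_at"] for r in new_rows),
                        default=last_watermark)
    return new_rows, new_watermark
```

In ADF the equivalent is a Lookup Activity reading the stored watermark, a Copy Activity filtered on the tracking column, and a final update of the watermark table.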
Big Data Engineer at Charter Communications
June 1, 2017 - October 1, 2019
Developed scalable Azure Data Factory (ADF) pipelines and orchestrated high-volume data ingestion from SAP HANA, Oracle, and on-prem systems into Azure Data Lake Storage Gen2, leveraging PySpark and Azure Databricks to process structured and semi-structured data efficiently across enterprise environments. Engineered and optimized Delta Lake architectures for batch and streaming workflows, improving data reliability and end-to-end processing efficiency through Unity Catalog governance, ACID-compliant transactions, and schema enforcement. Designed and tuned dimensional data models within Azure Synapse Analytics, implementing T-SQL performance optimizations and partitioning strategies that accelerated report generation and dashboard responsiveness by 60%, improving the experience for business stakeholders. Automated CI/CD deployment workflows via Azure DevOps, integrating Key Vault for secure secrets management and enforcing RBAC policies to ensure secure, compliant, and auditable data pipelines.
Data Warehouse Developer at CVS Health
Through June 1, 2017
Designed, developed, and maintained enterprise data warehouse solutions using SQL Server, Oracle, and Teradata, ensuring data integrity, HIPAA compliance, and optimized query performance to support strategic analytics across pharmacy, retail, and healthcare operations. Built and automated complex ETL pipelines using SSIS, Informatica PowerCenter, Python, and PySpark to efficiently ingest and process large volumes of structured and semi-structured data from clinical systems, claims data, and third-party providers. Developed dimensional models and implemented star/snowflake schemas, slowly changing dimensions (SCDs), and master data management (MDM) to improve patient data accuracy, longitudinal record consistency, and reporting reliability across business units. Optimized stored procedures, triggers, and T-SQL queries to enhance data retrieval speed for BI reporting, enabling proactive care insights, formulary adherence tracking, and operational efficiency dashboards. Implemented data governance policies and access controls to safeguard protected health information.
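The slowly changing dimension (Type 2) handling mentioned above follows a standard expire-and-insert pattern. A minimal stand-alone sketch, in plain Python rather than the SSIS/T-SQL used in the role; the row fields (`key`, `attrs`, `valid_from`, `valid_to`, `is_current`) are hypothetical:

```python
def apply_scd2(dimension, incoming, today):
    """Apply SCD Type 2 changes: expire the current row, insert a new one.

    dimension: list of dicts with 'key', 'attrs', 'valid_from',
    'valid_to', and 'is_current'. incoming: {key: attrs} from the
    source. A changed attribute set closes the current row
    (valid_to = today) and appends a fresh current row, so the full
    history of each business key is preserved.
    """
    current = {r["key"]: r for r in dimension if r["is_current"]}
    for key, attrs in incoming.items():
        row = current.get(key)
        if row is not None and row["attrs"] == attrs:
            continue  # unchanged: nothing to do
        if row is not None:
            row["is_current"] = False   # expire the old version
            row["valid_to"] = today
        # Insert the new current version (also covers brand-new keys).
        dimension.append({"key": key, "attrs": attrs,
                          "valid_from": today, "valid_to": None,
                          "is_current": True})
    return dimension
```

In a warehouse this is one MERGE (or an SSIS SCD transform) per dimension; the expire/insert semantics are exactly as sketched.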

Education

Master of Science in Computer Science at University at Albany - State University of New York
January 2013
Bachelor of Science in Computer Science at Osmania University, Hyderabad
January 2011

Qualifications

Microsoft Certified: Azure Data Engineer Associate
January 1, 2023 - November 6, 2025
Microsoft Certified: Azure Fundamentals (AZ-900)
Databricks Certified Data Engineer Professional
Databricks Certified Associate Developer for Apache Spark

Industry Experience

Financial Services, Healthcare, Professional Services, Software & Internet, Telecommunications