Katta Rohini

Available to hire

I’m a developer who specializes in designing and building cloud-native data solutions on Azure. I help businesses create scalable ETL/ELT pipelines, transform complex datasets, and deliver analytics-ready platforms. My expertise includes Azure Data Factory, Databricks, PySpark, Delta Lake, SQL, and Snowflake, with strong experience in data governance, performance optimization, and workflow automation.

What sets me apart is my ability to handle end-to-end data engineering challenges — from ingestion and transformation to production-ready pipelines — with a focus on reliability, security, and cost-efficient cloud design. I have worked across the insurance, healthcare, and pharmaceutical domains, giving me the sector knowledge to understand your business needs quickly and deliver actionable data solutions.

Employment and Project Experience

Azure Data Engineer (Contract) – Intact Insurance, Canada
February 2025 – Present

Design and build ETL/ELT pipelines with ADF, Databricks, and PySpark

Process large-scale datasets efficiently and ensure data quality, security, and compliance

Implement CI/CD automation, workflow orchestration, and monitoring for production-ready pipelines

Senior Data Engineer – Altimetrik (Novartis)
February 2023 – February 2025

Developed distributed data processing systems using PySpark and Kafka

Delivered reusable, parameterized pipelines for batch and streaming workloads

Ensured data governance, security, and operational excellence across pipelines

Experience Level

Expert

Language

English
Fluent

Work Experience

Azure Data Engineer (Contract) at Intact Insurance
February 3, 2025 - Present
Design and implement cloud-native ETL/ELT pipelines in ADF, Databricks, and Microsoft Fabric to modernize legacy analytics platforms.
Build incremental ingestion pipelines from SQL Server and Oracle into Azure Data Lake Storage Gen2 using PySpark and Delta Lake (see the ingestion sketch after this entry).
Optimize Spark jobs through partitioning, caching, and broadcast joins, reducing runtime and compute costs.
Develop and enforce data quality checks (DQT) and SQL-based business rules to keep datasets accurate and compliant.
Manage Azure Data Lake Storage Gen2 environments, including partitioning, tiered storage, and retention policies, for efficient retrieval of large volumes of structured and unstructured data.
Orchestrate end-to-end workflows in Azure Data Factory, enabling seamless data movement and transformation between sources and destinations.
Implement pipelines that ingest real-time data streams into Azure Data Lake Storage, keeping analytics data current and accurate.
Develop and optimize Spark jobs in Python, Java, and Scala on Azure Databricks, processing large-scale datasets with significant performance gains.
Write complex Spark SQL queries for data exploration, aggregation, and pattern recognition that support data-driven decisions.
Design and execute ETL processes with Azure Data Factory, Python, and Spark, transforming raw data into structured formats for analysis and reporting.
Use Azure DevOps for version control and continuous integration, streamlining deployments and keeping code consistent across environments.
Develop, test, and optimize Spark-based processing in Databricks notebooks, scaling Databricks clusters dynamically to match workload demands.
Design and tune complex SQL queries with window functions, subqueries, and joins to extract insights from relational databases.
Build Spark Streaming applications with micro-batch processing logic to analyze continuous Kafka streams and deliver near-real-time insights.
Collaborate with architects and stakeholders to enhance enterprise cloud data architecture for high availability and compliance.
Automate workflow orchestration and monitor pipeline execution using Airflow and Databricks Jobs.
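As a rough illustration of the incremental Delta Lake ingestion pattern named above, here is a minimal PySpark sketch. The JDBC connection string, table and column names, watermark value, and partition column are hypothetical placeholders; in practice an ADF-triggered Databricks job would supply them.

```python
# Minimal sketch: incremental upsert from a JDBC source into a Delta table.
# Connection details, table names, and the `updated_at` watermark column are
# hypothetical placeholders, not the actual Intact pipeline.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("incremental-ingest").getOrCreate()

TARGET_PATH = "abfss://curated@examplelake.dfs.core.windows.net/policies"  # placeholder

# 1. Read only rows changed since the last successful run (watermark-based).
last_watermark = "2025-06-01 00:00:00"  # in practice, read from a control table
changes = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://src-host:1433;databaseName=ins")  # placeholder
    .option("dbtable", f"(SELECT * FROM dbo.policies WHERE updated_at > '{last_watermark}') src")
    .load()
)

# 2. MERGE the changes into the Delta target: update matches, insert new rows.
if DeltaTable.isDeltaTable(spark, TARGET_PATH):
    (
        DeltaTable.forPath(spark, TARGET_PATH).alias("t")
        .merge(changes.alias("s"), "t.policy_id = s.policy_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
else:
    # First run: bootstrap the table, partitioned for downstream query pruning.
    changes.write.format("delta").partitionBy("effective_year").save(TARGET_PATH)
```

The MERGE keeps the load idempotent: re-running the same window updates existing rows instead of duplicating them, which is what makes incremental ingestion safe to retry.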
Senior Data Engineer at Altimetrik (Novartis)
February 1, 2023 - February 1, 2025
Developed scalable ADF pipelines to ingest data from SAP, RDBMS, and AWS S3 into Azure Databricks and Snowflake.
Created parameterized, reusable ETL/ELT pipelines to standardize transformation workflows (see the config-driven sketch after this entry).
Built PySpark scripts for data cleansing, transformation, and aggregation for ESG and sustainability analytics.
Delivered curated datasets to Power BI dashboards supporting ESG reporting and KPIs.
Applied Spark optimization techniques, including partitioning and caching, to improve pipeline performance.
Leveraged Azure Databricks as the core analytics engine, using PySpark for distributed transformation and analysis of large-scale datasets.
Integrated Azure Data Lake Storage as the primary data repository, optimizing storage and retrieval while keeping data management secure.
Implemented Azure Data Factory to orchestrate and automate ETL processes, ensuring seamless data movement between sources and destinations.
Incorporated Apache Kafka for real-time streaming, letting the platform handle high-throughput, low-latency data streams.
Leveraged Azure Synapse Analytics (formerly Azure SQL Data Warehouse) for scalable, performant warehousing of structured and unstructured data.
Implemented serverless computing with Azure Functions for targeted processing tasks, optimizing resource utilization and reducing operational overhead.
Built interactive Power BI dashboards and reports that give stakeholders real-time insights from processed data.
Integrated streaming data via Kafka and batch ingestion via Sqoop from Oracle and PostgreSQL.
Collaborated with cross-functional global teams to finalize data models and architecture.
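One common way to make pipelines parameterized and reusable, as described above, is a single config-driven PySpark job invoked with different arguments per source. The sketch below is hypothetical: the argument names, standard transformations, and paths are illustrative, not the actual Novartis pipelines.

```python
# Minimal sketch: one generic PySpark job reused across sources via parameters.
# Argument names and paths are hypothetical; an ADF pipeline or Databricks Job
# would pass them per run.
import argparse
from pyspark.sql import SparkSession, functions as F

def run(source_path: str, target_path: str, file_format: str, dedupe_keys: list) -> None:
    spark = SparkSession.builder.appName("generic-etl").getOrCreate()

    df = spark.read.format(file_format).option("header", "true").load(source_path)

    # Standard transformations applied uniformly to every source:
    # normalize column names, trim string columns, drop duplicate business keys.
    for c in df.columns:
        df = df.withColumnRenamed(c, c.strip().lower().replace(" ", "_"))
    for field in df.schema.fields:
        if field.dataType.simpleString() == "string":
            df = df.withColumn(field.name, F.trim(F.col(field.name)))
    df = df.dropDuplicates(dedupe_keys)

    df.write.format("delta").mode("overwrite").save(target_path)

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--source-path", required=True)
    p.add_argument("--target-path", required=True)
    p.add_argument("--file-format", default="parquet")
    p.add_argument("--dedupe-keys", nargs="+", required=True)
    a = p.parse_args()
    run(a.source_path, a.target_path, a.file_format, a.dedupe_keys)
```

A hypothetical invocation: `spark-submit generic_etl.py --source-path /raw/sap/materials --target-path /curated/materials --dedupe-keys material_id`.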
Big Data Developer at Carelon Global Solutions (Elevance Health)
January 14, 2019 - January 31, 2023
Developed high-volume Spark/Scala ETL pipelines processing millions of healthcare records across Hadoop and Azure, maintaining compliance with healthcare standards.
Designed, partitioned, and optimized Hive tables and Impala queries, improving query performance by ~40%.
Implemented Sqoop-based incremental and full loads from Oracle and SQL Server into Hadoop.
Built HBase integrations to support low-latency read/write analytics and operational workloads.
Automated complex workflows using Airflow and Oozie, including alerts, retries, and dependency management (see the DAG sketch after this entry).
Designed and deployed ADF pipelines for hybrid (on-prem → ADLS) data ingestion and scheduled transformations.
Developed scalable Azure Databricks notebooks (PySpark/Scala) for cleansing, transformation, and curation.
Implemented data quality rules and validation frameworks, improving data accuracy by 28%.
Used Azure DevOps for Git-based versioning, CI/CD automation, and release management.
Performed Spark performance tuning (caching, coalescing, shuffle reduction), reducing run times and compute costs.
Managed HDFS and ADLS storage, including ACLs, RBAC, partition strategies, and schema evolution.
Provided L2/L3 production support for Hadoop, ADF, and Databricks, ensuring SLA and uptime compliance.
Prepared clear technical documentation covering ETL logic, lineage, workflows, and architecture diagrams.
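The Airflow automation with alerts, retries, and dependency management mentioned above can be sketched roughly as follows; the DAG id, schedule, task bodies, and alert address are all hypothetical.

```python
# Minimal sketch of an Airflow DAG with retries, failure alerts, and explicit
# task dependencies. All names, schedules, and addresses are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

default_args = {
    "owner": "data-eng",
    "retries": 3,                          # retry transient failures
    "retry_delay": timedelta(minutes=10),  # back off between attempts
    "email": ["data-alerts@example.com"],  # hypothetical alert address
    "email_on_failure": True,
}

def extract(): ...    # placeholder: Sqoop/ADF-triggered extract
def transform(): ...  # placeholder: Spark transformation
def load(): ...       # placeholder: load into Hive/ADLS

with DAG(
    dag_id="claims_daily",                 # hypothetical
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Dependency management: extract must finish before transform, then load.
    t_extract >> t_transform >> t_load
```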
Big Data Developer at Delta Tech Services
May 1, 2017 - December 1, 2018
Developed Spark batch processing jobs for data transformation and built Kafka Streams applications for real-time data enrichment.
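Kafka Streams itself is a Java library; to keep the examples in PySpark, the stack used throughout this profile, here is an analogous Spark Structured Streaming sketch of the same real-time enrichment idea: joining a live Kafka stream against a static reference table. The broker address, topic, schema, and paths are hypothetical.

```python
# Minimal sketch: enrich a live Kafka stream with a static reference table
# using Spark Structured Streaming (an analogue of Kafka Streams enrichment).
# Broker, topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("stream-enrich").getOrCreate()

event_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
])

# Static dimension table used to enrich each incoming event.
customers = spark.read.format("delta").load("/tables/customers")  # placeholder

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
    .option("subscribe", "transactions")               # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Stream-static join: each micro-batch of events picks up customer attributes.
enriched = events.join(customers, "customer_id", "left")

query = (
    enriched.writeStream.format("delta")
    .option("checkpointLocation", "/chk/stream-enrich")  # enables recovery
    .outputMode("append")
    .start("/tables/transactions_enriched")
)
query.awaitTermination()
```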

Education

Bachelor of Technology (B.Tech), Computer Science & Engineering at JNTUH
June 6, 2013 - April 10, 2017

Qualifications

PAHM – Professional, Academy for Healthcare Management

Industry Experience

Healthcare, Software & Internet, Other, Professional Services
Azure Data Engineering – Cloud-Native ETL & Data Pipelines

Designed and built cloud-native ETL/ELT pipelines for Intact Insurance using Azure Data Factory, Azure Databricks, PySpark, SQL Server, Snowflake, and Delta Lake. The project focused on processing large-scale insurance datasets efficiently and reliably while ensuring data quality, governance, and security.

Key highlights:

Developed PySpark backend jobs for high-volume, low-latency data processing, improving pipeline performance by ~30%.

Built orchestrated workflows using ADF and Airflow with retries, monitoring, and alerts for production-grade reliability.

Integrated data from APIs, relational databases (MySQL, SQL Server), and event-driven sources, ensuring consistency and secure access.

Applied data governance and role-based access controls using Unity Catalog and Azure security features.

Automated CI/CD deployment using Git and Azure DevOps for repeatable, safe pipeline delivery.

Supported identity-adjacent data flows, including user onboarding events, access-controlled datasets, and audit logging for compliance.

This project delivered robust, production-ready data solutions that enabled analytics, reporting, and business insights across insurance operations.
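To make the data quality emphasis above concrete, here is a minimal PySpark sketch of rule-based checks expressed as SQL predicates. The table path, rule names, and thresholds are hypothetical examples, not the project's actual rules.

```python
# Minimal sketch: rule-based data quality checks over a curated Delta table.
# The table path, rules, and columns are hypothetical examples.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.format("delta").load("/tables/policies_curated")  # placeholder

# Each rule is a SQL predicate every valid row must satisfy.
rules = {
    "policy_id_not_null": "policy_id IS NOT NULL",
    "premium_positive": "premium IS NOT NULL AND premium > 0",
    "valid_status": "status IN ('ACTIVE', 'LAPSED', 'CANCELLED')",
}

total = df.count()
failures = {}
for name, predicate in rules.items():
    bad = df.filter(f"NOT ({predicate})").count()
    failures[name] = bad
    print(f"{name}: {bad}/{total} rows failed")

# Fail the run if any rule is violated so bad data never reaches reporting;
# a quarantine write could replace the exception in production.
if any(failures.values()):
    raise ValueError(f"Data quality checks failed: {failures}")
```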
