

Bhagya Sree Akula

Data Analyst, Data Scientist, Full Stack Developer, +2






Available to hire

With 4+ years of data engineering experience, I design and optimize large-scale data solutions across AWS, Azure, and Snowflake. I specialize in building end-to-end ETL/ELT pipelines, real-time streaming platforms, and data lakehouse architectures using Python, SQL, Spark, Kafka, and Airflow.

I excel at data modeling, performance optimization, and reducing latency to deliver reliable datasets for analytics, BI, and ML workloads. I thrive in Agile environments, collaborating with cross-functional teams to deliver scalable data products that drive measurable business outcomes.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Work Experience

Data Engineer at Cisco Systems

December 1, 2024 - November 6, 2025

Built batch and real-time data pipelines in Python to ingest ~60M records weekly from APIs and log feeds into Snowflake and S3, enabling 5-minute telemetry analytics. Reduced ETL runtime from 8 hours to 90 minutes and cut Snowflake credits by 35% through incremental CDC loads using Snowpipe/Streams with partitioning and clustering. Engineered PySpark transformations and schema normalization on Databricks, improving downstream query performance by 40% and accelerating reporting for operations and customer success. Created reusable dbt models (staging/core/marts) with lineage, tests, and standardization, accelerating model rollout by 60%. Designed Snowflake RBAC, masking policies, and object tagging to meet compliance. Implemented data validation with Great Expectations, auto-remediating 1.2K anomalies per month and improving data integrity by 38%. Optimized BI workloads by tuning SQL (views, clustering, pruning), reducing BI query time from 14 minutes to under 5 minutes for monthly

Data Engineer at Citius Tech

July 1, 2023 - July 1, 2023

Orchestrated NiFi ingestion pipelines across REST, SFTP, and database sources with schema registry and back-pressure, processing 25K+ records per day and reducing manual effort by 72%. Built dbt staging layers and curated marts with tests and documentation, reducing schema drift incidents by 50% and accelerating cross-team development. Migrated 1.2 TB Hadoop/HDFS logs nightly to Amazon Redshift using Python with parallelized COPY operations, improving query speed by 35%. Optimized Redshift schema design (distribution/sort keys, time-based partitioning), cutting KPI query times from 14 minutes to under 5 minutes. Implemented data quality frameworks with Great Expectations, remediating 3,400+ anomalies and improving integrity across completeness, uniqueness, and referential checks. Developed AWS Glue ETL jobs in PySpark to unify clinical and insurance data, processing 3M+ records/day and reducing integration latency by 48%. Automated data lineage and observability with dbt exposures, NiFi provenanc

Data Engineer at Hexaware Technologies

October 1, 2021 - October 1, 2021

Migrated transactional logs from on-prem SQL Server to Azure Data Lake Storage Gen2 using Azure Data Factory and parallelized copy operations, moving 800M+ rows during weekend windows and reducing the ingestion window by 65%. Engineered PySpark jobs on Azure Databricks to aggregate 50 million user events daily, delivering bronze/silver/gold layers for downstream analytics and reporting. Implemented CDC pipelines with Kafka Connect (SQL Server source) and custom Python consumers, processing 2M change events per day with end-to-end latency under 5 minutes. Designed Parquet storage layouts with optimized partitioning and file sizing (target file sizes, compression), cutting BI query times by 42% and storage costs by 25%. Orchestrated nightly transformations using Apache Airflow (DAGs, retries, SLAs, task dependencies), coordinating 20+ tasks and delivering production datasets by 5 AM each business day. Enforced schema contracts via JSON specifications and Python unit tests (pydantic/p

Data Analyst at Hexaware Technologies

June 1, 2021 - June 1, 2021

Built SQL-based analytical reports and ad hoc queries on Azure SQL and Databricks SQL to track user behavior, funnel drop-offs, and product adoption trends. Developed interactive dashboards in Power BI for weekly executive reviews, visualizing KPIs such as active users, session frequency, latency SLAs, and incident counts. Performed data profiling, cohort analysis, and A/B test readouts; defined data quality acceptance criteria with stakeholders and formalized metric definitions in a shared data dictionary. Optimized BI queries through aggregation tables, materialized views, and partition pruning, reducing dashboard load times by 35%. Collaborated with product and operations teams to translate business questions into analytical datasets and scheduled insights to improve decision lead time for quarterly planning.