Sheshanth Reddy

Available to hire

I am a Data Engineer with over 7 years of experience designing and implementing cloud-based and on-premises data solutions. I specialize in Databricks, PySpark, and Kafka, and have a proven track record migrating warehouses to cloud-native platforms, building scalable batch and streaming pipelines, and delivering BI-ready datasets for analytics and ML.

I enjoy collaborating with platform and product teams to create data products, implementing governance and security, and enabling self-service analytics through dashboards and reporting. My work spans from data ingestion and transformations to data modeling (Dimensional, Data Vault, Star) and modern architectures such as Medallion, Delta Lake, and lakehouse patterns.

Language

English: Fluent

Work Experience

Data Engineer at Atlas Air
July 1, 2023 - Present
Worked on on-premises BI solutions using SSIS, SSRS, and Power BI to build ETL pipelines, operational reports, and dashboards. Migrated to Azure Cloud, adopting Azure Databricks to modernize pipelines and process large-scale semi-structured and unstructured data. Established Databricks as the central platform for big data processing, designing and optimizing workflows for Spark-based transformations and ML-ready pipelines. Developed and deployed PySpark jobs for batch ingestion from APIs, flat files, and RDBMS sources. Streamed ingestion from Kafka into Cosmos DB and ADLS, handling high-velocity JSON/NoSQL data for low-latency analytics. Implemented Azure Key Vault, RBAC, and Managed Identities to safeguard sensitive datasets across production pipelines. Built automated pipelines for schema and data deployments using GitHub Actions and GraphQL Inspector, with Git as the version control system for code management and backup. Delivered Power BI dashboards and reports by integrating with modernized cloud data sources.
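The batch-ingestion work described above typically hinges on an incremental high-watermark: each run loads only rows changed since the last successful load. A minimal sketch in plain Python (record shape and field names are illustrative, not from any actual Atlas Air pipeline):

```python
from datetime import datetime, timezone

def incremental_batch(records, last_watermark):
    """Filter a batch down to rows newer than the stored watermark
    and return the advanced watermark for the next run.

    records: iterable of dicts with an ISO-8601 'updated_at' field.
    last_watermark: datetime of the previous successful load.
    """
    new_rows = []
    max_seen = last_watermark
    for row in records:
        ts = datetime.fromisoformat(row["updated_at"])
        if ts > last_watermark:
            new_rows.append(row)
            if ts > max_seen:
                max_seen = ts
    return new_rows, max_seen

# Only the second row is newer than the stored watermark.
wm = datetime(2023, 7, 1, tzinfo=timezone.utc)
batch = [
    {"id": 1, "updated_at": "2023-06-30T12:00:00+00:00"},
    {"id": 2, "updated_at": "2023-07-02T08:30:00+00:00"},
]
rows, new_wm = incremental_batch(batch, wm)
```

In a real PySpark job the same filter would be pushed down to the source query, with the watermark persisted in a control table or checkpoint.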
Data Engineer at Molina Healthcare
August 1, 2021 - June 30, 2023
Migrated on-premises databases and data warehouses into the Databricks Lakehouse on GCP using Google Cloud Storage (GCS) as the landing zone, establishing a scalable and cloud-native foundation. Designed and developed robust PySpark pipelines in Databricks for data ingestion, cleansing, enrichment, and transformation, producing high-quality reusable datasets. Implemented near real-time streaming pipelines using Kafka and Spark Structured Streaming, later refactored with GCP Pub/Sub and Databricks Structured Streaming for cloud-native scalability. Optimized Databricks performance and cost efficiency by right-sizing clusters, enabling autoscaling, and tuning Spark jobs for large-scale batch and streaming workloads. Automated ETL workflows with Apache Airflow DAGs, ensuring reliability and reducing manual interventions in production pipelines. Built a centralized logging and monitoring framework using ELK Stack and GCP Stackdriver, enabling proactive detection and faster resolution of failures. Partnered with platform and product teams to deliver curated data products in the Databricks Lakehouse for downstream consumers.
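The streaming pipelines above follow the standard micro-batch model: read from an offset, transform, write to the sink, and only then commit the offset, which gives at-least-once delivery. A stand-in sketch in plain Python (the list-based queue and sink are illustrations, not the Kafka or Pub/Sub APIs):

```python
def transform(msg):
    # Stand-in for the cleansing/enrichment done in PySpark.
    return {"value": msg.upper()}

def process_stream(messages, start_offset, sink):
    """Process messages at-least-once from start_offset, advancing
    the committed offset only after each record reaches the sink."""
    offset = start_offset
    for msg in messages[start_offset:]:
        sink.append(transform(msg))  # durable write first
        offset += 1                  # then 'commit' the new offset
    return offset

# Resume from offset 1: records "b" and "c" are processed.
sink = []
next_offset = process_stream(["a", "b", "c"], 1, sink)
```

Structured Streaming handles the offset bookkeeping via its checkpoint directory, but the write-then-commit ordering is the same idea.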
Big Data Developer at Atom IT Services
June 1, 2019 - December 31, 2020
Engineered Spark-Scala ETL workflows to migrate enterprise data from Oracle to MySQL, reducing reporting latency and improving downstream analytics. Developed a real-time Spark Streaming application integrated with Cassandra, enabling live sales performance dashboards for business stakeholders. Automated ingestion pipelines from heterogeneous sources (flat files, APIs, RDBMS) using Sqoop and Python, ensuring schema consistency and faster processing. Implemented Informatica PowerCenter and Informatica IDQ pipelines for data governance, data quality checks, and metadata management, strengthening regulatory compliance. Orchestrated ETL workflows with Apache Airflow and maintained version control with Git, ensuring reliability and traceability across deployments.
ETL Engineer at Dhruvsoft Services
October 1, 2017 - May 31, 2019
Designed and developed SSIS and Informatica ETL packages to extract, transform, and load large datasets from heterogeneous sources into OLTP systems and Data Warehouses. Created incremental refresh strategies using partitioned tables and optimized stored procedures, improving ETL performance and reducing load times. Built advanced error handling and logging frameworks in SSIS and Informatica, improving reliability and reducing manual intervention during nightly jobs. Scheduled, monitored, and managed ETL workflows via SQL Server Agent, automating critical jobs and ensuring high availability. Wrote and optimized T-SQL queries, triggers, UDFs, and indexes, improving query execution for high-volume transactional and reporting systems. Developed and maintained SSRS reports, providing actionable insights and improving decision-making across business units.
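An incremental refresh of the kind described above often reduces to a T-SQL MERGE keyed on the business key: update matched rows, insert new ones. A small Python helper that renders such a statement (the table and column names are made up for illustration):

```python
def build_merge_sql(target, staging, key, cols):
    """Render a T-SQL MERGE that upserts staged rows into the target."""
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in cols)
    col_list = ", ".join([key] + cols)
    src_list = ", ".join(f"s.{c}" for c in [key] + cols)
    return (
        f"MERGE {target} AS t "
        f"USING {staging} AS s ON t.{key} = s.{key} "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({src_list});"
    )

sql = build_merge_sql("dbo.DimCustomer", "stg.Customer",
                      "CustomerID", ["Name", "City"])
```

In SSIS this would typically run in an Execute SQL Task after the staging load; partitioned targets keep the MERGE touching only recent data.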

Education

Bachelor of Technology in Computer Science at Jawaharlal Nehru Technological University, Hyderabad
Graduated May 1, 2017

Industry Experience

Software & Internet, Professional Services