I'm Narayan Reddy, a Senior Data Engineer with around 5 years of experience on the AWS and Azure cloud platforms, specializing in big data engineering with Spark, PySpark, and Python. I design and build scalable ETL pipelines and real-time data workflows using technologies such as Kafka, Flink, AWS Glue, Azure Data Factory, and Databricks. My expertise includes building data lakes on Amazon S3 and Azure Data Lake, optimizing Spark jobs, and automating workflows with Apache Airflow and AWS Step Functions. I am skilled in managing Hadoop clusters, migrating legacy systems to the cloud, and ensuring data quality and security. Collaborating with cross-functional teams, I design data products and analytics-ready pipelines, migrate on-premise workloads to the cloud, and implement governance, data quality, and security controls. I thrive in fast-paced environments and enjoy tackling complex data challenges and delivering robust, production-ready solutions that power analytics and business insights.

Narayan Reddy

Available to hire



Language

English
Fluent

Work Experience

Sr Java Full Stack Developer at Walmart
April 1, 2023 - October 31, 2025
Developed and deployed Spring Boot microservices with Spring MVC, JPA/Hibernate, and DAO layers for scalable enterprise applications. Designed and implemented RESTful APIs for SPAs and backend systems. Secured APIs with Spring Security (JWT). Integrated SAP Services via APIGEE for load balancing and secure communication. Implemented Big Data pipelines with Hadoop, Spark, Flink, and Storm for real-time and batch processing. Built persistence layers using Spring JPA, Hibernate, MongoDB, PostgreSQL, and Hadoop ecosystem. Designed ETL processes and automated data ingestion using Spring, Hadoop, and Python REST templates. Deployed microservice-based cloud architecture on AWS and Cloud Foundry. Configured centralized configuration via Spring Config Server. Set up Splunk for centralized logging and monitoring. Automated CI/CD pipelines with GitHub, Jenkins, and Docker; packaged and deployed EARs. Documented REST services with Swagger/OpenAPI. Followed Agile methodologies with JIRA.
Full Stack Java Developer at Scotia Bank
March 1, 2023 - March 1, 2023
Created interactive web applications using HTML, CSS, JavaScript, and React.js with dynamic UI components. Implemented REST APIs to fetch and transform data for UI display. Designed a customer callback REST API and integrated Angular Router features for multi-view navigation. Built backend services with Spring Framework (Spring MVC, Spring Boot) and DAO modules. Developed Microservices with Spring Boot and asynchronous communication via Amazon SQS. Integrated SOAP/WSDL web services. Containerized with Docker Compose and managed deployments using Kubernetes. Implemented CI/CD with Jenkins, Docker, and AWS. Built persistence with Hibernate and worked with MongoDB and Oracle. Engaged with Big Data tools like Hadoop, Spark, Flink, and Storm for distributed processing. Implemented logging with Log4j and maintained builds with Maven. Collaborated in an Agile/Scrum environment using JIRA.
Java Developer at Atlassian
January 1, 2022 - January 1, 2022
Developed front-end and back-end Java applications, created Spring-based microservices, and migrated legacy systems to modern Spring Boot architectures. Built REST APIs and integrated with external services. Used Jira for Agile planning and collaboration.
Java Developer at Mahindra Finance
December 1, 2017 - December 1, 2017
Designed and developed persistence layers with Hibernate mapped to Oracle databases. Built multi-threaded Java applications for high-performance requirements. Implemented core J2EE components (Servlets, JSP, EJB) and DAO patterns. Developed UI with HTML/CSS/JS, implemented SOAP-based web services, and authored PL/SQL stored procedures/triggers. Created and tested Java-based modules using JUnit; managed builds with Maven.
Big Data Developer at J.D. Power
May 1, 2024 - October 31, 2025
Collaborated with Business Analysts and DBAs to gather requirements and support testing and project coordination for the Claims DataMart. Performed data manipulation, cleansing, and profiling on source systems to ensure data readiness and quality. Wrote complex SQL for data extraction and loading, designed proof-of-concept Hadoop-based applications, and migrated on-premises data to AWS Redshift via Glue and S3. Built high-performance ETL pipelines using PySpark and Spark SQL on Azure HDInsight; exposed RESTful APIs with Spring Boot; integrated backend services with dashboards for real-time monitoring. Optimized Spark performance with caching and partitioning; implemented data validation frameworks and designed monitoring dashboards for end-to-end visibility; designed data products and analytics-ready pipelines; explored GCP feasibility for global pipelines; implemented data security and IAM controls.
Data Engineer at DXC Technology
December 31, 2022 - December 31, 2022
Developed and maintained Spark pipelines ingesting sensor data from Apache Kafka for predictive maintenance; implemented Delta Lake on Azure Databricks to ensure ACID transactions, schema enforcement, and time travel. Served ML models in live pipelines and persisted predictions in Cassandra for downstream analytics; built streaming workloads with Spark Streaming; deployed Spring Boot microservices to trigger Spark jobs on Databricks. Designed backend APIs for results retrieval and integrated with Azure DevOps for CI/CD. Migrated on-prem ETL pipelines to Azure Data Factory and Databricks; created end-to-end CI/CD and deployed notebooks on AKS. Used the ELK stack for log debugging and visualized outcomes in Power BI/Tableau. Implemented data lake governance using ADLS Gen2 RBAC, Key Vault, and security best practices; supported cloud migration from on-prem to AWS with Boto3 and Terraform (in progress).
Data Engineer at Allica Bank, India
June 30, 2020 - June 30, 2020
Designed and developed scalable data pipelines using Apache Spark, Hive, Impala, and HBase to ingest and process customer behavioral and financial data. Built batch and real-time ingestion pipelines with Kafka, Sqoop, and NiFi; automated daily loads with Oozie; migrated Hive/SQL pipelines to Azure Synapse for near real-time reporting. Implemented ML pipelines in PySpark and Python; migrated on-prem Hive/SQL transformations to Spark DataFrames; created Java-based ETL utilities; managed data governance, metadata management, and security; developed API-based data services and dashboards with Power BI/Tableau; automated Terraform provisioning and CI/CD with GitHub Actions. Migrated data landscapes to AWS S3 and Redshift; integrated ADLS Gen2 with ADF/Databricks; containerized microservices on AKS.

Education

Bachelor of Engineering – Computer Science at SVCE College, Bangalore
January 1, 2012 - January 1, 2016
Post Graduation Certificate at Fanshawe College, London
January 1, 2024 - January 1, 2024
Bachelor of Technology, Computer Science at GITAM University, Hyderabad
January 1, 2020 - January 1, 2020


Industry Experience

Software & Internet, Professional Services, Financial Services, Education