Naresh

Available to hire

Hi there, I’m Naresh, a Senior Data Engineer with 8 years of experience building real-time streaming data platforms on AWS, Azure, and Databricks. I design scalable pipelines using Kafka, PySpark/Scala, and Delta Lake to deliver timely analytics, improve SLAs, and enable accurate ML-driven forecasting.

I thrive in agile environments, own end-to-end solutions, and mentor teammates. I enjoy turning ambiguous data problems into repeatable, observable solutions that deliver measurable business outcomes.

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate

Language

English
Advanced

Work Experience

Senior Data Engineer at 7-Eleven
July 1, 2024 - Present
Designed and implemented real-time data ingestion pipelines using Apache Kafka and AWS S3 to consolidate point-of-sale (POS) and IoT streams across 7-Eleven’s national network, delivering sub-minute data availability for inventory analytics. Built scalable ETL workflows in Apache Airflow to validate, cleanse, and enrich raw feeds, improving end-to-end data SLAs by 30%. Implemented Python-based feature computation (rolling 7-day sales velocity) and published features to a Feast store for low-latency ML scoring. Deployed containerized microservices on AWS ECS/Kubernetes and supported online forecasting models, reducing forecast error by 20%. Authored source-to-target mapping (STTM) documents and established data lineage for auditability.
Data Engineer, Streaming & Real-Time at TC Energy
August 1, 2022 - May 1, 2024
Architected serverless real-time ingestion with Azure Event Hubs and IoT Hub for SCADA telemetry with sub-second latency. Built PySpark Structured Streaming pipelines on Databricks, enforcing schema validation and relying on Delta Lake ACID transactions on ADLS Gen2 to achieve 99.9% data accuracy. Orchestrated metadata-driven batch ETL in Azure Data Factory (ADF), loading historical maintenance logs into curated Delta Lake tables. Implemented anomaly detection with Spark MLlib and MLOps pipelines. Optimized Delta layouts (Z-Order and partitioning), reducing ad-hoc query latency by 60%. Coordinated Autosys job flows to manage ADF and Databricks runs, and integrated Purview for lineage.
Data Engineer, Fraud Detection at American Express (Amex)
November 1, 2020 - December 1, 2021
Contributed to cloud-native ingestion with Kafka (MSK) and Confluent Schema Registry, and developed PySpark Structured Streaming jobs on Databricks to normalize and enrich streaming transaction data. Built batch ETL with Airflow and AWS Glue, integrated a Feast feature store, and supported end-to-end MLOps with Kubeflow and SageMaker. Optimized Delta Lake tables with Z-Order clustering, improving query performance by ~30%. Enhanced Oracle Exadata performance via compression and caching, achieving 40% faster heavy-report queries and sustaining 10 TB/day throughput. Implemented Great Expectations checks and Prometheus/Grafana metrics, and automated CI/CD with Terraform and GitHub Actions.
Data Engineer at Herbalife
June 1, 2018 - October 1, 2020
Set up Kafka topics and Confluent Schema Registry, and built PySpark Structured Streaming jobs to normalize and enrich order and clickstream data. Supported nightly batch ETL with AWS Glue, loaded ERP orders and inventory into S3, and maintained the Delta Lake bronze layer. Orchestrated 150+ ETL workflows in Autosys with self-healing retries and SLA alerts. Built Informatica PowerCenter mappings, used Feast for basic features, monitored data quality with Great Expectations, and used Terraform modules to provision MSK, S3, and IAM.
Data Engineer at Amulya IT Solutions
June 1, 2017 - May 1, 2018
Assisted in setting up Kafka topics and worked on PySpark transformations in Databricks. Supported AWS Glue and EMR batch pipelines, wrote data cleansing scripts, and orchestrated jobs with Autosys. Built Informatica PowerCenter mappings, used Feast for basic features, contributed to Terraform modules, and documented data flows.

Industry Experience

Retail, Financial Services, Energy & Utilities, Software & Internet, Transportation & Logistics, Consumer Goods