Available to hire
Hi there, I’m Naresh, a Senior Data Engineer with 8 years of experience building real-time streaming data platforms on AWS, Azure, and Databricks. I design scalable pipelines using Kafka, PySpark/Scala, and Delta Lake to deliver timely analytics, improve SLAs, and enable accurate ML-driven forecasting.
I thrive in agile environments, own end-to-end solutions, and mentor teammates. I enjoy turning ambiguous data problems into repeatable, observable solutions that deliver measurable business outcomes.
Skills
Language
English
Advanced
Work Experience
Senior Data Engineer at 7-Eleven
July 1, 2024 - Present
Designed and implemented real-time data ingestion pipelines using Apache Kafka and AWS S3 to consolidate POS and IoT streams across 7-Eleven’s national network, delivering sub-minute data availability for inventory analytics. Built scalable ETL workflows in Apache Airflow to validate, cleanse, and enrich raw feeds, improving end-to-end data SLAs by 30%. Implemented Python-based feature computation (rolling 7-day sales velocity) and published features to a Feast store for low-latency ML scoring. Deployed containerized microservices on AWS ECS/Kubernetes and supported online forecasting models to reduce forecast error by 20%. Authored STTM mappings and established data lineage for auditability.
Data Engineer-Streaming & Real-Time at TC Energy
August 1, 2022 - May 1, 2024
Architected serverless real-time ingestion with Azure Event Hubs and IoT Hub for SCADA telemetry with sub-second latency. Built PySpark Structured Streaming pipelines on Databricks, enforcing schema validation and Delta Lake ACID guarantees on ADLS Gen2 to achieve 99.9% data accuracy. Orchestrated metadata-driven batch ETL in Azure Data Factory, loading historical maintenance logs into curated Delta Lake tables. Implemented anomaly detection with Spark MLlib and MLOps pipelines, and optimized Delta layouts (Z-Order/partitioning), reducing ad-hoc query latency by 60%. Coordinated Autosys job flows to manage ADF and Databricks runs, and integrated Purview for lineage.
Data Engineer-Fraud Detection at American Express (Amex)
November 1, 2020 - December 1, 2021
Contributed to cloud-native ingestion with Kafka (MSK) and Confluent Schema Registry; developed PySpark Structured Streaming jobs on Databricks to normalize and enrich streaming transaction data. Built batch ETL with Airflow and AWS Glue; integrated Feast feature store; supported end-to-end MLOps with Kubeflow/SageMaker. Optimized Delta Lake with Z-Order clustering; improved query performance by ~30%. Enhanced Oracle Exadata performance via compression & caching; achieved 40% faster heavy-report queries and sustained 10 TB/day throughput. Implemented Great Expectations checks and Prometheus/Grafana metrics; automated CI/CD with Terraform and GitHub Actions.
Data Engineer at Herbalife
June 1, 2018 - October 1, 2020
Set up Kafka topics and Confluent Schema Registry; built PySpark Structured Streaming jobs to normalize and enrich order and clickstream data. Supported nightly batch ETL with AWS Glue, loaded ERP orders and inventory into S3, and maintained the Delta Lake bronze layer. Orchestrated 150+ ETL workflows in Autosys with self-healing retries and SLA alerts. Built Informatica PowerCenter mappings and Terraform modules to provision MSK, S3, and IAM; used Feast for basic features; monitored data quality with Great Expectations.
Data Engineer at Amulya IT Solutions
June 1, 2017 - May 1, 2018
Assisted in setting up Kafka topics and with PySpark transformations in Databricks. Supported AWS Glue and EMR batch pipelines, wrote data cleansing scripts, and orchestrated jobs in Autosys. Built Informatica PowerCenter mappings, used Feast for basic features, contributed to Terraform modules, and documented data flows.
Industry Experience
Retail, Financial Services, Energy & Utilities, Software & Internet, Transportation & Logistics, Consumer Goods