Ritwik Raj

AI Engineer, AI Strategy Consultant, Back-End Developer, +4





Available to hire

I am a data engineer and AI/LLM specialist with 6 years of experience designing, optimizing, and deploying large-scale data systems, real-time streaming architectures, and AI-driven solutions.

I have built production-grade data pipelines handling petabyte-scale data, engineered end-to-end pipelines with Spark, Kafka, Hadoop, SQL, and CI/CD, and led cloud-native infrastructure improvements that cut processing time and costs. I enjoy turning complex data challenges into scalable, business-impacting solutions.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Intermediate

Intermediate

Intermediate

Language

English

Advanced

Work Experience

Senior Data Engineer II at Apple

May 1, 2025 - May 1, 2025

Designed, implemented, and optimized production-grade data pipelines and real-time streaming architectures to support AI-driven analytics. Led end-to-end data processing with Kafka and Spark Streaming; integrated Apache Iceberg for scalable storage; leveraged AWS Glue for ETL; improved data reliability and system resilience. Drove cloud-native architecture and MLOps readiness, enabling scalable processing of petabyte-scale datasets, and collaborated closely with product teams.

Software Engineer - Data Platform at Ola Cabs

November 1, 2022 - November 1, 2022

Built microservices to transfer real-time data from Kafka and MySQL to a data lake (S3), deployed on Kubernetes for scalable and efficient data storage. Led a proof-of-concept to deploy Apache Pinot and Trino on a Kubernetes cluster, enabling sub-second query performance on high-throughput Kafka topic data and reducing analytics latency by 50%.

Cloud Data Engineer at Amazon Web Services

July 1, 2021 - July 1, 2021

Developed EMR-based data processing pipelines and debugging tools to efficiently analyze and troubleshoot jobs running on EMR clusters; reduced debugging time and improved system reliability. Leveraged AWS services (DMS, S3, EMR, Glue, Redshift, Athena) to streamline big data processing and built end-to-end ETL workflows in collaboration with data science teams.