Pra Vallikamodu

Available to hire

I am a results-driven Data Engineer specializing in scalable, high-performance data pipelines on cloud platforms such as AWS, GCP, and Azure. Skilled in real-time streaming architectures built with Kafka, Spark, and Airflow, I automate ETL workflows while ensuring data governance, security, and compliance. With significant experience optimizing query performance and managing containerized environments, I deliver actionable insights that drive business growth.

Throughout my career, I have engineered and deployed real-time event-driven pipelines, modernized legacy batch ETL pipelines, and integrated hybrid cloud environments. I have collaborated closely with ML engineers, risk analytics, and BI teams to enable fraud detection, credit risk scoring, and operational reporting. My commitment to data quality, lineage, and audit readiness underpins enterprise-grade data solutions that meet strict regulatory standards like GDPR and SOX.

Work Experience

Data Engineer at Capital One
September 1, 2024 - Present
- Engineered and deployed real-time, event-driven data pipelines using Apache Kafka, Spark Structured Streaming, and Apache Airflow on AWS and Azure, supporting ingestion of over 25 million daily customer, payment, and fraud events into Snowflake and Azure Synapse.
- Modernized legacy batch ETL pipelines by migrating to Databricks on Azure Data Lake Storage Gen2 and Delta Lake, implementing ACID-compliant streaming frameworks that reduced data latency by 50% and pipeline failures by 70%.
- Integrated internal APIs, S3, Azure Blob Storage, and third-party data vendors using AWS Glue, Azure Data Factory, and Lambda functions to automate ingestion and transformation workflows across hybrid cloud environments.
- Designed high-performance data models and analytical layers in Snowflake and Synapse, including clustering keys, dynamic partitioning, and materialized views, to optimize analytical queries used by finance, credit policy, and compliance teams.
- Implemented Confluent Schema Registry and Kafka …
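As a plain-Python illustration of the dynamic, date-based partitioning mentioned above (a stand-in for what a Spark Structured Streaming sink would apply, not production code): the event fields, `event_ts` epoch format, and `s3://events` base path are all hypothetical.

```python
import json
from datetime import datetime, timezone

def partition_path(event_json: str, base: str = "s3://events") -> str:
    """Route a raw JSON event to a date-partitioned storage path.

    Hypothetical field names; sketches the partition layout a
    streaming sink might write, not an actual production schema.
    """
    event = json.loads(event_json)
    ts = datetime.fromtimestamp(event["event_ts"], tz=timezone.utc)
    return (
        f"{base}/type={event['event_type']}"
        f"/year={ts.year}/month={ts.month:02d}/day={ts.day:02d}"
    )

raw = json.dumps({"event_type": "payment", "event_ts": 1735689600})
print(partition_path(raw))  # s3://events/type=payment/year=2025/month=01/day=01
```

Partitioning by type and date like this is what lets downstream Snowflake or Synapse queries prune irrelevant data instead of scanning the full event history.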
Data Engineer at Accenture
June 1, 2023 - August 26, 2025
- Designed and developed scalable, real-time data pipelines using Apache Kafka, Apache Spark, and Airflow to ingest and process financial data, enabling real-time reporting and analytics in Snowflake for business stakeholders.
- Built ETL/ELT workflows using AWS Glue and Lambda for automating data integration from S3, RDS, and third-party APIs, improving processing efficiency by 30% and reducing manual intervention.
- Developed and maintained data pipelines using Databricks (PySpark), integrating AWS services such as EC2, S3, and Redshift to support analytics and operational reporting across business units.
- Improved query performance on Redshift by 25% through schema optimizations, materialized views, and tuning sort/distribution keys, enabling faster reporting and reduced load times.
- Collaborated with Data Science and BI teams to ensure data accuracy and consistency, delivering high-quality data sets for downstream analytics in Power BI and Tableau.
- Assisted in implementing data governance …
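The transform step of an ETL workflow like the ones above can be sketched in plain Python (a hypothetical stand-in for an AWS Glue job; the `txn_id`/`amount`/`source` column names are illustrative, not from any real schema):

```python
from typing import Iterable

def transform(rows: Iterable[dict]) -> list[dict]:
    """Normalize and deduplicate source records before loading.

    Minimal sketch of an ETL transform step: dedupe on a key,
    coerce types, and default missing fields. Illustrative only.
    """
    seen: set[str] = set()
    out: list[dict] = []
    for row in rows:
        key = row["txn_id"]
        if key in seen:  # drop duplicate transactions from replayed batches
            continue
        seen.add(key)
        out.append({
            "txn_id": key,
            "amount_usd": round(float(row["amount"]), 2),  # string -> float
            "source": row.get("source", "unknown").lower(),
        })
    return out

rows = [
    {"txn_id": "t1", "amount": "19.999", "source": "API"},
    {"txn_id": "t1", "amount": "19.999", "source": "API"},  # duplicate
    {"txn_id": "t2", "amount": "5"},                        # no source field
]
print(transform(rows))
```

Idempotent deduplication at the transform stage is one common way such workflows reduce manual intervention when upstream sources redeliver records.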
Data Engineer Intern at Informative Web Solutions
December 1, 2019 - August 26, 2025
- Assisted in designing and implementing data pipelines using Python, SQL, and Apache Airflow to automate ETL of structured and semi-structured data from APIs, CSVs, and relational databases.
- Worked closely with senior data engineers to build and test scalable data ingestion frameworks using AWS S3, Redshift, and Glue, supporting real-time and batch processing needs.
- Collaborated with cross-functional teams to understand data requirements and deliver clean, well-documented data sets for analytics and reporting teams.
- Participated in development of data models and schemas using Snowflake and PostgreSQL, supporting BI tools like Tableau and Power BI for downstream analysis.
- Ensured data quality and reliability through validation scripts and unit testing using Pytest and Great Expectations.
- Maintained version control and deployment pipelines using Git, GitHub Actions, and basic CI/CD workflows under guidance.
- Documented pipeline architecture, data dictionaries, and workflow steps to support …
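A validation script in the spirit of the data-quality checks above might look like this minimal sketch (plain Python rather than actual Great Expectations suites; the `customer_id`/`amount` rules are hypothetical examples of expectations):

```python
def validate_batch(records: list[dict]) -> list[str]:
    """Return a list of data-quality failures for a batch of records.

    Illustrative checks only: non-empty key field and a
    non-negative numeric amount. Field names are hypothetical.
    """
    failures: list[str] = []
    for i, rec in enumerate(records):
        if not rec.get("customer_id"):
            failures.append(f"row {i}: missing customer_id")
        amount = rec.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            failures.append(f"row {i}: invalid amount {amount!r}")
    return failures

good = {"customer_id": "c1", "amount": 10.5}
bad = {"customer_id": "", "amount": -3}
print(validate_batch([good, bad]))
```

Returning failures as data (rather than raising on the first error) makes such checks easy to wrap in Pytest assertions or surface in an Airflow task log.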

Education

Master's in Data Science at University at Buffalo
August 1, 2023 - February 1, 2025
Bachelor of Technology in Computer Science at G. Narayanamma Institute of Technology and Science
June 1, 2016 - September 1, 2020

Industry Experience

Financial Services, Software & Internet, Professional Services, Education, Healthcare