Sai Ram Vulisetti

Available to hire

Experience Level

Expert

Work Experience

Big Data Engineer at IBM/Truist Bank
November 1, 2024 - October 1, 2025
- Built and operated large-scale data pipelines on AWS using S3, Glue, EMR, Spark, and Snowflake, processing 2–3 million records daily and improving analytics data readiness by nearly 30%.
- Developed batch and incremental ETL workflows using PySpark, Informatica IICS, and SSIS, integrating data across Oracle, Teradata, HDFS, and Snowflake while stabilizing Autosys job schedules.
- Designed scalable dimensional and relational data models in Snowflake and Redshift using star and snowflake schemas, improving dashboard query performance by 35% for business and risk analytics teams.
- Wrote optimized SQL, T-SQL, PL/SQL, and Python scripts to support high-volume loads of 50–80 GB per cycle with SQL Loader, Teradata FastLoad, and MultiLoad.
- Tuned Spark and Informatica workloads through partitioning, lookup optimization, and parallel processing, shrinking ETL processing windows by 25–30%.
- Implemented data quality checks, SCD/CDC logic, and monitoring using CloudWatch.
Data Engineer at Capital One
February 1, 2024 - October 1, 2024
- Architected and maintained AWS-based data pipelines using Python, Spark, and Kafka to process 4–6 TB of daily transactions, delivering cleaner data to risk and fraud teams within minutes.
- Migrated ELT pipelines to Snowflake and S3 with Python and Spark, reducing manual prep time by 25% and saving 50+ hours weekly.
- Collaborated with data scientists and platform engineers to deploy ML-driven fraud features, contributing to 8–12% accuracy improvements.
- Implemented data profiling and quality checks across 300+ tables to detect schema changes and sensitive fields, supporting audits and AML compliance.
- Integrated lineage and alerting to improve reliability and reduce operational noise.
Data Engineer at Nanthealth
August 1, 2023 - November 1, 2023
- Coordinated with DBAs to validate new tables and metadata across DB2, SQL Server, and Oracle, enabling smooth migrations to AWS Aurora and reducing release issues by 20%.
- Migrated legacy workloads to AWS by configuring EC2, S3, RDS, and Glue jobs; automated ingestion with Lambda, Kinesis, and SQS, cutting manual effort by ~35%.
- Built PySpark, DBT, and Snowflake pipelines (SnowPipe/Streams) for JSON, CSV, and Parquet files, improving daily loads by 30%.
- Designed and tuned Qlik and QuickSight dashboards, managing refreshes and addressing performance to cut load times ~25%.
- Developed Python scripts and API utilities to move data between S3 and SQL Server, and supported data modeling in ERwin.
Data Engineer at Amazon Development Centre
August 1, 2020 - December 1, 2021
- Built and coordinated data pipelines across AWS Glue, EMR, Athena, Redshift, and S3, processing nearly 2–3 TB of daily data and boosting data availability for analytics teams by 30%.
- Created ELT workflows using dbt, PySpark, and SQL to load and transform datasets into clean dimensional models, reducing downstream report refresh times by 25–28%.
- Set up CI/CD automation with Jenkins, GitHub, Terraform, and Docker, reducing manual deployment effort by 40% and enabling pipeline changes in hours.
- Migrated older ETL infrastructure to cloud-based AWS services, resulting in 35% faster ETL processing and 15% cost reductions.
- Built Talend, Airflow, and PySpark jobs to load millions of records into S3 and Redshift, adding quality checks and CloudWatch monitoring that cut data issues by 15%.
SQL Developer at Byju’s Think & Learn Pvt Ltd
December 1, 2018 - May 1, 2020
- Managed daily SQL and PL/SQL development for student, course, and content data used by analytics teams and 1,500+ internal users.
- Structured relational models and optimized queries, indexes, and stored procedures, reducing average report load times by 25–35%.
- Built ETL workflows with SSIS, Informatica IICS, and Teradata tools to move and validate several million records per day across Oracle, SQL Server, AWS, and HDFS.
- Automated recurring data loads and quality checks using SQL Loader, UNIX shell scripts, and Autosys, cutting manual intervention by 40% and improving refresh consistency.
- Initiated schema documentation, SCD logic, and CDC updates, collaborating with analysts and developers to resolve data gaps and improve KPI reporting by ~18%.

Education

Master of Science in Business Analytics at University of Texas at Dallas, Richardson, TX, USA
December 1, 2023
Bachelor of Technology in Mechanical Engineering at VIT University, Vellore, Tamil Nadu, India
May 1, 2017

Qualifications

AWS Certified Cloud Practitioner
January 29, 2026

Industry Experience

Financial Services, Healthcare, Education, Professional Services, Software & Internet