I am Vishnu Varma, a data engineer with 5+ years of experience designing, building, and managing enterprise-scale data pipelines and analytics platforms. I specialize in ETL/ELT, data modeling, real-time processing, and dashboard delivery, working with Hadoop, Spark, PySpark, SQL, Python, Airflow, Snowflake, Databricks, and cloud services on AWS, Azure, and GCP. I enjoy translating complex data into actionable insights and building reliable data frameworks that scale with business needs. I have hands-on experience with streaming data, CI/CD automation, and delivering interactive dashboards for regulatory, risk, and enterprise reporting. I thrive in fast-paced environments and enjoy collaborating with cross-functional teams to drive data-driven decision-making.

Vishnu Varma

Available to hire

Experience Level

Expert

Work Experience

Data Engineer at BNY Mellon
October 1, 2023 - October 31, 2025
Designed and managed enterprise-scale data lake and real-time analytics systems for global financial data. Developed Python-Spark applications to process data from RDBMS and streaming sources (Kafka, Kinesis). Configured Snowpipe to load real-time data from S3 into Snowflake with under 5-minute latency. Built scalable AWS Glue ETL pipelines to ingest, transform, and load high-volume structured and unstructured data into Redshift. Automated Glue ETL jobs ingesting 50M+ records/hour, reducing manual intervention by 80%. Used Spark Streaming APIs for real-time actions; stored results in DynamoDB and Snowflake. Integrated CodeStar and CodeCommit for version control; automated deployment with Jenkins and Ansible. Enabled micro-batching to ingest millions of files from S3 staging to Snowflake. Built Python scripts to process CSV, JSON, and Parquet files from S3 and store in DynamoDB/Snowflake. Monitored ETL jobs and user activity with CloudWatch and CloudTrail. Built Alteryx workflows and Ta
Data Engineer at Kogentix
December 1, 2022 - December 1, 2022
Built cloud-native data pipelines and analytics solutions using Azure Data Factory, Spark SQL, and Data Lake Analytics. Orchestrated Azure Data Factory and Databricks workflows processing 1.5 TB of batch data and 200M streaming events/day. Migrated 5 TB of on-prem SQL Server data to Azure Synapse and Azure SQL DB, applying transformations with PySpark and finishing 30% ahead of schedule. Used Kafka and Cassandra for distributed data processing and streaming integration. Created REST APIs in ADF for seamless integration between systems. Stored structured/semi-structured data in Parquet/Avro formats to improve query performance. Automated workflows with ADF scheduling and triggers; integrated Git and CI/CD for DevOps practices. Delivered actionable insights using Power BI integrated with ADF pipelines. Developed reusable data products for schema validation, complex transformations, and multi-port outputs (ADLS/SQL). Imported data into Sy
Data Engineer / Production Support at Aurobindo Pharma
August 1, 2021 - August 1, 2021
Developed and supported data infrastructure using AWS, Spark, SQL Server, and ETL tools for pharma analytics. Migrated legacy media data to Wide Orbit using AWS Redshift and custom SQL mappings. Designed dimensional models (Kimball) with facts, dimensions, and referential constraints. Built SSIS ETL flows to move data from FTP and flat files to S3 and Redshift. Converted Informatica ETL to SSIS with dynamic control/script tasks. Created Tableau dashboards with parameters/actions and SSAS cubes with MDX calculations. Delivered Power BI reports using Power Pivot & Power View. Migrated 10 million+ legacy media records to AWS Redshift via SSIS, cutting nightly ETL runtimes by 60%. Slashed report generation time from 5 hours to 30 minutes with Python scripts. Cleaned, transformed, and validated over 75 million transaction and EHR records using SQL and Python, reducing downstream data-error rates from 4% to 0.5%.

Education

Masters in Data Analytics at Indiana Wesleyan University
January 11, 2030 - October 31, 2025

Industry Experience

Financial Services, Healthcare, Professional Services