Manish Reddy Addumamilla

Available to hire

I am a Sr. Data Engineer with 10+ years of experience designing, building, and optimizing enterprise-scale data platforms across GCP, AWS, and Azure. I specialize in data engineering, ETL/ELT pipelines, and data warehousing using Snowflake, BigQuery, Redshift, and Databricks to empower large-scale analytics. I am proficient in Python, PySpark, and SQL, and I design automated, high-performance data workflows that deliver analytics-ready pipelines while maintaining governance and security.

I have hands-on experience with the Hadoop ecosystem, streaming architectures with Kafka and Spark Structured Streaming, and modern lakehouse patterns with Delta Lake. I routinely lead cloud migrations, implement data quality and observability with tools like Great Expectations, Unity Catalog, DataHub, and OpenLineage, and collaborate with cross-functional teams using Agile/Scrum. I enjoy building reliable, scalable, and compliant data ecosystems that enable BI, AI, and data-driven decision making.

Language

English
Fluent

Work Experience

Senior GCP Data Engineer at Vanguard
March 1, 2024 - Present
Led and maintained BigQuery-based data pipelines processing 200+ TB of data; optimized query performance and storage costs through partitioning and clustering; engineered end-to-end ETL frameworks using Python, Apache Beam, and Spark; implemented Infrastructure as Code with Terraform to automate provisioning of GCP resources; led migration initiatives from legacy Hadoop to modern GCP architectures; developed reusable data-validation frameworks and serverless ETL workflows with Dataflow, Pub/Sub, and Cloud Functions; designed data marts in BigQuery with federated queries and materialized views; optimized Dataproc Spark jobs; implemented real-time streaming pipelines using Pub/Sub, Kafka, and Spark Streaming; established event-driven architectures and data governance with Unity Catalog and OpenLineage; built automated monitoring with Cloud Monitoring and Prometheus; integrated Airflow (Cloud Composer), dbt, and Matillion for automated transformations; delivered BI dashboards in Power BI,
Senior Data Engineer at Wells Fargo
December 1, 2022 - February 29, 2024
Delivered cloud-based data solutions in an Agile/Scrum environment; designed and implemented data migrations from on-premises to Azure using Azure Data Factory, Azure SQL DB, Synapse, and ADLS Gen2; built healthcare-focused Power BI reports for claims, provider, patient, and pharmacy data; leveraged PySpark in Azure Databricks with Delta Lake for batch and streaming data; integrated Databricks and ADF for orchestration/monitoring; exported Delta Lake data to Google Cloud Storage to enable BigQuery analysis; built pharmacy data marts for formulary management and drug cost optimization; implemented Unity Catalog-based governance across Synapse and Databricks with fine-grained access controls; created YAML-based dbt model documentation tooling and Looker validation scripts; enabled federated queries between BigQuery and Looker for interactive analytics; designed Tableau Prep flows for healthcare datasets; standardized medication/claims/provider reference data; linked Unity Catalog with Ala
AWS Data Engineer at eBay
June 1, 2021 - November 1, 2022
Developed AWS data pipelines using Lambda, DynamoDB, S3, and API Gateway; created ETL procedures with AWS Glue to load campaign data into Redshift from S3; implemented PySpark-based ETL in Azure Databricks for semi-structured data; used Delta Lake for durable storage; built forecasting models in Snowflake and Power BI; created interactive dashboards; integrated Python wrappers for REST API communication; automated ML model retraining with AWS Lambda and Step Functions; expanded ML workflows with SageMaker and integrated outputs into Looker dashboards; built ML pipelines with AWS CodePipeline and Terraform; performed data governance with Unity Catalog; integrated data across cross-cloud environments.
Data Analyst at Verizon
January 1, 2019 - May 1, 2021
Performed data analysis and profiling to identify data quality issues; collaborated with business teams to translate requirements into data solutions; developed SQL queries to validate ETL results; built tables, views, and indexes to support reporting; used AWS EC2 and S3 for data processing; designed data architectures integrating relational, NoSQL, and big data components; built dimensional models using Erwin; processed XML and flat-file data sources; implemented real-time XML messaging with Kafka and Spark Streaming; developed Tableau dashboards and Excel-based reports; performed performance tuning with T-SQL and indexing; documented data discrepancies and improved ETL coverage.
Hadoop Developer at Zylo IT Solutions
April 1, 2014 - April 1, 2017
Developed and optimized Spark-based data processing applications in PySpark/Scala; integrated Hadoop clusters with Kerberos, Ranger, and Knox for security; configured HDFS, Hive, and YARN; implemented data governance policies with Ranger and metadata lineage via TMM tools; collaborated with data quality teams to maintain enterprise data standards; built and validated ETL mappings using Talend and Informatica; deployed machine learning models using TensorFlow within big data pipelines; performed data profiling and cleansing; produced scorecards with Tableau/Excel and provided production support.

Industry Experience

Software & Internet