Available to hire
I am a data engineering professional with 8+ years of experience designing scalable data pipelines and ETL workflows using Azure, AWS, and Databricks. I enjoy collaborating across teams to translate complex requirements into reliable data solutions that enable dashboards and analytics.
I am proficient in Python and SQL, with strong experience in big data technologies and cloud platforms. I am committed to delivering secure, high-performance data solutions backed by strong data governance.
Work Experience
Data Engineer at Proinfy Solutions
December 1, 2024 - Present
Analyzed, designed, and built modern data solutions using Microsoft Azure PaaS services to support data visualization and enterprise reporting. Engineered Azure Data Factory pipelines with linked services and datasets to extract, transform, and load data from diverse sources, including Azure SQL and Blob Storage. Developed and maintained scalable data pipelines using Databricks notebooks and workflows. Orchestrated data ingestion into multiple Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed data in Azure Databricks, ensuring efficient ETL processes. Implemented parameterized, dynamic ADF pipelines to automate multi-source data ingestion, improving operational efficiency and data reliability. Constructed reusable ADF templates and parameterized components to streamline future migration projects. Configured integrated solutions with ADF pipelines, Azure Key Vault, Databricks, SQL DB, and integration runtimes to ensure robust security and data compliance.
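The parameterized multi-source ingestion pattern described above can be sketched in plain Python (a minimal illustration only — source names, connections, and paths here are hypothetical, not the actual ADF pipeline definitions):

```python
# Sketch of config-driven, parameterized ingestion: one generic routine
# fans out over a list of source configurations, analogous to a single
# dynamic ADF pipeline replacing many hard-coded ones.
# All names below are illustrative assumptions.

def build_ingestion_jobs(sources):
    """Expand a source-configuration list into one job spec per source."""
    jobs = []
    for src in sources:
        jobs.append({
            "name": f"ingest_{src['name']}",
            "connection": src["connection"],
            "sink": f"adls://raw/{src['name']}/",   # per-source landing zone
            "incremental": src.get("incremental", False),
        })
    return jobs

sources = [
    {"name": "sales_db", "connection": "azure_sql", "incremental": True},
    {"name": "clickstream", "connection": "blob_storage"},
]

for job in build_ingestion_jobs(sources):
    print(job["name"], "->", job["sink"])
```

The design point is that adding a new source becomes a configuration change rather than a new pipeline.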
Big Data Developer at Proinfy Solutions
March 1, 2020 - April 1, 2024
Designed and developed data pipelines and ETL workflows using Hadoop, Spark, and cloud platforms (AWS, Azure). Migrated legacy MapReduce jobs into Spark/Scala transformations to support large-scale batch data processing. Implemented data ingestion workflows using Azure Data Factory (ADF) and managed datasets in Azure Data Lake Storage (ADLS). Built and optimized data warehouses by integrating structured and unstructured data from relational databases (Oracle, MySQL) into Hadoop ecosystems. Performed Hive query optimization with partitioning, bucketing, and dynamic partitioning for efficient data retrieval. Automated data validation and transformation processes to ensure accuracy and availability for downstream analytics and BI platforms. Worked with multiple file formats (JSON, XML, Avro, Parquet) and applied compression techniques (Snappy) for storage optimization. Collaborated with architects and business stakeholders to translate requirements into scalable, production-ready data solutions.
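The automated data-validation step mentioned above can be sketched as a simple rule-driven check (field names and rules here are hypothetical examples; the real workflows ran inside the Hadoop/Spark stack):

```python
# Toy rule-driven validator illustrating automated checks applied before
# records reach downstream analytics. Field names are assumptions made
# up for this sketch.

REQUIRED_FIELDS = ("customer_id", "event_date", "amount")

def validate_record(record):
    """Return a list of rule violations for one record (empty = valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    amount = record.get("amount")
    if isinstance(amount, (int, float)) and amount < 0:
        errors.append("negative amount")
    return errors

good = {"customer_id": "C1", "event_date": "2024-01-05", "amount": 12.5}
bad = {"customer_id": "", "event_date": "2024-01-05", "amount": -3}

print(validate_record(good))  # → []
print(validate_record(bad))   # → ['missing customer_id', 'negative amount']
```

In practice the same per-record rules would be applied in bulk as Spark transformations, with failing records routed to a quarantine table.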
Big Data Consultant at Charter Communications
April 1, 2019 - January 1, 2020
Collaborated in developing technical requirements, design specifications, and software solutions using Scrum Agile methodology to meet client data engineering needs. Coordinated with project managers, business owners, analysts, and clients to build database prototypes, document code, and provide regular progress reports. Operated Hadoop clusters and used Hive for efficient storage and retrieval of enterprise data. Migrated large datasets from Netezza to Teradata using Sqoop, ensuring data integrity and efficient processing. Developed data pipelines for ingesting, aggregating, and loading consumer response data into Hive external tables in HDFS, supporting Tableau dashboard reporting. Performed query analysis and contributed to coding initiatives using Hadoop, Spark, Scala, Hive, and HBase. Supported the project team in delivering critical client business requirements during implementation phases.
Data Consultant – Big Data at Change Healthcare
November 1, 2018 - March 1, 2019
Collected member, provider, and claims data, identified PHI from various SQL Servers, and ingested it into the Hadoop Distributed File System. Worked on Hadoop clusters and Hive for data storage and retrieval. Developed Spark SQL queries for analysts via Zeppelin and implemented Spark using Scala and Spark SQL for faster testing and processing. Set up environments on AWS EC2 for computing, using S3 for storage, and configured EMR clusters. Imported data from AWS S3 into Spark RDDs, performed transformations and actions on RDDs, and administered the cluster with memory tuning. Implemented real-time data ingestion from S3 via Spark Streaming. Employed Sqoop for large data transfers from RDBMS to S3. Worked with the Spark ecosystem using Spark SQL and Scala across formats including Text, CSV, and Parquet. Used Zeppelin for analytics and visualization.
Data Consultant - Hadoop at Global Atlantic Financial Group
June 1, 2017 - October 1, 2018
Coordinated with business customers to gather requirements and delivered BRD/TDD. Analyzed Hadoop clusters with Hive, HBase, and Sqoop. Implemented Spark using Scala and Spark SQL. Installed and configured Hadoop clusters and related tools. Developed Sqoop scripts to migrate data from source systems to the big data environment. Worked on a 30-node cluster and expanded its capacity. Imported data from MySQL and Oracle to HDFS using Sqoop; loaded Linux file system data to HDFS and processed it with Hive. Exported data from MySQL, IBM DB2, and Oracle to HDFS. Transformed files with Hive into Parquet format. Authored Avro schemas and loaded data into Hive tables. Optimized performance and monitored systems and logs. Used Oozie, Control-M, and Autosys for workflow management, and Tableau for daily reporting.
Software Trainee at Achieva IT Inc
January 1, 2017 - May 1, 2017
Created users in MySQL, Oracle, and SQL Server; installed databases; monitored logs; loaded data to and from HDFS; developed Hive queries for data analysis and SAP BO reporting.
Education
Master of Science in Computer Science at University of Illinois Springfield
January 1, 2015 - January 1, 2016
Bachelor of Technology in Electrical and Electronic Engineering at Jawaharlal Nehru Technological University, Hyderabad, India
January 1, 2010 - January 1, 2014
Qualifications
Industry Experience
Software & Internet, Professional Services