
Waseemakram Shaik



I am a Data Engineer with around 5 years of experience in designing, developing, and managing data-intensive applications using Big Data ecosystems, AWS, and Scala-based frameworks. I’m passionate about building scalable data pipelines and creating interactive data visualizations that empower data-driven decision-making. I enjoy solving complex data challenges and continuously learning new technologies to keep my skills sharp.

Throughout my career, I have worked on a variety of projects involving Spark, Hadoop, AWS, and Azure technologies. I have hands-on experience with real-time data streaming, cloud infrastructure automation, and data migration. I thrive in collaborative, Agile environments where I can contribute to building efficient and modular systems that meet business needs.



Work Experience

Big Data/AWS Engineer at PNC Bank
July 1, 2023 - Present
- Gathered business and technical requirements to design end-to-end data processing solutions in Hadoop and AWS.
- Developed and orchestrated data pipelines using PySpark, Scala, and Spark SQL to process large datasets stored in S3 and HDFS.
- Designed and maintained Glue-based ETL jobs to load data into Redshift.
- Automated EMR cluster provisioning with Boto3 and Terraform.
- Built data ingestion frameworks using Sqoop and Flume.
- Converted JSON data to Parquet and registered the outputs in the Hive metastore.
- Developed dashboards using Tableau and Python, and built data-quality scripts in SQL and HiveQL.
- Collaborated on migrating Hadoop workloads to AWS EMR.
- Containerized microservices and deployed them to Kubernetes on EKS.
- Developed CI/CD pipelines with Git and Jenkins.
- Set up HBase and implemented real-time processing with Spark Streaming and Kafka.
- Automated workflows with Oozie and executed a data warehouse migration from Oracle to Redshift.
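The data-quality scripting described above can be sketched as a small generator that emits a HiveQL null-count query per table. This is an illustrative sketch only; the table and column names are hypothetical examples, not an actual schema.

```python
# Illustrative data-quality check generator (hypothetical table/column names).
# Builds a HiveQL query that counts total rows and NULLs per column.

def null_check_query(table: str, columns: list) -> str:
    """Return a HiveQL query counting NULLs for each column in `table`."""
    selects = ",\n  ".join(
        f"SUM(CASE WHEN {c} IS NULL THEN 1 ELSE 0 END) AS {c}_nulls"
        for c in columns
    )
    return f"SELECT\n  COUNT(*) AS total_rows,\n  {selects}\nFROM {table}"

# Example usage with made-up names:
print(null_check_query("transactions", ["account_id", "amount"]))
```

Generating the query as a string keeps the check portable across Hive and Redshift, which both accept this SQL shape.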
Big Data Engineer at Cardinal Health
June 30, 2023 - August 22, 2025
- Developed Spark RDD and Spark Streaming jobs for batch and real-time processing using Scala and the Spark SQL APIs.
- Engineered data pipelines with Sqoop and Flume to ingest data into Hadoop HDFS.
- Created Hive tables and scheduled ETL pipelines via Oozie.
- Migrated data from Cloudera Hadoop clusters to AWS EMR, improving scalability and reducing costs.
- Built RESTful APIs with the Play framework.
- Implemented schema comparisons between HBase and Hive.
- Integrated Kafka for real-time message ingestion.
- Automated data extractions from SQL Server using Python and Spark.
- Developed multi-step Spark workflows for customer data enrichment and feature extraction.
- Upgraded the Hadoop cluster and worked on its high-availability setup.
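The HBase-vs-Hive schema comparison mentioned above can be illustrated with a minimal sketch that diffs two column-to-type mappings. In practice the metadata would come from the cluster APIs; the dicts and type names here are made-up examples.

```python
# Minimal schema-comparison sketch (hypothetical schemas).
# Each schema is a plain dict of column name -> type string.

def compare_schemas(left: dict, right: dict) -> dict:
    """Report columns unique to each side and columns whose types differ."""
    return {
        "only_left": sorted(set(left) - set(right)),
        "only_right": sorted(set(right) - set(left)),
        "type_mismatch": sorted(
            c for c in set(left) & set(right) if left[c] != right[c]
        ),
    }

# Example: an HBase-derived column map vs. a Hive table description.
hbase_cols = {"id": "string", "qty": "int", "ts": "bigint"}
hive_cols = {"id": "string", "qty": "bigint"}
print(compare_schemas(hbase_cols, hive_cols))
```

A diff like this is a cheap pre-flight check before reconciling or reloading tables across the two stores.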
Hadoop Developer at Amigos Software Solutions
November 30, 2021 - August 22, 2025
- Participated in all SDLC phases, designing and developing web-tier components using Spring MVC.
- Implemented RESTful services and integrated them with front-end Struts components.
- Used Hive, Pig, and MapReduce to analyze data in HDFS.
- Built message-based services using JMS and connected legacy systems through Java middleware.
- Used Hibernate ORM and JDBC for Oracle database connectivity.
- Monitored job performance using Cloudera Manager.
- Developed web services using SOAP and REST.
- Wrote stored procedures, views, UDFs, and triggers for SQL Server.
- Developed test cases with JUnit and deployed applications on WebLogic Server.
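The MapReduce analysis pattern used for the HDFS work above can be shown as a toy word-count expressed in plain Python functions; a real job would run on the cluster via Hadoop Streaming or a MapReduce API, so this is only a sketch of the map/reduce shape.

```python
# Toy MapReduce word-count sketch (the classic illustration of the pattern).
from collections import defaultdict

def mapper(line: str):
    """Map phase: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1

def reducer(pairs):
    """Reduce phase: sum the counts emitted for each key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# Example over two in-memory "lines" standing in for HDFS records:
lines = ["big data big pipelines", "data quality"]
pairs = [kv for line in lines for kv in mapper(line)]
print(reducer(pairs))
```

The same split between a stateless mapper and a key-grouped reducer is what Hive and Pig compile their queries down to.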


Industry Experience

Financial Services, Healthcare, Software & Internet, Professional Services