I am a senior data engineer who designs and scales enterprise-grade data platforms across healthcare and financial services. I specialize in building cloud-native Lakehouse and data warehouse solutions on AWS, Azure, and GCP, handling multi-terabyte datasets and real-time streaming pipelines to support data-driven decision making.

I have a strong background in distributed data processing (Spark, Kafka, Hadoop), ELT/ETL orchestration (Airflow, dbt, Informatica), governance and HIPAA-compliant security, and the deployment of ML/GenAI-enabled data tooling. I enjoy mentoring teams, automating infrastructure with Terraform and CI/CD, and delivering secure, auditable analytics that enable business self-service while protecting PHI.

Rajin Shrestha

Available to hire

Language

English (Fluent)

Work Experience

Data Engineer / Data Analyst at Abbott
August 1, 2023 - Present
Spearheaded the design and modernization of a global healthcare data platform by implementing a scalable AWS-based Lakehouse architecture (S3, EMR, Glue, Redshift, Lambda) integrated with Databricks and Snowflake, processing over 5TB/day of regulated clinical, manufacturing, supply-chain, and connected medical device data across multiple geographies.
Architected a domain-driven Data Mesh with federated governance, enabling cross-functional teams to publish governed data products aligned with HIPAA, FDA 21 CFR Part 11, and GDPR standards.
Engineered distributed batch and streaming pipelines using PySpark, Spark SQL, Apache Hudi, Kafka, and Spark Streaming, supporting ingestion of HL7, FHIR, EDI, and IoT telemetry data while maintaining ACID compliance and schema evolution control.
Designed enterprise-grade ingestion frameworks leveraging CDC strategies, a schema registry, and event-driven microservices, ensuring reliable ingestion of PHI-sensitive datasets with a full audit trail.
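The audit-traceable CDC ingestion described above can be sketched in miniature. This is an illustrative assumption, not the actual Abbott schema: the envelope fields (`op`, `audit`, `payload_sha256`) and the function name are hypothetical, and a production pipeline would do this inside the streaming framework rather than in plain Python.

```python
import hashlib
import json
from datetime import datetime, timezone

def stamp_cdc_event(event, source_system):
    """Attach an audit envelope to a CDC change event (illustrative sketch).

    The payload hash lets downstream consumers verify the record was not
    altered after ingestion, supporting an end-to-end audit trail for
    PHI-sensitive data.
    """
    # Canonical serialization so the hash is deterministic for equal payloads.
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return {
        "op": event.get("op", "upsert"),  # insert / update / delete
        "payload": event,
        "audit": {
            "source_system": source_system,
            "ingested_at": datetime.now(timezone.utc).isoformat(),
            "payload_sha256": hashlib.sha256(payload).hexdigest(),
        },
    }
```

In a real deployment the envelope would typically be written alongside the record into the Hudi table, so every row carries its own provenance.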
Data Engineer at UHG
February 1, 2022 - July 1, 2023
Engineered large-scale data pipelines supporting enterprise healthcare payer systems, processing multi-terabyte datasets across claims, eligibility, provider, and pharmacy benefit management domains within a hybrid Hadoop and AWS cloud ecosystem.
Designed distributed batch processing workflows using PySpark, Spark SQL, Hive, and Hadoop (HDFS, Sqoop, Oozie) to transform EDI 837/835 claims transactions, HL7 feeds, and CMS encounter data into curated analytical datasets for actuarial and risk adjustment teams.
Built and maintained scalable ingestion frameworks using Kafka and AWS MSK, enabling near real-time processing of claims adjudication events and member eligibility updates.
Implemented dimensional and star-schema models in Snowflake and Teradata, supporting regulatory reporting, HEDIS quality metrics, STAR ratings analysis, and CMS-compliant financial reconciliation.
Led cloud migration initiatives transitioning on-prem Hadoop workloads to AWS (S3, EMR, Glue, Redshift, Lambda).
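The star-schema modeling mentioned above can be illustrated with a minimal, framework-free sketch of the conforming step: assigning surrogate keys to a dimension and swapping them onto fact rows. The names (`npi`, `provider_sk`, the -1 unknown-member bucket) are hypothetical examples following common dimensional-modeling convention, not the actual payer data model.

```python
def build_dimension(rows, natural_key):
    """Assign surrogate keys to dimension rows, indexed by natural key."""
    dim = {}
    for i, row in enumerate(rows, start=1):
        dim[row[natural_key]] = {"surrogate_key": i, **row}
    return dim

def conform_facts(facts, dim, natural_key, fk_name):
    """Replace natural keys on fact rows with dimension surrogate keys.

    Unknown keys map to -1, a conventional bucket for late-arriving or
    unmatched dimension members.
    """
    out = []
    for fact in facts:
        member = dim.get(fact[natural_key])
        conformed = {k: v for k, v in fact.items() if k != natural_key}
        conformed[fk_name] = member["surrogate_key"] if member else -1
        out.append(conformed)
    return out
```

In practice this is a join in Spark SQL or Snowflake rather than a Python loop, but the key idea is the same: facts never carry natural keys into the warehouse, only surrogate foreign keys.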
Data Engineer at Lowe's
October 1, 2020 - January 31, 2022
Designed and developed enterprise-scale data pipelines processing 2TB+ of daily POS, eCommerce, inventory, and supply chain datasets across 2,000+ retail stores and multiple regional distribution centers, supporting merchandising and logistics analytics.
Engineered distributed batch processing workflows using Hadoop (HDFS, Hive, Sqoop, Oozie, MapReduce) and transitioned legacy SQL transformations into optimized PySpark and Spark SQL pipelines, reducing peak-season batch processing windows by several hours.
Built ingestion frameworks integrating structured and semi-structured data formats including JSON, Parquet, Avro, XML, and CSV, enabling scalable data lake ingestion and downstream reporting across pricing, promotions, and category management domains.
Developed complex ETL workflows using Informatica PowerCenter, Talend, and DataStage, implementing SCD Type 1/2 logic, reconciliation controls, and audit-ready data transformations for enterprise warehouse systems.
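The SCD Type 2 logic mentioned above can be sketched with a minimal, framework-free example: changed dimension rows are expired and re-inserted as new current versions rather than overwritten. The column names (`natural_key`, `valid_from`, `is_current`) and the 9999-12-31 high date are conventional assumptions, not the actual Lowe's warehouse schema.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # conventional "open-ended" validity date

def apply_scd2(dimension, updates, effective_date):
    """Apply SCD Type 2: expire changed current rows, append new versions.

    dimension: list of dicts with natural_key, attrs, valid_from,
               valid_to, is_current
    updates:   dict mapping natural_key -> latest attribute dict
    """
    result, seen = [], set()
    for row in dimension:
        key = row["natural_key"]
        if row["is_current"] and key in updates and updates[key] != row["attrs"]:
            # Type 2 change: close out the current row...
            result.append(dict(row, valid_to=effective_date, is_current=False))
            # ...and insert a fresh current version of the row.
            result.append({
                "natural_key": key, "attrs": updates[key],
                "valid_from": effective_date, "valid_to": HIGH_DATE,
                "is_current": True,
            })
        else:
            result.append(row)
        seen.add(key)
    # Brand-new keys simply become new current rows.
    for key, attrs in updates.items():
        if key not in seen:
            result.append({
                "natural_key": key, "attrs": attrs,
                "valid_from": effective_date, "valid_to": HIGH_DATE,
                "is_current": True,
            })
    return result
```

Tools like Informatica implement the same pattern declaratively; the sketch just makes the expire-and-insert mechanics explicit.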

Education

Bachelor's in Information Systems, East Central University

Qualifications

AWS Certified Data Engineer - Associate

Industry Experience

Healthcare, Life Sciences, Software & Internet, Financial Services, Professional Services, Education, Retail