I am a Senior Data Engineer with 6+ years of experience designing cloud-native data pipelines, integrating complex systems, and enabling real-time analytics across healthcare, finance, and enterprise platforms. I have hands-on experience migrating on-prem data to cloud data warehouses and building scalable architectures that offload analytics from legacy systems. My background in machine learning research and applied data science fuels my work developing classification algorithms and ML-driven use cases. I enjoy collaborating with cross-functional teams to deliver secure, interoperable solutions, including FHIR-compliant APIs for healthcare data reconciliation.

Nguyen Lang

Available to hire

Experience Level

Expert (11 skills)
Intermediate (2 skills)

Language

English
Fluent

Work Experience

Data Engineering Specialist - Lead at Greenshield Canada
December 1, 2022 - Present
Led the design and implementation of end-to-end data pipelines using Apache Airflow and Google Cloud Composer to migrate and transform on-prem data into Google BigQuery, enabling near real-time analytics and driving a 300% increase in customer conversion rates. Built scalable infrastructure to integrate data from Oracle (on-prem) and PostgreSQL (AWS, Azure) into BigQuery, fully offloading analytics from legacy systems. Delivered automated access provisioning and accurate, reliable, near real-time data availability for cross-functional teams (Finance, Analytics), reducing operational friction. Developed secure, FHIR-compliant APIs to support healthcare data reconciliation, enabling interoperability between modern systems and legacy EMRs and reducing reliance on document-centric workflows. Reconciled data arrives as CDC events delivered through Google Cloud Pub/Sub and is persisted in a Cloud Spanner database.
Software Developer at IBM Canada
June 1, 2021 - December 1, 2022
Engineered a scalable ELT pipeline using Apache Airflow for a Canadian banking client to support fraud detection modeling. Wrote and integrated APIs in Python, automated software development tasks with Bash, and configured DEV/TEST environments. Designed secure infrastructure patterns aligned with client needs and evolving business requirements. Conducted code reviews across multiple languages and mentored new developers and QA engineers as they onboarded into a fast-paced Agile environment.
Software Engineer at Marimetrics Technologies Inc.
May 1, 2020 - December 1, 2020
Developed real-time monitoring applications for oceanographic sensors, using MATLAB and Python to process continuous sensor data. Built an automated MATLAB platform that runs 24/7 and analyzes periodically acquired sensor data. Used AWS S3 for data ingestion and retrieval; applied clustering and data mining to detect early signs of pathogen spread in aquaculture systems.
Data Science Intern at DeepSense
May 1, 2020 - September 1, 2020
Enhanced the scalability of data infrastructure for a startup accelerator by designing a flexible, modular database to support new data sources and analytical tools.
Graduate Research Assistant at Dalhousie University
September 1, 2018 - May 1, 2020
Developed algorithms to improve text classification performance through feature selection on social network datasets. The resulting model improved classification accuracy by up to 10% compared to some state-of-the-art models. Performed feature analysis on encrypted data, investigating which features can be analyzed in encrypted text without decryption.

Education

Master of Computer Science at Dalhousie University
September 1, 2018 - May 1, 2020
Bachelor of Engineering at Posts and Telecommunications Institute of Technology (PTIT)
January 1, 2012 - January 1, 2017

Qualifications

IBM Certified Cloud Advocate
January 1, 2022 - February 2, 2026
IBM Data Science Professional Certificate
January 1, 2020 - February 2, 2026
Second Place, Canada’s Cyber Security Challenge (Atlantic region)
Third Prize, 21st Vietnam Mathematics Olympiad

Industry Experience

Healthcare, Financial Services, Software & Internet, Professional Services