I help companies turn messy data and complex ideas into reliable, scalable AI solutions. With experience building large distributed pipelines, training LLMs, and developing retrieval and knowledge-graph systems, I deliver practical ML tools that solve real business problems. Whether you need model development, data engineering, or intelligent automation, I can take your project from concept to production.

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Intermediate

Language

English
Advanced

Work Experience

Machine Learning Engineer at Metabob
August 1, 2022 - August 1, 2022
Built a three-stage hierarchical topic modeling workflow using BERTopic, increasing the number of meaningful topics from 2 to 10. Resolved critical data-quality issues with a binary classifier, improving accuracy from 65% to 91%. Developed a parallelized ML experimentation pipeline on Docker and Kubernetes, reducing experimentation time by ~5×. Designed a Postgres schema for scalable experiment logging and analysis.
Machine Learning Engineer at DC Frontiers
June 1, 2023 - June 1, 2023
Trained a semi-supervised multilabel classification model achieving ~0.9 precision / ~0.8 recall across 100+ labels. Delivered results in half the scheduled timeline, accelerating downstream integration. Presented model results through interactive demos using Streamlit to drive stakeholder alignment.
Machine Learning Engineer at Ahrefs Pte Ltd
June 1, 2025 - June 1, 2025
Engineered and operated large-scale distributed data pipelines processing 1T+ tokens across 200 compute nodes for LLM training. Trained and fine‑tuned models using multi-GPU/multi-node FSDP and DeepSpeed, optimizing throughput and memory footprint. Built, deployed, and maintained an internal GitHub Copilot–style Code LLM assistant, achieving >60% developer adoption company-wide. Developed a high-throughput HTML boilerplate removal system in OCaml, leveraging AI-assisted tooling. Designed and maintained real-time production pipelines to support continuous ingestion and incremental model updates.
Machine Learning Engineer at Clarionn
August 1, 2025 - November 26, 2025
Architected an end-to-end GraphRAG platform including ingestion, KG extraction, ontology synthesis, and multi-hop reasoning pipelines. Designed an automatic ontology synthesis framework enabling dataset-specific schema generation with zero manual knowledge engineering. Achieved ~53% improvement over SOTA embedding retrieval on internal retrieval + multi-hop QA benchmarks. Built a multi-agent LLM workflow (Planner, Disambiguator, Navigator) enabling autonomous entity extraction, retrieval, and reasoning over large knowledge graphs.

Education

at University of California, Berkeley Extension
January 1, 2022 - January 1, 2022
Bachelor's Degree in Computer Engineering at Singapore University of Technology and Design
January 1, 2018 - January 1, 2021
Master of Science in Technology Entrepreneurship at Singapore University of Technology and Design
January 1, 2021 - January 1, 2022

Industry Experience

Software & Internet, Professional Services