I am Yuxiang Huang, a researcher and machine learning engineer specializing in computer vision, motion segmentation, and AI-powered media solutions. I enjoy turning complex research into production-ready pipelines and bridging academia with real-world applications. Currently I work at RAD AI, developing production-grade image synthesis pipelines and moderation frameworks, and collaborating with cross-functional teams to optimize performance and cost.

Yuxiang Huang

Available to hire

Language

English
Fluent

Work Experience

Machine Learning Engineer at RAD AI
December 1, 2023 - Present
Led the development and deployment of a production-grade image generation pipeline using Diffusion Transformers (DiT) and HiDream; implemented a hybrid LoRA fine-tuning strategy via PEFT to mitigate facial identity drift and ensure accurate text rendering in advertising environments; built a persona-based multimedia analysis system using retrieval-augmented multimodal LLMs (RAG with Qwen3-Omni) and cloud storage; deployed on AWS EC2 and Modal cloud; sped up inference roughly 10x; developed a content moderation framework with YOLO and LLMs; created an automated influencer image enhancement pipeline.
Graduate Research Assistant at University of Waterloo
September 1, 2020 - December 1, 2023
Researched monocular motion segmentation in dynamic scenes; proposed a zero-shot method combining object recognition and tracking with geometry-based cues (optical flow, depth, epipolar geometry); contributed to weakly supervised image segmentation and improved a relaxed grid CRF loss, achieving gains on standard datasets.
Graduate Teaching Assistant at University of Waterloo
September 1, 2020 - December 1, 2023
TA for machine learning and vision courses (4 semesters); helped 100+ students debug Python assignments; prepared solutions and graded assignments; led tutorials.
Research Intern at Nippon Telegraph and Telephone (NTT) Communications Science Laboratory
May 1, 2019 - December 1, 2019
Explored cross-modal deep learning for audio-based 2D/3D semantic scene understanding; designed a CNN-based encoder-decoder to generate semantic segmentation from audio, using MFCC and angular spectrum features extracted from microphone arrays; trained conditional adversarial networks with synthetic data to boost audio recognition (~8% mIoU); reconstructed 3D models from 2D segmentation masks via visual hull techniques.
Research Assistant at SPIN Lab, University of British Columbia
February 1, 2018 - May 1, 2019
Researched a free-roam force-feedback pen for designers and artists; built a Python/PyQt UI; implemented a multi-threaded client-server system for two-way Wi-Fi communication between sensors and a PC; co-authored a paper published at CHI 2020.
Research Intern at Shared Reality Lab, McGill University
May 1, 2018 - August 1, 2018
Developed an Android app for non-intrusive experience sampling; implemented fingerprint gestures and notifications; used a NoSQL database to log activities; performed statistical analysis of experiment data; co-authored a paper published at MobileHCI 2019.
Software Developer (Volunteer) at UBC Rover
February 1, 2017 - December 1, 2017
Built a real-time multi-camera video stitching program for robot vision; implemented in C++/ROS; tested in Gazebo and real environments.
Java Web Developer Intern at Sendinfo Technology
July 1, 2017 - August 1, 2017
Developed an integrated marketing management web app using Java, SQL, and the SSM framework stack; implemented data access objects and SQL queries; managed the project with Maven and SVN.

Education

MASc in Systems Design Engineering (Thesis) at University of Waterloo
January 1, 2020 - January 1, 2023
BASc in Computer Engineering with Minor in Statistics, Degree with Distinction at University of British Columbia
January 1, 2015 - January 1, 2020

Qualifications

Graduate Research Studentship
January 1, 2020 - December 31, 2023
International Master's Award of Excellence
January 1, 2020 - December 31, 2022
University of Waterloo Entrance Scholarship
January 1, 2020 - December 31, 2020
MDDC Award of Distinction
January 1, 2019 - December 31, 2019
APSC Outstanding Project Award
January 1, 2019 - December 31, 2019

Industry Experience

Software & Internet, Media & Entertainment, Education, Professional Services, Healthcare