An AI/ML researcher with a proven track record of original research and invention across many frontiers of applying ML/AI in the technology industry. Below is a list of past projects (on which I was the sole, main, or lead IC and the author of the breakthrough idea):
Discerning ensemble technique for dynamically matching base models from different sources within an ensemble. This invention addressed several critical issues in applying ML to unseen data where re-collecting data and retraining are infeasible: how to find the root cause of poor model performance, how to improve model matching dynamically on different types of test data, etc.
Dynamic real-time root cause analysis based on probability propagation on a graph and Bayes’ rule, for the Synopsys Verdi debugger. Introduced the probability propagation idea (similar to PageRank) to fault probability calculations on a circuit graph. Contributed this patent to Synopsys, taking it from initial idea to coding and a demo on the Verdi product (working with the Verdi developer).
Graph and hypergraph partitioning using the PyTorch optimization engine. I reformulated the traditional partitioning problem from a combinatorial problem into a continuous optimization problem. My results on public benchmarks showed that hMETIS (the state-of-the-art industry standard) can be improved upon along multiple Pareto fronts, including topology cost, balance cost, cut cost, etc.
Verilog code snippet embedding from a very small data set [2021, before LLMs were mainstream]. Verilog code is extremely hard to find, so establishing the most efficient model was critical. I researched and evaluated many techniques in the research literature and settled on embedding syntax trees produced by a parser. I worked out the complete flow within a few weeks: Verilog parser, syntax tree pruning, tree-node embedding model, model training and tuning, data visualization, code matching tests, etc.
Voice recognition in which high-resolution waveforms are encoded as low-bit multi-channel “images” for input to an ASIC AI chip. Achieved model compression from 32-bit floating point to 5-bit with high accuracy. I contributed several patents on encoding voice waveform data as low-bit integer inputs to CNN models for the startup I worked for.
Implemented Delaunay triangulation in C++, built a graph rendering library based on OpenGL, etc.
Wrote a C++ template library for basic arrays, managed pointers, etc.
Worked on the pseudospectral time-domain (PSTD) method for partial differential equations. Invented soft source generation, numerical dispersion compensation, methods to tackle the Gibbs phenomenon, etc.
Delivered 20+ successful ML projects for the technology industry over the past 12 years.