Hi, I’m Hoang Ngo, a Senior Software Engineer specializing in Python, Java, and AI systems. Over the past decade I’ve built and scaled high-throughput distributed backends at Google, including the Gemini LLM serving stack and risk/fraud engines that process thousands of requests per second. I bridge research and production, focusing on high-performance Python, JVM tuning, multi-region cloud architectures, and resilient middleware. I enjoy turning complex problems into robust, measurable systems and mentoring teams to ship reliable software.

Hoang Ngo

Available to hire

Experience Level

Expert (12 skills)
Intermediate (6 skills)

Language

English
Fluent

Work Experience

Software Engineer - Google Gemini Platform at Google
April 1, 2024 - Present
Led development of inference APIs for Google Gemini; improved time to first token by 30% through streaming and response pipelining; enhanced output consistency across 4 product surfaces; created experiment systems for prompt and model evaluation, doubling experiment velocity; and optimized request batching and caching, reducing compute cost by 15%.
Software Engineer - Google Pay Risk and Fraud Platform at Google
November 1, 2022 - March 1, 2024
Engineered backend services for transaction processing and risk evaluation; constructed fraud detection pipelines processing 3k+ QPS, reducing fraud losses by 8%; designed rule and signal systems for real-time risk scoring; improved system latency from 220ms to 160ms; implemented rollout controls and safeguards, lowering production issues by 30%.
Software Engineer - Google Health ML Platform at Google
August 1, 2020 - November 1, 2022
Built React and TypeScript tools for healthcare ML systems aligned with Google Health; integrated NLP models for medical text analysis, cutting manual review effort by 25%; developed dashboards for model evaluation and data validation, reducing analysis time by 40%; enhanced frontend performance for large datasets, lowering render latency by 35%; collaborated with backend teams to productionize ML features.
Software Architect at BrightInsight, a Flex Company
January 1, 2019 - February 1, 2020
Directed architecture and development of backend services for HIPAA-compliant healthcare platforms, delivering core systems from 0 to production; built integration layers for third-party systems, shortening onboarding time by 25%; established cloud infrastructure and CI/CD pipelines to support scalable deployments; defined coding standards and design guidelines for a small engineering team; coordinated with cross-functional teams to resolve production issues.
Software Engineer at FPT Software
June 1, 2014 - September 1, 2018
Delivered full-stack applications for enterprise clients across finance and retail using TypeScript, Java, Python, and React; developed backend services and APIs for transaction processing and data workflows, improving request latency by 20%; designed and optimized database schemas in MySQL and PostgreSQL, improving query performance by up to 30%; implemented responsive frontend features for client-facing applications, reducing page load time by 25%; engineered authentication, data pipelines, and third-party integrations; contributed to CI/CD pipelines using Docker and automation tools, cutting deployment time by 40%; collaborated with distributed teams to deliver features on schedule, improving sprint predictability and reducing rework.
Software Engineer at Google
April 1, 2023 - Present
Architected the Python-based serving infrastructure and orchestration layers for Gemini LLMs, implementing asynchronous streaming with asyncio to reduce TTFT by 30%. Developed high-reliability Java (gRPC) sidecars to integrate Python inference services with authentication, quota management, and audit logging. Optimized TPU v5e throughput with dynamic request batching and multi-tier caching for embeddings and prompts. Built an automated LLM evaluation framework with model-based grading and A/B testing to accelerate ML research. Engineered a fault-tolerant API layer with admission control and structured logging, reducing customer-facing errors during 10x traffic spikes.
Software Engineer at Google Pay Risk and Fraud Platform (Internal Analytics)
November 1, 2020 - March 31, 2023
Refactored core payment backend logic using Java and Guice to reduce P99 transaction latency from 220ms to 160ms. Constructed Python-based fraud-detection pipelines handling 3k+ QPS with P99 latency <160ms, contributing to a measurable fraud-loss reduction. Designed a versioned, zero-downtime rule engine enabling atomic module swaps. Built immutable event-sourcing infrastructure using memory-efficient dataclasses and Protobufs, with Bigtable for state and BigQuery for tamper-evident logs. Automated incident response with canary analysis using Mann-Whitney U tests, reducing MTTR by 40%.
Software Architect at BrightInsight
January 1, 2019 - February 29, 2020
Architected a multi-tenant, HIPAA-compliant backend using Python (FastAPI) and SQLAlchemy with row-level PHI segmentation to ensure data isolation. Implemented Infrastructure-as-Code with Terraform for AWS ECS (Fargate) and RDS Multi-AZ, automating partner environment provisioning and reducing onboarding time by 25%. Designed secure IoT data ingestion pipelines using MQTT over TLS and Python asyncio consumers, enabling high-frequency telemetry with zero data loss. Established engineering standards for a Python-centric team with static analysis and >90% test coverage using pytest.

Education

Bachelor of Science at University of Greenwich
September 1, 2011 - June 1, 2014
Bachelor's degree in Computer Science at University of Greenwich
January 1, 2011 - January 1, 2014

Industry Experience

Software & Internet, Healthcare, Financial Services, Professional Services, Education