Hi, I’m Jeffrey Li. I’m a Senior AI Product Engineer focused on Generative AI, LLMs, and end-to-end AI platforms that scale for enterprises. I design and ship production-grade systems—from semantic search and RAG to agentic workflows and real-time inference—delivering measurable improvements in accuracy, automation, and safety for millions of users and multi-million-dollar pilots. I excel at turning ambiguous business problems into scalable ML and software architectures, collaborating across product, research, and customer teams to drive impactful AI features in healthcare, finance, and conversational AI domains. I enjoy bridging technical depth with strategic product thinking to deliver reliable, measurable outcomes with a strong emphasis on governance, safety, and enterprise readiness.

Jeffrey Li

Available to hire

Experience Level

Expert: 18 entries
Intermediate: 2 entries

Language

English
Fluent

Work Experience

Generative AI Engineer at Cohere
January 1, 2023 - January 1, 2025
Delivered production LLM systems, AI agents, fine-tuning pipelines, and enterprise AI integrations that elevated model accuracy, safety, and automation. Designed and deployed enterprise-grade RAG and semantic search solutions with dense + sparse retrieval, improving retrieval precision by 28–35% across Fortune 100 datasets. Built LLM-powered agentic workflows (tool-calling, multi-step orchestration, domain memory), reducing manual operational effort by 40–60% and increasing automation reliability. Fine-tuned domain-specific LLMs on proprietary enterprise data (healthcare, finance) to boost task accuracy by 20–30% and reduce hallucinations in high-stakes use cases. Engineered scalable inference pipelines (Triton, Kubernetes, vector DBs, async batching) lowering latency ~35% and reducing inference costs ~25% under peak load. Implemented safety tooling (toxicity filters, model gating, factuality scoring) improving safety metrics by 15% and reducing red-flag responses by ~40%. Implem
Senior Machine Learning Engineer at LivePerson
January 1, 2020 - January 1, 2023
Developed NLP and GenAI models, real-time ML systems, and customer-facing AI features powering hundreds of millions of interactions monthly. Led development of next-gen conversational models (BERT/Transformer-based) for intent detection and dialog routing, increasing automated resolution rates by ~27% across enterprise clients. Built LLM-powered response-generation and summarization pipelines, reducing agent handle time by ~22% and improving CSAT by 10–12%. Delivered real-time inference services (Python, FastAPI, Redis, Kubernetes) supporting 100M+ monthly interactions with sub-120ms latency and seamless AI-human handoff. Architected ranking and relevance models for message classification and proactive engagement, boosting user-initiated conversions by ~18%. Implemented labeling and active-learning loops, increasing annotation efficiency by 3× and improving long-tail intent F1 by ~15%. Built offline/online training pipelines (Airflow, Spark, MLflow) that cut experiment cycle times a
Data Engineer, AI/ML Platform at Capital One
January 1, 2018 - January 1, 2020
Built end-to-end ML data pipelines (Spark, Airflow, Scala/Python) processing 5–10B credit card transactions/day, reducing feature latency from hours to near real-time for fraud and risk models. Partnered with data scientists to productionize ML models for credit risk scoring and customer segmentation, improving model lift by 8–12% through feature engineering and hyperparameter optimization. Led development of a centralized feature store for 50+ production ML models, reducing redundant compute by ~40% and standardizing feature governance. Built streaming ML features with Kafka + Flink enabling real-time fraud detection with sub-80ms latency and ~15% reduction in false positives. Implemented automated model monitoring for data/drift, decreasing degradation incidents by ~30% YoY. Developed secure ML data pipelines aligned with compliance (PII obfuscation, tokenization, lineage tracking). Implemented distributed training on AWS SageMaker + EMR, cutting training times for large-scale mo
Software Engineer at Flatiron Health
January 1, 2015 - January 1, 2018
Engineered backend microservices for oncology EHR data ingestion and normalization, improving structured data coverage by ~35%. Built high-throughput ETL pipelines (Python, Spark, Airflow) processing 10M+ patient-document records/month, reducing data latency for RWE datasets from 24 hours to under 6 hours. Developed a clinical feature extraction service leveraging classical ML (logistic regression + gradient boosting) to classify tumor attributes, increasing extraction accuracy by ~18%. Built RESTful APIs powering real-time clinical workflows for 2,000+ providers with sub-150ms response times. Partnered with oncology data scientists to productionize prognostic/risk models, enabling higher-precision cohort identification. Implemented data quality detection algorithms reducing manual abstraction errors by ~22%, and integrated Spark NLP pipelines to accelerate unstructured-note processing, lowering costs. Collaborated cross-functionally to launch EHR workflow features increasing OncoEMR a

Education

Master of Science in Computer Science at Massachusetts Institute of Technology
January 1, 2013 - January 1, 2015
Bachelor of Science in Electrical Engineering and Computer Science (EECS) at Massachusetts Institute of Technology
January 1, 2009 - January 1, 2013

Industry Experience

Healthcare, Financial Services, Software & Internet, Professional Services, Media & Entertainment