Hi, I'm Lennox Kuria. I design adversarial test cases, construct deterministic evaluation rubrics, and perform surgical code audits for frontier LLM training pipelines. I thrive at the intersection of advanced data structures, high-concurrency Python architectures, and formal verification methods (Lean, TLA+), with a focus on breaking AI reasoning pathways through systematic edge-case construction and comprehensive repo-wide evaluations.

I've evaluated 1,000+ AI outputs across software engineering, finance, mathematics, and Web3 domains, maintaining a 94% peer-review agreement rate on complex RLHF tasks. I enjoy building robust testing infrastructure, ensuring reproducibility and safety in production pipelines, and contributing to open-source tooling for AI data generation and verification.

Lennox Kuria

Available to hire

Experience Level

Expert
Intermediate

Work Experience

Senior AI Training & Evaluation Specialist at Datacurve AI
December 1, 2025 - Present
Architect deterministic evaluation environments and generate high-fidelity RLHF datasets for frontier LLMs; evaluate 200+ AI outputs monthly across software engineering, finance, mathematics, and Web3; construct adversarial quests to expose weaknesses in repo-wide code generation and agentic workflows; conduct rigorous code QA and PR reviews for large-scale distributed systems; design evaluation rubrics for code correctness, security vulnerabilities, and maintainability across multi-file refactors and architectural migrations.
AI Systems Evaluator / RLHF Engineer at Stealth Labs
February 1, 2025 - August 1, 2025
Built deterministic Docker environments and Python-based AST verifiers to evaluate frontier AI models for production reliability; generated specialized training datasets focused on edge computing scenarios, adversarial prompt defense, and multi-agent system coordination; validated mathematical proofs using the Lean framework, assessing 200+ AI-generated proofs across number theory, graph theory, and combinatorics; identified gaps between syntactically correct proofs and the engineering problems they were meant to solve.
Lead Architect & Developer at Aegis-Market
February 1, 2023 - March 1, 2026
Architected an autonomous multi-agent arbitrage and negotiation engine using FastAPI for high-throughput backend orchestration; designed strict, deterministic data validation pipelines using Pydantic, ensuring all agent inputs and outputs adhere to rigid JSON schemas before execution; developed a Retrieval-Augmented Generation (RAG) pipeline to ground agent reasoning in verified internal data with strict cost-control mechanisms; enforced operational security by designing isolated, containerized execution environments that prevent unverified agents from making autonomous state changes without human approval.
Senior Full Stack Software Engineer at Moovx
November 1, 2023 - January 1, 2025
Designed highly scalable web applications bridging modern reactive frontends (React, Vue.js) with secure, object-oriented backend APIs handling 100K+ daily requests; led architectural design for cloud migrations using strangler-fig patterns to safely transition legacy monoliths into microservices; built and maintained robust CI/CD pipelines with Terraform and Kubernetes, enforcing blue-green deployments; architected scalable stream processing and data orchestration pipelines (Apache Airflow, Golang) capable of securely processing 10M+ payloads daily.
Data Infrastructure & Systems Analyst at Freelance Tech Solutions
March 1, 2017 - December 1, 2020
Built and maintained complex distributed systems and backend workflows using Python, ensuring structural integrity for large-scale enterprise data integrations; developed full-stack features from database schema design to client-side UI; integrated automated testing frameworks and version control to reduce post-release hotfixes; created backend verifiers that autonomously monitored data quality and pipeline health.

Education

Bachelor of Science in Computer Science at University of California, Berkeley
January 11, 2030 - January 1, 2023

Industry Experience

Software & Internet, Financial Services, Professional Services