

Lennox Kuria

AI Engineer, Data Scientist, Full Stack Developer






Available to hire

Hi, I’m Lennox Kuria. I design adversarial test cases, construct deterministic evaluation rubrics, and perform surgical code audits for frontier LLM training pipelines. I thrive at the intersection of advanced data structures, high-concurrency Python architectures, and formal verification methods (Lean, TLA+), with a focus on breaking AI reasoning pathways through systematic edge-case construction and comprehensive repo-wide evaluations.

I’ve evaluated 1,000+ AI outputs across software engineering, finance, mathematics, and Web3 domains, maintaining a 94% peer-review agreement rate on complex RLHF tasks. I enjoy building robust testing infrastructure, ensuring reproducibility and safety in production pipelines, and contributing to open-source tooling for AI data generation and verification.

Skills

Experience Level: Expert (24 skills), Intermediate (1 skill)

Work Experience

Senior AI Training & Evaluation Specialist at Datacurve AI

December 1, 2025 - Present

Architect deterministic evaluation environments and generate high-fidelity RLHF datasets for frontier LLMs; evaluate 200+ AI outputs monthly across software engineering, finance, mathematics, and Web3; construct adversarial quests to expose weaknesses in repo-wide code generation and agentic workflows; conduct rigorous code QA and PR reviews for large-scale distributed systems; design evaluation rubrics for code correctness, security vulnerabilities, and maintainability across multi-file refactors and architectural migrations.

AI Systems Evaluator / RLHF Engineer at Stealth Labs

February 1, 2025 - August 1, 2025

Built deterministic Docker environments and Python-based AST verifiers to evaluate frontier AI models for production reliability; generated specialized training datasets focused on edge computing scenarios, adversarial prompt defense, and multi-agent system coordination; validated mathematical proofs using the Lean framework, assessing 200+ AI-generated proofs across number theory, graph theory, and combinatorics; identified gaps between syntactically correct proofs and engineering problems.

Lead Architect & Developer at Aegis-Market

February 1, 2023 - March 1, 2026

Architected an autonomous multi-agent arbitrage and negotiation engine using FastAPI for high-throughput backend orchestration; designed strict, deterministic data validation pipelines using Pydantic, ensuring all agent inputs and outputs adhere to rigid JSON schemas before execution; developed a Retrieval-Augmented Generation (RAG) pipeline to ground agent reasoning in verified internal data with strict cost-control mechanisms; enforced operational security by designing isolated, containerized execution environments that prevent unverified agents from making autonomous state changes without human approval.

Senior Full Stack Software Engineer at Moovx

November 1, 2023 - January 1, 2025

Designed highly scalable web applications bridging modern reactive frontends (React, Vue.js) with secure, object-oriented backend APIs handling 100K+ daily requests; led architectural design for cloud migrations using strangler-fig patterns to safely transition legacy monoliths into microservices; built and maintained robust CI/CD pipelines with Terraform and Kubernetes, enforcing blue-green deployments; architected scalable stream processing and data orchestration pipelines (Apache Airflow, Golang) capable of securely processing 10M+ payloads daily.

Data Infrastructure & Systems Analyst at Freelance Tech Solutions

March 1, 2017 - December 1, 2020

Built and maintained complex distributed systems and backend workflows using Python, ensuring structural integrity for large-scale enterprise data integrations; developed full-stack features from database schema design to client-side UI; integrated automated testing frameworks and version control to reduce post-release hotfixes; created backend verifiers that autonomously monitored data quality and pipeline health.