Joseph Lord

Available to hire

As a senior AI/ML engineer, I bring a decade of experience designing and deploying intelligent systems across finance, telecom, healthcare, and education. I specialize in decision-focused AI solutions, including retrieval-augmented generation (RAG) pipelines, LLM integration, and multi-agent systems. I thrive on building scalable architectures from data ingestion to cloud deployment (AWS, GCP), and I’ve consistently reduced research time, improved governance, and accelerated engineering workflows.

I aim to align cutting-edge AI models with real-world business needs, ensuring security, accuracy, and measurable impact. I collaborate closely with stakeholders to deliver robust, production-grade systems and tackle complex data challenges—from secure on-prem/offline deployments to monitoring and cost optimization—always prioritizing practicality, ethics, and governance.

Experience Level

Expert

Language

English
Fluent

Work Experience

Senior AI/ML Engineer at Contextual AI
February 1, 2024 - Present
Led the design and deployment of a fully on-prem, air-gapped RAG chatbot for Qualcomm using Llama-3-70B-Instruct served via vLLM on NVIDIA A100 GPUs for high-throughput, low-latency inference. Built offline document ingestion (PDF/HTML) with PyMuPDF and BeautifulSoup, and tuned semantic chunking for secure processing of proprietary data. Exported the embedding model to ONNX, optimized it with TensorRT, and indexed the resulting embeddings in Weaviate (HNSW vector search and BM25 keyword search) for hybrid retrieval. Added cross-encoder reranking (TensorRT-optimized) to improve retrieval precision before LLM synthesis. Orchestrated retrieval, reranking, prompt assembly, and citation injection using LangChain, with strict context-window controls to prevent data leakage. Implemented multi-turn conversation state using Redis, enabling contextual follow-ups in an internal-only environment. Deployed and operated the system with Docker and Kubernetes in an air-gapped cluster, with Prometheus and Grafana monitoring for GPU, latency, and throughput.
Senior Machine Learning Engineer at Western Governors University
April 1, 2022 - February 1, 2024
CoachDesk - Multi-Agent AI System Prototype: Set up AI agents for tutor, grader, and coach roles using GPT-4 APIs, LangChain, and Python, enabling automated educational tasks. Created mock student data and small databases with pandas and SQLite to test personalization, leading to more tailored interactions. Wrote prompts and tested conversation flows in Jupyter notebooks to ensure seamless collaboration among agents. Built simple inter-agent connections with FastAPI and Redis to enhance system integration. Packaged prototypes with Docker and shared demos via GitHub Actions and AWS Lambda. Tracked performance with Prometheus and Grafana, logging results to identify AI performance issues and improve reliability.
Machine Learning Engineer at Kasisto
September 1, 2020 - March 1, 2022
AI-Powered Hackathon Assistant: Built a retrieval-augmented chatbot using embeddings and GPT-3 to provide real-time suggestions for project descriptions, titles, and tags. Implemented FastAPI to connect the chatbot with a retrieval system for fetching hackathon FAQs and guidelines, improving query relevance. Integrated sentence embeddings for similarity search to retrieve the most relevant content, ensuring responses were specific and contextually accurate. Leveraged Hugging Face Transformers to fine-tune LLMs for generating hackathon announcements, FAQs, and project summaries. Used MongoDB to store user profiles, chatbot interaction logs, and submission data for real-time retrieval and personalization. Deployed the chatbot and RAG systems using AWS Lambda for serverless execution, containerized with Docker for AWS ECS, enabling thousands of concurrent users. Monitored RAG performance with Elasticsearch, tracking latency, retrieval accuracy, and response coherence; optimized MongoDB qu
Machine Learning Engineer at NarrativeDx
July 1, 2018 - August 1, 2020
Designed and deployed AI-driven chatbots using Microsoft Bot Framework to automate customer interactions and support workflows. Built NLP pipelines for intent detection, entity recognition, and sentiment analysis with spaCy and NLTK. Enhanced language understanding with Word2Vec and BERT, and added multilingual support via Google Translate API. Built RESTful APIs using Django to handle chatbot requests and backend communication, and integrated Django with PostgreSQL using SQLAlchemy. Deployed the Django backend on Google Kubernetes Engine (GKE) for scalability, and integrated with Amazon S3 for file storage during chatbot interactions. Implemented Transformer models using PyTorch to improve accuracy and efficiency.
Computer Vision Engineer at BairesDev
January 1, 2015 - June 1, 2018
Designed ML models for disease prediction and medical diagnostics, developing CNNs with transfer learning (AlexNet, VGG16) to improve diagnostic accuracy in pilot tests. Built NLP pipelines with spaCy, Word2Vec, and NLTK to analyze clinical notes and extract structured medical entities. Implemented Faster R-CNN for object detection in medical images, enhancing early detection capabilities.
Machine Learning Intern at Pinewood Analytics
January 1, 2014 - December 1, 2014
Developed image classification and object detection models using MATLAB, ImageJ, and CNN frameworks. Preprocessed and annotated medical imaging datasets with Python tools and optimized model performance using PCA and hyperparameter tuning. Built and deployed a prototype system that improved biomarker detection accuracy, reducing false negatives in medical imaging.

Education

M.S. in Computer Science (AI/ML focus) at Rice University
January 1, 2014
B.S. in Computer Science at Rice University
January 1, 2012

Qualifications

AWS Certified Cloud Practitioner
January 5, 2026
Cisco Certified Cyber Ops Associate
January 5, 2026

Industry Experience

Telecommunications, Financial Services, Healthcare, Education, Software & Internet