I'm Ahmad Hayes, a Senior Machine Learning Engineer specializing in Generative AI, LLMs, and NLP. I design and deploy scalable AI solutions for enterprise use, building end-to-end ML pipelines, RAG workflows, and vector-based retrieval systems. With 10+ years of experience across data engineering, model development, and production-grade deployments on AWS and GCP, I focus on performance profiling, GPU optimization, and reliable, cloud-native inference using FastAPI and container orchestration. I enjoy turning research into production-ready GenAI applications that solve real-world problems.

Ahmad Hayes

Available to hire

Work Experience

Senior Machine Learning Engineer at Klarity Labs
January 1, 2021 - Present
Designed and deployed end-to-end Generative AI systems using transformer-based LLMs for enterprise NLP use cases, including conversational AI, summarization, and semantic search. Built and scaled Retrieval-Augmented Generation pipelines with FAISS and Pinecone for real-time contextual search. Led development of intelligent document understanding with entity extraction and OCR pipelines. Built production-grade ML APIs with FastAPI and containerized services with Docker and Kubernetes. Created automated pipelines for prompt engineering, model evaluation, and multi-run experiment tracking using MLflow and Weights & Biases. Explored LoRA/PEFT for parameter-efficient fine-tuning. Integrated LangChain for modular LLM apps. Established internal A/B testing and latency benchmarking. Used Kineto traces and the PyTorch dispatcher to profile kernels and optimize GPU runtime. Collaborated with MLOps to deploy on AWS/GCP.
Senior Machine Learning Engineer at CognitiveScale
December 1, 2020 - October 24, 2025
Led multilingual NLP systems for sentiment analysis, text classification, and information extraction using BERT and XLNet. Deployed models via REST APIs using Flask and TensorFlow Serving for real-time predictions. Engineered pipelines for time-series forecasting and anomaly detection to monitor KPIs. Built transfer learning workflows to reduce data requirements and improve cross-domain generalization. Implemented automated ETL and model orchestration with Apache Airflow and Docker. Designed modular ML pipelines with feature extraction, training, hyperparameter tuning, and drift-based retraining. Partnered with product and engineering teams to define problems, metrics, and evaluation plans. Implemented data labeling and active learning to boost annotation throughput. Promoted ML best practices and reproducible pipelines.
Machine Learning Engineer at Mavericks United
June 1, 2019 - October 24, 2025
Led multilingual NLP for sentiment analysis, text classification, and information extraction using modern models. Integrated pre-trained models like BERT and XLNet into customer-facing applications. Engineered pipelines for time-series forecasting and anomaly detection to predict operational KPIs. Developed custom transfer learning workflows to reduce training data requirements and boost generalization across domains. Deployed models via REST APIs using Flask and TensorFlow Serving for real-time predictions. Created automated ETL workflows and model orchestration pipelines using Apache Airflow and Docker for reproducibility and scalability. Designed modular ML pipelines with feature extraction, training, hyperparameter tuning, and automated retraining based on drift detection. Partnered with product and engineering teams to define ML problem statements, success metrics, and iterative experiment cycles. Implemented data labeling and active learning strategies to optimize dataset quality.

Education

Master of Science in Computer Science at Preston University
January 11, 2030 - October 24, 2025

Industry Experience

Software & Internet, Professional Services