I am a Senior AI Engineer and Data Scientist with over six years of experience building and deploying production-grade artificial intelligence systems. Currently, I serve as a Senior Data Scientist at reallytics.ai, where I architect enterprise-grade RAG pipelines and Agentic AI frameworks using LangChain, FAISS, and ChromaDB. My career is defined by a "bias for action," moving rapidly from high-level architecture to scalable, cloud-native implementations that drive measurable business impact.

SAMI KHAN

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Intermediate

Work Experience

Senior Data Scientist at REALLYTICS.AI
September 1, 2024 - Present
Designing and deploying production RAG pipelines using LangChain, FAISS, and ChromaDB for context-aware enterprise knowledge retrieval across structured and unstructured data. Building GenAI-powered automation systems using OpenAI, Claude, and LLaMA APIs, integrated into client workflows for intelligent document processing and decision support. Fine-tuning open-source LLMs (LLaMA-2, Mistral) using LoRA, QLoRA, and PEFT via Hugging Face Transformers for domain-specific client applications. Architecting cloud-native ML infrastructure on AWS: SageMaker for training, Lambda for inference, S3 for data lakes, ECS/ECR for containerized model serving. Developing predictive modeling solutions using causal inference, Bayesian statistics, and neural networks for client forecasting and optimization use cases. Serving fine-tuned models at scale using vLLM with optimized GPU utilization via CUDA.
Lead AI & ML Engineer at THE GENIUS GROUP (TGG)
April 1, 2021 - September 30, 2022
Spearheaded development of a Generative AI Assistant based on LLaMA-2 7B, fine-tuned with custom data using LoRA and PEFT, resulting in a 45% reduction in customer support ticket volume. Designed and deployed an invoice parsing system integrating OCR, NLP, and image processing, reducing manual data entry time by 50%. Led AI model optimization using quantization, LoRA, and PEFT, significantly improving inference speed and reducing deployment costs. Built end-to-end ML pipelines on AWS and Azure: data collection, preprocessing, model training, and production deployment. Developed cloud-native services with AWS EC2, S3, RDS, and Lambda for real-time ML model inference and data processing workflows. Integrated ML models into customer-facing applications, improving response time and user satisfaction by 35%. Automated NLP tasks (tokenization, lemmatization, sentiment analysis) at scale across multiple platforms. Collaborated with data engineers and product managers to define AI-driven product ro
Data Scientist at IBM (VIA VERTICITI)
April 1, 2020 - February 28, 2021
Developed a recommendation system using BERT embeddings and cosine similarity, improving data matching accuracy by 50% and automating manual processes. Engineered a CNN-based wildlife detection system using thermal images and TensorFlow for real-time invasive species identification. Utilized PySpark for large-scale data processing, optimizing workflows in big data environments. Built ML models for fraud detection, customer segmentation, and predictive maintenance, improving decision-making accuracy across business units. Implemented multilingual NLP solutions using PyTorch and spaCy, reducing translation costs by 30%. Designed and deployed an NLP chatbot automating customer service inquiries, reducing human labor by 40%. Led migration of a cloud-based translation system to on-premise infrastructure, improving data privacy and cost-efficiency by 30%. Optimized data pipelines integrating SQL and NoSQL sources, reducing data latency by 25%.
Machine Learning Engineer at DATAONMATRIX
April 1, 2020 - February 28, 2021
Led implementation of an AI-powered chatbot using BERT Transformers, improving user interaction by 40%. Built a recommendation engine using cosine similarity and BERT embeddings, reducing manual effort by 50%. Developed and optimized data pipelines for model training and real-time deployment on AWS and Azure. Applied PySpark and scikit-learn to analyze large datasets for predictive analytics and classification. Integrated cloud-based solutions (AWS, Azure) for scalable model deployment and real-time inference.

Education

Data Mining, Artificial Intelligence at University of Illinois Urbana-Champaign (UIUC)
January 1, 2023 - December 31, 2024
Bachelor of Science, Computer Science at COMSATS Institute of Information Technology
January 1, 2023 - December 31, 2026

Qualifications

IBM Machine Learning Specialization
January 11, 2030 - April 2, 2026
HCIA-Cloud Computing
January 11, 2030 - April 2, 2026
AWS Cloud Solution Architect
January 11, 2030 - April 2, 2026

Industry Experience

Software & Internet, Professional Services, Computers & Electronics, Education