Hi, I’m Firas Tlili Gafsa, an AI and machine learning engineer based in Gabes, Tunisia.
With 7+ years of experience developing cutting-edge deep learning models, ML pipelines, and end-to-end AI solutions across medical imaging, agriculture, and multimedia, I bring hands-on expertise in Python, PyTorch, TensorFlow, OpenCV, and cloud platforms (AWS, Google Cloud, Azure).
I thrive in cross-functional teams and hybrid academic-industry settings, delivering scalable, production-ready systems using Docker, Kubernetes, MLflow, and DVC. My work spans computer vision, natural language processing, and generative AI, and I enjoy turning complex problems into tangible impact—from automating medical image analysis to enabling real-time video analytics for sports and security.
Project Description
End-to-End Kidney Disease Classification with MLflow, DVC, and Cloud Deployment is a fully productionized machine learning pipeline designed to deliver accurate and scalable predictions for kidney disease diagnosis.
The project implements a complete ML lifecycle, starting from data ingestion and preprocessing to model training, evaluation, and deployment. Using Scikit-learn and Pandas, the system processes clinical data to build a reliable classification model capable of identifying kidney disease patterns with high consistency.
A key focus of the project is reproducibility and experiment management. MLflow is used to track experiments, log model parameters, and compare performance metrics, enabling systematic model optimization. DVC (Data Version Control) ensures proper versioning of datasets and pipeline stages, allowing seamless collaboration and reproducible workflows across different environments.
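The comparison step that MLflow automates reduces to a simple selection over logged runs. A minimal pure-Python sketch of that logic (the `runs` list and `best_run` helper are illustrative stand-ins, not MLflow's API):

```python
# Illustrative sketch: selecting the best experiment run by a logged metric.
# A plain list of dicts stands in for the MLflow tracking store.

def best_run(runs, metric="f1", higher_is_better=True):
    """Return the run whose logged metric is best."""
    key = lambda r: r["metrics"][metric]
    return max(runs, key=key) if higher_is_better else min(runs, key=key)

runs = [
    {"params": {"max_depth": 4},  "metrics": {"f1": 0.91}},
    {"params": {"max_depth": 8},  "metrics": {"f1": 0.95}},
    {"params": {"max_depth": 16}, "metrics": {"f1": 0.93}},
]

winner = best_run(runs)
print(winner["params"])  # → {'max_depth': 8}
```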
The deployment pipeline is fully automated using GitHub Actions, enabling continuous integration and delivery (CI/CD). The application is containerized with Docker and deployed on AWS infrastructure, specifically EC2 for compute and ECR for container registry management, ensuring scalability and reliability in production.
A lightweight Flask API serves the trained model, allowing real-time inference through REST endpoints. This makes the system easily accessible for integration into external applications or healthcare platforms.
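Stripped of the web framework, the endpoint logic is: parse the JSON request body, run the model, return a JSON verdict. A framework-agnostic sketch (the handler name, feature key, threshold, and stand-in model are hypothetical; the real service wraps the trained classifier):

```python
import json

# Framework-agnostic sketch of a /predict endpoint's core logic.

def stand_in_model(features):
    # Hypothetical stand-in for the trained classifier: flags high creatinine.
    return "ckd" if features.get("serum_creatinine", 0.0) > 1.3 else "notckd"

def predict_handler(raw_body: str) -> str:
    """Parse a JSON request body, run inference, return a JSON response."""
    features = json.loads(raw_body)
    label = stand_in_model(features)
    return json.dumps({"prediction": label})

print(predict_handler('{"serum_creatinine": 2.1}'))  # → {"prediction": "ckd"}
```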
Security and infrastructure management are handled through AWS IAM, ensuring controlled access and safe deployment practices. The modular design of the pipeline allows easy updates, retraining, and scaling as new data becomes available.
Overall, this project demonstrates how modern MLOps tools and cloud technologies can be combined to build a robust, reproducible, and scalable machine learning system for healthcare applications.
Project Description
Multimodal AI Agent for Enhanced Content Understanding is an advanced Retrieval-Augmented Generation (RAG) system designed to enable intelligent interaction with both textual and visual data. The project integrates Large Language Models (LLMs) and Vision-Language Models (VLMs) to provide a unified solution for understanding complex, multimodal content.
The system leverages LlamaIndex to orchestrate data ingestion, indexing, and retrieval across multiple document types, including PDFs, PowerPoint presentations, and images. NVIDIA NIM microservices are used to efficiently serve optimized AI models, ensuring high-performance inference for both text and visual tasks.
At its core, the platform uses Milvus as a high-performance vector database to store and retrieve embeddings for semantic search. This enables real-time querying of large-scale multimodal datasets, where user inputs are matched with the most relevant contextual information before being processed by the underlying models.
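The retrieval step a vector database performs at scale can be sketched in a few lines: rank stored embeddings by cosine similarity to the query embedding. The toy 3-d vectors and file names below are illustrative, not real model embeddings:

```python
import math

# Minimal sketch of semantic search over stored embeddings, the operation
# Milvus performs efficiently at scale.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

index = {
    "chart_summary.png": [0.9, 0.1, 0.0],
    "quarterly_report.pdf": [0.2, 0.8, 0.1],
    "team_photo.jpg": [0.0, 0.1, 0.9],
}

def top_k(query_vec, k=1):
    ranked = sorted(index, key=lambda doc: cosine(query_vec, index[doc]), reverse=True)
    return ranked[:k]

print(top_k([0.85, 0.15, 0.05]))  # → ['chart_summary.png']
```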
A key feature of the system is its ability to interpret and analyze visual data alongside text. By incorporating models like DePlot, the agent can extract structured insights from images such as charts, diagrams, and visual documents, enhancing the depth and accuracy of responses.
The user interface is built with Streamlit, providing an interactive chat-based experience where users can upload documents or images and ask natural language questions. The system responds with context-aware answers that combine textual understanding and visual reasoning.
Designed with modularity and scalability in mind, the architecture supports seamless integration of additional models and data sources. This makes it adaptable for various use cases, including document analysis, business intelligence, research assistance, and knowledge management.
Overall, this project demonstrates the power of multimodal AI systems in bridging the gap between text and visual information, delivering a more comprehensive and intuitive approach to content understanding.
Project Description
End-to-End Medical Chatbot with LLMs, LangChain, Pinecone, and LLMOps is a production-ready intelligent assistant designed to deliver accurate, context-aware medical information through a scalable and secure architecture.
The system is built around a Retrieval-Augmented Generation (RAG) pipeline, combining Large Language Models (LLMs) with a vector database to ensure responses are grounded in reliable medical data. User queries are first processed and enriched using LangChain, then matched against indexed medical knowledge stored in Pinecone. Relevant context is retrieved and passed to the LLM to generate precise and informative answers.
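With retrieval stubbed out, the grounding step of that RAG flow looks like this: take the top-k passages by relevance score and assemble them into the prompt sent to the LLM. The passage store, scores, and prompt template below are illustrative:

```python
# Sketch of the RAG grounding step: retrieve top-k passages, build the prompt.

knowledge = [
    ("Metformin is a first-line therapy for type 2 diabetes.", 0.92),
    ("Aspirin inhibits platelet aggregation.", 0.31),
]

def build_prompt(question, passages, k=1):
    top = sorted(passages, key=lambda p: p[1], reverse=True)[:k]
    context = "\n".join(text for text, _ in top)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."

prompt = build_prompt("What is a first-line therapy for type 2 diabetes?", knowledge)
print("Metformin" in prompt)  # → True
```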
A strong emphasis is placed on clinical relevance, safety, and privacy. The chatbot integrates validation layers to filter and verify outputs, reducing the risk of misleading or unsafe medical advice. This makes the system suitable for healthcare-related applications where trust and accuracy are critical.
The backend is implemented using Flask, providing lightweight and efficient API endpoints for real-time interaction. The application is containerized with Docker and deployed on AWS EC2, ensuring scalability and reliability. Continuous integration and deployment (CI/CD) pipelines are managed through GitHub Actions, enabling automated testing and streamlined updates.
The system is designed following LLMOps best practices, including modular architecture, reproducible workflows, and efficient model integration. It supports real-time query handling while maintaining performance and responsiveness.
Overall, this project demonstrates how modern LLM technologies can be combined with retrieval systems and cloud infrastructure to build a robust, production-grade medical chatbot capable of delivering trustworthy and contextually accurate health information.
Project Description
AI Football Video Analysis System is a comprehensive computer vision solution designed to extract real-time insights from football match footage. The system combines advanced object detection, tracking, and spatial analysis techniques to monitor player behavior, estimate ball possession, and generate meaningful performance analytics.
At its core, the system uses a YOLOv11 model (Ultralytics) to accurately detect players, referees, and the ball in each frame. These detections are enhanced with tracking and motion analysis techniques to maintain consistent identities and capture dynamic interactions throughout the match.
To enable deeper analysis, the project integrates KMeans clustering to group players by team based on visual features such as jersey color. Optical flow is used to estimate movement patterns and player trajectories, providing insights into speed, positioning, and overall activity on the field.
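The team-assignment idea can be shown with a toy 2-means clustering over mean jersey colors (RGB); scikit-learn's KMeans does this robustly on real pixel features, and the colors below are illustrative:

```python
# Toy 2-means clustering on mean jersey colors, splitting players into two teams.

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, iters=10):
    c0, c1 = points[0], points[1]          # naive initialization
    for _ in range(iters):
        g0 = [p for p in points if dist2(p, c0) <= dist2(p, c1)]
        g1 = [p for p in points if dist2(p, c0) > dist2(p, c1)]
        c0 = tuple(sum(v) / len(g0) for v in zip(*g0))
        c1 = tuple(sum(v) / len(g1) for v in zip(*g1))
    return [0 if dist2(p, c0) <= dist2(p, c1) else 1 for p in points]

jerseys = [(220, 30, 30), (210, 40, 35), (25, 40, 200), (30, 35, 210)]
print(two_means(jerseys))  # → [0, 0, 1, 1]  (red team vs blue team)
```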
A key component of the system is perspective transformation, which maps the broadcast camera view into a top-down representation of the pitch. This allows for more accurate spatial reasoning, including player positioning, distance measurements, and zone-based analysis.
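At its heart, that transformation applies a 3x3 homography to image points. OpenCV's `cv2.getPerspectiveTransform` and `cv2.perspectiveTransform` handle this in practice; the matrix below is a simple illustrative scaling homography, not one estimated from real footage:

```python
# Applying a 3x3 homography H to an image point — the core operation of the
# camera-to-pitch perspective transform.

def apply_homography(H, point):
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)

# Hypothetical H: scale x by 0.1 and y by 0.05 (pixels → metres on the pitch plane).
H = [[0.1, 0.0,  0.0],
     [0.0, 0.05, 0.0],
     [0.0, 0.0,  1.0]]

print(apply_homography(H, (1050, 680)))  # → (105.0, 34.0)
```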
The system generates real-time analytics such as ball possession statistics, player movement tracking, and match dynamics visualization. Results can be visualized through overlays on the video as well as plotted using data analysis libraries like Matplotlib and Pandas.
Developed in Python using tools such as OpenCV, NumPy, and Scikit-learn, the project is structured for flexibility and experimentation, with development carried out in environments like Jupyter Notebook and VS Code. Version control is maintained using Git for reproducibility and collaboration.
Overall, this project demonstrates how modern AI and computer vision techniques can transform raw sports footage into actionable insights, enabling performance analysis, tactical evaluation, and enhanced understanding of the game.
Project Description
Detection Transformers Fine-Tuning for Custom Object Detection is an end-to-end deep learning project focused on building a high-precision bone fracture detection system using Transformer-based architectures.
The project leverages DETR (Detection Transformer) models, fine-tuned on a custom dataset of approximately 1,200 COCO-formatted X-ray images. The system is trained to detect and classify five distinct types of bone fractures, enabling accurate and automated analysis of medical imaging data.
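Loading such a dataset starts with parsing the COCO annotation file and grouping boxes by image. A minimal sketch (the file contents and category names are illustrative, not the project's actual labels):

```python
import json

# Sketch of reading a COCO-format annotation file and grouping labels by image.

coco = json.loads("""{
  "images": [{"id": 1, "file_name": "xray_001.png"}],
  "annotations": [
    {"image_id": 1, "category_id": 3, "bbox": [48, 60, 120, 40]},
    {"image_id": 1, "category_id": 1, "bbox": [200, 90, 80, 30]}
  ],
  "categories": [{"id": 1, "name": "fracture_type_a"},
                 {"id": 3, "name": "fracture_type_c"}]
}""")

names = {c["id"]: c["name"] for c in coco["categories"]}
by_image = {}
for ann in coco["annotations"]:
    by_image.setdefault(ann["image_id"], []).append(names[ann["category_id"]])

print(by_image)  # → {1: ['fracture_type_c', 'fracture_type_a']}
```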
Built with PyTorch Lightning and Hugging Face Transformers, the training pipeline is modular, scalable, and optimized for experimentation. It integrates data preprocessing, model training, validation, and evaluation into a streamlined workflow, ensuring reproducibility and efficient development cycles.
To enhance usability and interpretability, the system incorporates OpenCV and Supervision for visual validation. Predictions are displayed alongside ground truth annotations, allowing side-by-side comparison that supports clinical insight and model transparency.
The model achieves over 90% precision while maintaining fast inference speeds (under 50ms per image), making it suitable for real-time or near-real-time diagnostic support applications. Visualization tools, including Matplotlib, are used to monitor training performance and analyze results.
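For context, detection precision is typically scored by counting a prediction as a true positive when its IoU with a ground-truth box exceeds a threshold. A sketch of that metric (boxes are `(x1, y1, x2, y2)` and the example values are illustrative):

```python
# IoU-thresholded precision, the standard way detection quality is scored.

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def precision(preds, truths, thr=0.5):
    tp = sum(1 for p in preds if any(iou(p, t) >= thr for t in truths))
    return tp / len(preds)

truths = [(10, 10, 50, 50)]
preds = [(12, 12, 52, 52), (200, 200, 240, 240)]   # one good hit, one miss
print(precision(preds, truths))  # → 0.5
```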
The dataset preparation and annotation process follows the COCO standard, ensuring compatibility with modern detection frameworks and tools such as Roboflow. Development and experimentation are conducted in environments like Google Colab and Jupyter Notebook, with version control managed through Git.
Overall, this project demonstrates the effectiveness of Transformer-based object detection models in the medical imaging domain, delivering a reliable and efficient solution for automated fracture detection with strong potential for real-world clinical applications.
Project Description
Fine-Tuning Large Language Models (LLMs) Efficiently with Unsloth + LoRA is a high-performance pipeline designed to enable scalable and cost-effective customization of multi-billion parameter language models using limited computational resources.
This project focuses on parameter-efficient fine-tuning techniques, combining Unsloth optimization with LoRA (Low-Rank Adaptation) to significantly reduce memory usage and training time. By leveraging 4-bit quantization and mixed precision, the pipeline allows large models to be fine-tuned on a single GPU without compromising performance.
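The arithmetic behind LoRA's savings is easy to verify: instead of updating a full `d_out × d_in` weight matrix, it trains two low-rank factors `B (d_out × r)` and `A (r × d_in)` and adds a scaled `B @ A` to the frozen weights. The layer sizes below are illustrative, far smaller than a multi-billion-parameter model:

```python
# Counting trainable parameters for one layer: dense fine-tuning vs a LoRA adapter.

d_out, d_in, r = 4096, 4096, 16

full_update_params = d_out * d_in           # dense fine-tuning of the layer
lora_params = d_out * r + r * d_in          # LoRA adapter for the same layer

print(full_update_params)                   # → 16777216
print(lora_params)                          # → 131072
print(full_update_params // lora_params)    # → 128  (128x fewer trainable params)
```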
The workflow covers the entire fine-tuning lifecycle. It begins with structured dataset preparation in chat format, ensuring compatibility with conversational models. Training is performed using Hugging Face Transformers and TRL’s SFTTrainer, integrated with PEFT for efficient adaptation of model weights. The pipeline is optimized for stability and speed, making it suitable for rapid experimentation and iteration.
In addition to training, the project includes evaluation and validation steps to assess model performance, as well as flexible export options for deployment across different environments. Models can be saved and reused in multiple formats, supporting integration into real-world applications.
Designed with reproducibility and modularity in mind, the pipeline is implemented using Python and runs seamlessly in environments like Google Colab and Jupyter Notebook. Version control with Git ensures organized experimentation and collaboration.
Overall, this project demonstrates how modern optimization techniques can make advanced LLM fine-tuning accessible, enabling developers to adapt powerful language models efficiently for domain-specific tasks without requiring extensive hardware infrastructure.
Project Description
Real-Time Human Activity Recognition Video Data Annotation Tool is a professional desktop application designed to streamline the creation of high-quality annotated datasets for human activity recognition tasks in security and surveillance domains.
Built using PyQt5, the application provides an intuitive graphical interface that enables users to annotate multi-person video data efficiently in real time. It integrates YOLOv11 pose estimation models with OpenCV to detect and track human body keypoints across frames, allowing precise activity labeling with minimal manual effort.
The system is powered by a scalable, multi-threaded video processing pipeline that ensures smooth performance even with high-resolution or long-duration video streams. This architecture separates video decoding, model inference, and UI rendering, resulting in responsive and efficient annotation workflows.
A key strength of the tool lies in its robust data management capabilities. It supports COCO-compliant annotation structures, ensuring compatibility with modern deep learning pipelines. Automatic backup mechanisms safeguard project data, while a modular design allows easy extension and maintenance.
To maximize usability across different machine learning workflows, the tool offers multi-format export options, including YOLO, Pascal VOC (XML), and CSV. This flexibility enables seamless integration with various training frameworks and platforms such as Ultralytics and Roboflow.
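The format differences the exporter bridges come down to box conventions. A sketch converting one COCO-style box `[x, y, w, h]` into the Pascal VOC and YOLO conventions (the example numbers are illustrative):

```python
# Converting a COCO-style box into Pascal VOC corners and normalized YOLO form.

def coco_to_voc(box):
    x, y, w, h = box
    return (x, y, x + w, y + h)            # (xmin, ymin, xmax, ymax)

def coco_to_yolo(box, img_w, img_h):
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2          # box center
    return (cx / img_w, cy / img_h, w / img_w, h / img_h)  # normalized to [0, 1]

box = [100, 40, 200, 120]                  # COCO: top-left corner + width/height
print(coco_to_voc(box))                    # → (100, 40, 300, 160)
print(tuple(round(v, 4) for v in coco_to_yolo(box, 640, 480)))
# → (0.3125, 0.2083, 0.3125, 0.25)
```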
The application also features a complete project-based workflow, allowing users to organize datasets, manage annotations, and maintain consistency across large-scale labeling tasks. By combining automation with user control, it significantly reduces the time and effort required to build training datasets for activity recognition models.
Overall, this project demonstrates a practical and scalable solution for real-time video annotation, bridging the gap between raw video data and structured datasets required for advanced human activity recognition systems.
Project Description
Apple Grading System is an end-to-end intelligent solution for automated apple quality assessment before harvest. It leverages computer vision and deep learning to detect apples in images, classify their health condition, extract visual features, and assign standardized quality grades through a user-friendly web interface.
The system follows a multi-stage pipeline where input images are processed using a fine-tuned YOLOv8 model for apple detection. Each detected apple is cropped and passed to a ResNet18-based classifier to identify diseases such as blotch, rot, and scab, or determine if the fruit is healthy. Additional image processing techniques are applied to extract key visual features like color quality and size estimation.
A rule-based grading engine combines classification results with extracted features (e.g., redness, uniformity, and size) to assign grades (A, B, or C), reflecting the overall quality of each apple. The final output includes annotated images, detailed per-apple analysis, and aggregated grading summaries.
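The grading rules can be sketched as a short decision function. The thresholds and feature names below are hypothetical; the real engine is tuned to the project's feature extractors:

```python
# Sketch of the rule-based grading engine: classifier verdict + visual features → grade.

def grade_apple(condition, redness, size_mm):
    if condition != "healthy":
        return "C"                       # any detected disease caps the grade
    if redness >= 0.8 and size_mm >= 70:
        return "A"                       # strong color and good size
    return "B"

print(grade_apple("healthy", 0.9, 75))   # → A
print(grade_apple("healthy", 0.6, 75))   # → B
print(grade_apple("scab", 0.9, 80))      # → C
```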
The backend is built with FastAPI, providing RESTful endpoints for single and batch image analysis, while efficiently handling asynchronous uploads and model inference. Results are stored in a SQLite database using SQLAlchemy, enabling persistent history tracking and quick retrieval of past analyses.
On the frontend, a React-based single-page application offers an interactive experience with features such as drag-and-drop uploads, batch processing, visual result dashboards, history management, and PDF export capabilities.
The system is designed for scalability and deployment flexibility, with trained models exported in multiple formats including PyTorch, ONNX, TensorFlow, and TFLite for cross-platform compatibility.
Overall, this project demonstrates a complete AI-powered pipeline that integrates machine learning, backend services, and modern frontend development to deliver a practical agricultural quality control solution.