Hi there — I’m Francisco Luís de Avelar Cardoso
I’m a Lisbon-based data scientist and machine learning practitioner specializing in time-series forecasting, quantitative modeling, and end-to-end ML systems.
I have over two years of hands-on experience building and deploying predictive models across cryptocurrency markets and quantitative trading, with a strong focus on turning theoretically sound models into practical, production-ready solutions. My work spans classical econometrics and modern deep learning, including ARIMA-family models, LSTMs, quantile regression, and hybrid forecasting frameworks implemented in PyTorch, TensorFlow, and Statsmodels.
In industry, I’ve designed production LSTM models for multi-asset price forecasting, automated large-scale financial reporting pipelines that reduced manual reconciliation by over 90%, and built ML systems that integrate model training, evaluation, explainability, and deployment. I also developed novel forecasting evaluation metrics (FIS/CER) aimed at measuring financial usefulness rather than pure statistical accuracy, achieving significant improvements in risk-adjusted returns compared to traditional error metrics.
My academic background includes an MSc in Mathematics-Economics (ML track) and a Post-Graduate Program in AI & Machine Learning, with deep exposure to numerical methods, stochastic processes, optimization, and statistical learning. I’m particularly interested in roles that sit at the intersection of forecasting, decision-making, and real-world impact, where modeling rigor and engineering discipline matter equally.
ChurnGuard is a production-grade machine learning system for customer churn prediction, lifetime value estimation, and behavioral segmentation.
The platform includes:
Feature engineering pipelines for mixed tabular and temporal data
Gradient-boosted and deep learning models with Optuna tuning
Model tracking and versioning via MLflow
A FastAPI backend and Streamlit frontend
Containerized deployment (Docker) with cloud-ready architecture
Designed to demonstrate how ML models move from notebooks to real products, with emphasis on reproducibility, interpretability, and business impact.
A Retrieval-Augmented Generation (RAG) system for querying and summarizing real-world clinical trials in clear, patient-friendly language.
This project ingests data from ClinicalTrials.gov and combines:
Semantic retrieval using SentenceTransformers and ChromaDB
Local open-source LLMs (Gemma / Phi) for grounded text generation
A lightweight Gradio web interface for interactive exploration
Users can ask natural-language questions (e.g. “diabetes trials in Europe”) and receive concise summaries synthesized from multiple relevant studies, with filters for conditions and adjustable retrieval depth.
Designed as an applied NLP + healthcare ML system, emphasizing:
End-to-end data pipelines (API ingestion → parsing → embeddings → inference)
Practical RAG architecture
Responsible AI usage in medical contexts
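As a rough illustration of the semantic retrieval step, the sketch below ranks documents by cosine similarity to a query vector. It is not the project's code: the toy three-dimensional vectors stand in for SentenceTransformers embeddings, the plain dictionary stands in for a ChromaDB collection, and the names `cosine_similarity`, `top_k`, and `toy_index` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec: List[float],
          doc_vecs: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k document ids most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (assumed values, for illustration only).
toy_index = {
    "trial_a": [1.0, 0.0, 0.0],
    "trial_b": [0.8, 0.6, 0.0],
    "trial_c": [0.0, 0.0, 1.0],
}

hits = top_k([1.0, 0.1, 0.0], toy_index, k=2)
print(hits)
```

In a RAG setup of this kind, the text of the top-ranked documents would then be passed to the language model as context. The `k` parameter plays the role of the adjustable retrieval depth.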
A full deep learning pipeline for predicting the daily directional movement of the VIX volatility index.
The system uses:
Engineered features from SPY, QQQ, and VIX
A PyTorch LSTM backbone with an MLP classifier head
Rolling-window time series cross-validation to avoid leakage
Includes an interactive Streamlit dashboard for model diagnostics, fold-level analysis, and configuration comparison, showcasing realistic validation practices in financial ML.
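To make the rolling-window validation idea concrete, here is a minimal sketch of an index splitter in which every test window lies strictly after its training window. It is an illustration under assumed parameters, not the project's implementation: the function name `rolling_window_splits` and the `gap` and `step` arguments are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple


def rolling_window_splits(
    n_samples: int,
    train_size: int,
    test_size: int,
    step: Optional[int] = None,
    gap: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for a fixed-length rolling window.

    The test window always starts after the training window ends (plus an
    optional gap), so no future observation is used for fitting.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    step = step or test_size  # default: non-overlapping test windows
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train_idx = list(range(start, start + train_size))
        test_start = start + train_size + gap
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx
        start += step


folds = list(rolling_window_splits(n_samples=10, train_size=4, test_size=2))
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
```

A nonzero `gap` can be useful when features are built from trailing windows, since it keeps the last training rows from overlapping with the first test rows.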
An applied computer vision and multimodal learning project for automated skin lesion classification.
This project compares:
A CV-only ResNet50 model trained on dermatoscopic images
A multimodal model combining images with clinical metadata (age, sex, localization)
Includes a Streamlit dashboard for EDA, model comparison, confusion matrices, per-class F1 scores, and live predictions, highlighting the practical trade-offs between visual and tabular information in medical AI.
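For readers unfamiliar with the per-class F1 scores mentioned above, the sketch below derives them from a confusion matrix. It assumes rows are true classes and columns are predicted classes. The function name `per_class_f1` and the two-class toy matrix are hypothetical and are not results from the project.

```python
from typing import List


def per_class_f1(confusion: List[List[int]]) -> List[float]:
    """Compute F1 for each class from a square confusion matrix.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j.
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Toy counts, for illustration only.
toy_confusion = [
    [8, 2],
    [1, 9],
]
f1_scores = per_class_f1(toy_confusion)
print(f1_scores)
```

Reporting F1 per class rather than a single accuracy figure helps when classes are imbalanced, which is common in medical image datasets.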
An end-to-end quantitative finance dashboard combining deep learning, econometrics, and NLP.
This Streamlit application integrates:
LSTM + quantile regression for multi-asset price forecasting
RoBERTa-based sentiment analysis to track market psychology
Regime change detection using structural break analysis
Built to demonstrate how modern ML models can be validated, interpreted, and used in real-world trading and risk management workflows, with interactive visualizations and multi-asset support (crypto, FX, equities).
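Quantile regression is typically trained by minimizing the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The sketch below shows that loss in plain Python. It is an illustration only: the function name `pinball_loss` is hypothetical, and the dashboard's actual training code may differ.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float],
                 y_pred: Sequence[float],
                 q: float) -> float:
    """Mean pinball loss for quantile level q in (0, 1).

    Under-prediction (y > y_hat) costs q per unit of error, and
    over-prediction costs (1 - q) per unit of error.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


# For the 0.9 quantile, predicting too low is penalized more than too high.
under = pinball_loss([10.0], [8.0], q=0.9)
over = pinball_loss([10.0], [12.0], q=0.9)
print(under, over)
```

Fitting one output per quantile level (for example a low, median, and high level) gives a prediction interval rather than a single point forecast.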