I am a data scientist (6 years of experience, 5 years in RWE) and a researcher in ageing and age-related diseases (13 years). I hold a PhD and have designed 13 and executed 26 data science/epidemiological projects, and authored 16+ research papers.
I have led teams including a small group of 2 data scientists, supervised MSc dissertations for 4 students who earned excellent grades, and collaborated with researchers across multiple teams. My technical expertise spans classification and regression, survival and time-series analyses, and deep learning (CNN, GAN, RNN).
Networked Data Lab: NDL North West London
Impact of being an unpaid carer on health conditions and healthcare access in North West London
Project Description
The Networked Data Lab (NDL) is a pioneering collaborative network of analysts who use linked data, open analytics, and public and patient involvement to tackle the most pressing challenges in health and social care. The initiative is led by The Health Foundation, working closely with five partner labs across the UK. The North West London Networked Data Lab (NWL NDL) is a partnership between Imperial College Health Partners (ICHP), North West London Health and Care Partnership, Imperial College’s School of Public Health, and the Institute of Global Health Innovation (IGHI).
The overarching aim of the NDL is to improve health and care services, and reduce health inequalities in the UK, with the current project specifically aiming to understand the needs, health issues and pathways to services of unpaid carers.
The aims of the study are to:
Explore the demographic profiles of unpaid carers as well as their geographical distribution in North West London
Estimate the effect of being a carer on health-related metrics and the risk of developing various long-term conditions
Analyse how the COVID-19 pandemic affects access of unpaid carers to healthcare services
To achieve these aims, we extracted healthcare data for unpaid carers, identified through a list of SNOMED codes, from the Discover dataset. Discover is a de-identified dataset containing linked, coded primary care, secondary care, acute, mental health, community health and social care records for over 2.5 million patients who live in, and are registered with a GP in, North West London [4]. We also created a matched cohort based on gender, age, Index of Multiple Deprivation (IMD) and ethnicity to use as a control population for comparisons. The matched cohort contains professional carers; however, for brevity, in this study we refer to unpaid carers as carers and to the matched population as non-carers.
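The matching step described above can be illustrated with a minimal sketch. The project description does not state the matching algorithm, so this example assumes exact 1:1 matching without replacement; the record fields (`id`, `age_band`, `imd_decile`) and the function name `match_controls` are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical matching keys; the study matched on gender, age, IMD and ethnicity.
MATCH_KEYS = ("gender", "age_band", "imd_decile", "ethnicity")

def match_controls(carers, candidates, keys=MATCH_KEYS, ratio=1, seed=0):
    """Exact matching without replacement.

    For each carer, draw `ratio` candidates that share the same values
    for every key. Carers without enough candidates stay unmatched.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in candidates:
        pool[tuple(person[k] for k in keys)].append(person)
    for stratum in pool.values():
        rng.shuffle(stratum)  # random draw order within each stratum

    matches = {}
    for carer in carers:
        stratum = pool[tuple(carer[k] for k in keys)]
        if len(stratum) >= ratio:
            matches[carer["id"]] = [stratum.pop()["id"] for _ in range(ratio)]
    return matches

# Toy records (invented for illustration, not study data).
carers = [
    {"id": "c1", "gender": "F", "age_band": "50-59", "imd_decile": 3, "ethnicity": "Asian"},
    {"id": "c2", "gender": "M", "age_band": "60-69", "imd_decile": 5, "ethnicity": "White"},
]
candidates = [
    {"id": "n1", "gender": "F", "age_band": "50-59", "imd_decile": 3, "ethnicity": "Asian"},
    {"id": "n2", "gender": "F", "age_band": "50-59", "imd_decile": 3, "ethnicity": "Asian"},
    {"id": "n3", "gender": "M", "age_band": "40-49", "imd_decile": 5, "ethnicity": "White"},
]
matches = match_controls(carers, candidates)
print(matches)
```

In this toy example `c2` has no candidate in the same stratum and is left unmatched; a real analysis would need a rule for such cases (for example, widening the age band).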
Project Description
This repository contains the code for forecasting the number of weekly endoscopies (and the points reflecting the staff capacity required for endoscopy) for the next 5 years, based on up to 10 years of historical data obtained from 6 providers in North West London. Three different approaches were used for the modelling: Prophet, SARIMA and exponential smoothing. The forecasts from the best models for the chosen sites/procedures were combined to predict the demand for each provider or for the whole of North West London.
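The "pick the best model per site, then combine" idea can be sketched as follows. This is not the repository's code: to stay self-contained it replaces Prophet and SARIMA with two simple stand-in forecasters (a seasonal naive model and simple exponential smoothing), and it assumes that "best" means the lowest mean absolute error on a holdout window. All function names and the toy series are hypothetical.

```python
def seasonal_naive(history, horizon, season=52):
    """Stand-in model: repeat the most recent full seasonal cycle."""
    last_cycle = history[-season:]
    return [last_cycle[h % season] for h in range(horizon)]

def simple_exp_smoothing(history, horizon, alpha=0.3):
    """Stand-in model: flat forecast at the exponentially smoothed level."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def best_forecast(series, horizon, models, holdout):
    """Score each model on the last `holdout` points, refit the winner on all data."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mae(test, fn(train, holdout)) for name, fn in models.items()}
    best = min(scores, key=scores.get)
    return best, models[best](series, horizon)

def combine(forecasts):
    """Sum per-site forecasts week by week to get a provider or regional total."""
    return [sum(week) for week in zip(*forecasts)]

# Toy series with a cycle length of 4 (real weekly data would use season=52).
models = {
    "seasonal_naive": lambda hist, n: seasonal_naive(hist, n, season=4),
    "exp_smoothing": simple_exp_smoothing,
}
site_a = [10, 20, 30, 40] * 3                              # strongly seasonal
site_b = [12, 14, 11, 13, 15, 12, 10, 14, 13, 11, 12, 14]  # no clear cycle

name_a, forecast_a = best_forecast(site_a, horizon=4, models=models, holdout=4)
name_b, forecast_b = best_forecast(site_b, horizon=4, models=models, holdout=4)
total = combine([forecast_a, forecast_b])
print(name_a, name_b, total)
```

Selecting models per site before summing lets each series keep the model that suits its own pattern, at the cost of treating the sites as independent.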
Data sources
The data were received from the following providers:
* Chelsea and Westminster NHS Foundation Trust
* Imperial College Healthcare NHS Trust
* London North West Healthcare NHS Trust
* The Hillingdon Hospitals NHS Foundation Trust
* Healthshare
* NHS bowel cancer screening programme (BCSP)
Data
The data contain the date, procedure codes, procedure categories, patient numbers and points for each procedure, for each of 5 datasets: referrals, rebookings, emergency, surveillance and removals.
The main aim of the project was to compare the risk of COVID-19-related hospitalisation and/or COVID-19-related death within 28 days of the observed/imputed treatment date between highest-risk patients who were treated with sotrovimab and those who were not.
Inverse probability of treatment weighting (IPTW) based on propensity scores was used to balance baseline patient characteristics and adjust for measured confounders between the treated and untreated cohorts. Propensity scores (the probability of treatment given baseline covariates) were obtained using logistic regression or gradient boosting machine models fitted on the following covariates: age, gender, time period of COVID-19 diagnosis (i.e., Omicron BA.1, BA.2 or BA.5), presence of renal disease (binary), presence of multiple highest-risk conditions (≥2, binary), presence of high-risk conditions (binary), solid organ transplant (binary), COVID vaccination status (binary), time since vaccination, and ethnicity (see the full list of variables and models in the publication). To obtain an appropriate estimate of the variance of the treatment effect and better control the type I error rate, the inverse probability of treatment weights were stabilised. The balance in baseline characteristics between the weighted treated and untreated groups was assessed using standardised differences.
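As an illustration of the weighting and balance check, here is a minimal sketch. It assumes the propensity scores have already been estimated, uses the usual stabilised-weight definition (marginal probability of the received treatment divided by the estimated probability of receiving it), and uses one simple form of the weighted standardised difference. The function names and toy values are hypothetical, not taken from the study.

```python
from math import sqrt

def stabilised_weights(treated, propensity):
    """Stabilised IPTW: P(T=1)/e for treated, (1 - P(T=1))/(1 - e) for untreated."""
    p_treated = sum(treated) / len(treated)
    return [
        p_treated / e if t == 1 else (1 - p_treated) / (1 - e)
        for t, e in zip(treated, propensity)
    ]

def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)

def weighted_var(x, w):
    """Simple weighted variance (no small-sample correction)."""
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)

def standardised_difference(x, treated, w):
    """Weighted mean difference divided by the pooled weighted standard deviation."""
    xt = [xi for xi, t in zip(x, treated) if t == 1]
    wt = [wi for wi, t in zip(w, treated) if t == 1]
    xc = [xi for xi, t in zip(x, treated) if t == 0]
    wc = [wi for wi, t in zip(w, treated) if t == 0]
    pooled_sd = sqrt((weighted_var(xt, wt) + weighted_var(xc, wc)) / 2)
    return (weighted_mean(xt, wt) - weighted_mean(xc, wc)) / pooled_sd

# Toy data (invented): treatment flag, estimated propensity score, age.
treated = [1, 1, 1, 0, 0, 0]
propensity = [0.8, 0.6, 0.5, 0.5, 0.3, 0.2]
age = [70, 65, 60, 62, 55, 50]

weights = stabilised_weights(treated, propensity)
smd_age = standardised_difference(age, treated, weights)
unbalanced = abs(smd_age) > 0.1  # threshold used for the balance check
print(weights, smd_age, unbalanced)
```

A covariate flagged as unbalanced after weighting would then be a candidate for inclusion in the outcome model.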
Cox proportional hazards models with stabilised weights were fitted to estimate the hazard ratio (HR) of COVID-19-related hospitalisation and/or COVID-19-related death. Covariates that remained unbalanced after weighting (standardised differences >0.1) were included in the Cox proportional hazards model. IPTW and the corresponding doubly robust estimation were performed separately for each Cox model.
Hire Evgeniy R. Galimov today