I am a data scientist and analyst with over 5 years of experience delivering high-impact data solutions across healthcare and financial services. I specialize in predictive modeling, ETL pipelines, and interactive BI dashboards that drive operational efficiency and strategic decision-making.
I am a people-focused communicator who translates complex technical findings into actionable insights for executives, and I thrive in cross-functional teams in fast-paced environments. I am excited to explore opportunities in New Zealand’s tech ecosystem and bring expertise in Python, SQL, and cloud platforms to solve real-world problems.
- Data Integration & Cleaning
- Feature Engineering
- Machine Learning Modeling
- Evaluation
Executive Summary
This project addresses a critical challenge in New Zealand’s agricultural sector: predicting dairy production fluctuations based on regional climate variations. Using a decade of historical climate and production data, I developed a machine learning model to forecast monthly yields, enabling farmers and supply chain managers to optimize resource allocation.
Business Problem
New Zealand is a global leader in dairy exports. However, production is highly sensitive to climate patterns, particularly rainfall and soil moisture. Unexpected drops in yield lead to supply chain disruptions and financial losses. The goal was to build a predictive tool that forecasts production with a 3-month lead time.
Tech Stack
Languages: Python (Pandas, NumPy, Scikit-learn)
Visualization: Matplotlib, Seaborn, Power BI
Environment: Jupyter Notebook / Amazon SageMaker
Techniques: Time-Series Analysis, Random Forest Regressor, Feature Engineering
Methodology
Merged climate data (temperature, precipitation, sunshine hours) from NIWA (National Institute of Water and Atmospheric Research) with regional dairy production metrics.
Handled missing values using seasonal interpolation to maintain time-series integrity.
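The write-up does not spell out what "seasonal interpolation" means here. One common reading, sketched below as an assumption, is to fill a missing month with the average of the same calendar month in other years, so the seasonal shape of the series is preserved. The function name `fill_seasonal` and the rainfall values are hypothetical and for illustration only.

```python
import numpy as np
import pandas as pd

def fill_seasonal(series: pd.Series) -> pd.Series:
    """Fill gaps with the mean of the same calendar month across all years."""
    # transform("mean") skips NaN values and returns a result aligned to the index.
    monthly_mean = series.groupby(series.index.month).transform("mean")
    return series.fillna(monthly_mean)

# Hypothetical monthly rainfall (mm) for two years: 10, 20, ..., 120, repeated.
idx = pd.date_range("2020-01-01", periods=24, freq="MS")
rain = pd.Series(np.tile(np.arange(10.0, 130.0, 10.0), 2), index=idx)
rain.iloc[14] = np.nan  # March 2021 is missing

filled = fill_seasonal(rain)
```

With only one other March in this toy series, the gap is filled with the March 2020 value. With a decade of data the fill would be an average over several years.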
Created Lag Features: 1-month and 3-month lags for rainfall to account for the delayed impact of drought on pasture growth.
Engineered a “Pasture Stress Index”—a derived metric combining soil moisture and temperature.
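The two feature engineering steps above can be sketched roughly as follows. The column names (`region`, `month`, `rainfall_mm`, `temp_c`, `soil_moisture`) and the sample values are hypothetical. The exact formula for the Pasture Stress Index is not given in this write-up, so the z-score difference below is an illustrative assumption, not the formula used in the project.

```python
import pandas as pd

def add_rainfall_lags(df: pd.DataFrame, lags=(1, 3)) -> pd.DataFrame:
    """Add lagged rainfall columns, computed separately for each region."""
    out = df.sort_values(["region", "month"]).copy()
    for lag in lags:
        # shift() within each region so one region's rain never leaks into another
        out[f"rainfall_lag_{lag}"] = out.groupby("region")["rainfall_mm"].shift(lag)
    return out

def pasture_stress_index(df: pd.DataFrame) -> pd.Series:
    """Illustrative index (assumed formula): hot and dry conditions score high."""
    temp_z = (df["temp_c"] - df["temp_c"].mean()) / df["temp_c"].std()
    moist_z = (df["soil_moisture"] - df["soil_moisture"].mean()) / df["soil_moisture"].std()
    return temp_z - moist_z

# Hypothetical sample: four months for two regions.
data = pd.DataFrame({
    "region": ["Waikato"] * 4 + ["Southland"] * 4,
    "month": list(pd.date_range("2023-01-01", periods=4, freq="MS")) * 2,
    "rainfall_mm": [100.0, 80.0, 60.0, 40.0, 120.0, 110.0, 90.0, 70.0],
    "temp_c": [19.0, 20.0, 18.0, 15.0, 14.0, 15.0, 13.0, 10.0],
    "soil_moisture": [0.30, 0.25, 0.28, 0.35, 0.40, 0.38, 0.41, 0.45],
})
features = add_rainfall_lags(data)
features["pasture_stress_index"] = pasture_stress_index(features)
```

The first rows of each region have no lagged value and come out as NaN. They need to be dropped or imputed before model training.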
Compared multiple models: Linear Regression, XGBoost, and Random Forest.
Winner: The Random Forest Regressor performed best of the three, which I attribute to its ability to capture non-linear relationships between climate variables.
Hyperparameter Tuning: Used GridSearchCV to optimize tree depth and the number of estimators.
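A minimal sketch of the tuning step, using synthetic stand-in data because the project dataset is not reproduced here. The grid values, the chronological split, and the use of `TimeSeriesSplit` for cross-validation are my assumptions for illustration. The write-up itself only states that GridSearchCV was used to tune tree depth and the number of estimators.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in: 120 "months", 3 features, a non-linear target plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(120, 3))
y = 10 * X[:, 0] ** 2 + 5 * np.sin(3 * X[:, 1]) + rng.normal(0, 0.1, size=120)

# Keep the final 24 months as a chronological hold-out set.
X_train, X_test = X[:96], X[96:]
y_train, y_test = y[:96], y[96:]

# Hypothetical grid over tree depth and number of estimators.
param_grid = {"max_depth": [3, 6, None], "n_estimators": [50, 100]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=TimeSeriesSplit(n_splits=4),  # each fold validates on data after its training window
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

best_model = search.best_estimator_
test_predictions = best_model.predict(X_test)
```

Ordered folds are a reasonable default for monthly data because shuffled K-fold would let the model train on months that come after the ones it is validated on.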
Mean Absolute Error (MAE): Less than 4% on test data, expressed relative to actual production.
R-Squared: 0.89, meaning the climate features explain about 89% of the variance in production output.
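For reference, both metrics can be computed in a few lines. MAE is normally reported in the units of the target, so the percentage figure implies some normalization. Dividing by the mean actual value, as below, is one possible reading and is an assumption on my part. The sample numbers are hypothetical, not project results.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mae_percent_of_mean(actual, predicted):
    """MAE as a percentage of the mean actual value (assumed normalization)."""
    mean_actual = sum(actual) / len(actual)
    return 100 * mean_absolute_error(actual, predicted) / mean_actual

def r_squared(actual, predicted):
    """Share of variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical monthly production values and forecasts.
actual = [100.0, 110.0, 90.0, 120.0]
predicted = [102.0, 108.0, 93.0, 117.0]

mae = mean_absolute_error(actual, predicted)
mae_pct = mae_percent_of_mean(actual, predicted)
r2 = r_squared(actual, predicted)
```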
Key Insights
Rainfall Lag: Rainfall from 2 months prior is the strongest predictor of current-month milk solids production.
Regional Variation: The Waikato region showed higher sensitivity to temperature spikes compared to the Southland region.
Impact
Operational Efficiency: The model provides a 90-day warning for potential yield drops, allowing for proactive feed-stocking.
Strategic Planning: Insights from the Power BI dashboard helped identify high-risk zones for seasonal drought impact.
Note: This project demonstrates my ability to apply data science to industry problems specific to New Zealand.