In Progress

Current Projects

What I'm actively working on right now. These aren't finished yet, but this page shows what I'm building and what I'm focused on learning.

Active work

Most of what I'm building right now is in the sports analytics space. It's a good domain for practicing real data engineering: the data is messy and constantly updated, and game results give a clear way to check whether a model is actually working.

Active
MLB Prediction Model

A machine learning model for predicting MLB game outcomes using pitcher stats, team batting trends, and opponent-adjusted metrics. Built around a SQLite database with a modular Python pipeline.
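
A minimal sketch of what one load step in a pipeline like this can look like, using only the standard library `sqlite3` module. The table name `pitcher_games`, its columns, and both function names are hypothetical and chosen for illustration; they are not the project's actual schema. The step raises an error on empty or invalid data instead of passing it downstream.

```python
import sqlite3


def load_pitcher_stats(conn):
    """Read pitcher rows and stop early on empty or invalid data.

    Assumes a hypothetical table pitcher_games(game_id, pitcher_id, era).
    """
    rows = conn.execute(
        "SELECT game_id, pitcher_id, era FROM pitcher_games ORDER BY game_id"
    ).fetchall()
    if not rows:
        raise ValueError("pitcher_games is empty; refusing to build features")

    stats = []
    for game_id, pitcher_id, era in rows:
        # A missing or negative ERA would quietly distort later features,
        # so the step fails here with a message that names the bad row.
        if era is None or era < 0:
            raise ValueError(
                f"invalid ERA {era!r} for pitcher {pitcher_id} in game {game_id}"
            )
        stats.append({"game_id": game_id, "pitcher_id": pitcher_id, "era": era})
    return stats


def build_demo_db():
    """Create an in-memory database with two made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pitcher_games (game_id INTEGER, pitcher_id INTEGER, era REAL)"
    )
    conn.executemany(
        "INSERT INTO pitcher_games VALUES (?, ?, ?)",
        [(1, 101, 3.25), (2, 102, 4.10)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    print(load_pitcher_stats(build_demo_db()))
```

Keeping each step as a small function that takes a connection makes it easy to test against an in-memory database before pointing it at the real file.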

Python · SQLite · LightGBM · Pandas
Active
NBA Player Props Model

A player props prediction system using hoopR data. Instead of predicting raw stats, the model predicts each player's output as a percentage of their season average. That puts stars and bench players on the same scale, which reduces bias across player tiers.
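
The percentage-of-average target can be sketched in plain Python as below. This is an illustration under stated assumptions, not the model's actual code: the field names (`player`, `date`, `points`), the function name, and the sample numbers are all made up, and the season average is computed from earlier games only, which is my assumption about how to keep the current game out of its own target.

```python
from collections import defaultdict


def add_ratio_target(games, min_games=1):
    """Attach 'season_avg_before' and 'ratio' to each game record.

    'ratio' is the game's points divided by the player's average over
    earlier games only. Both fields are None until the player has at
    least `min_games` prior games (or if the prior average is zero).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    out = []
    # ISO date strings sort chronologically, so a plain sort is enough here.
    for game in sorted(games, key=lambda g: g["date"]):
        player = game["player"]
        n = counts[player]
        avg = totals[player] / n if n >= max(1, min_games) else None
        ratio = game["points"] / avg if avg else None
        out.append({**game, "season_avg_before": avg, "ratio": ratio})
        # Update running totals only after the target is computed.
        totals[player] += game["points"]
        counts[player] += 1
    return out


# Made-up sample data: one high scorer and one low scorer.
SAMPLE_GAMES = [
    {"player": "star", "date": "2024-01-01", "points": 30},
    {"player": "bench", "date": "2024-01-01", "points": 6},
    {"player": "star", "date": "2024-01-03", "points": 20},
    {"player": "bench", "date": "2024-01-03", "points": 4},
    {"player": "star", "date": "2024-01-05", "points": 25},
    {"player": "bench", "date": "2024-01-05", "points": 5},
]

if __name__ == "__main__":
    for row in add_ratio_target(SAMPLE_GAMES):
        print(row["player"], row["date"], row["ratio"])
```

In the sample data the two players differ by a factor of five in raw points, yet their ratio targets match game for game, which is the sense in which the target is comparable across tiers.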

Python · R / hoopR · Scikit-learn · Random Forest
Ongoing
Portfolio & Personal Site

This site. I'm continuing to build it out — adding project pages, improving the design, and eventually connecting it to some of the model outputs as live demos.

PHP · HTML/CSS · JavaScript
What I'm learning
Walk-forward validation and how to avoid data leakage in time-series models.
LightGBM tuning — feature selection, regularization, and tree structure settings that actually matter.
How to build reliable data pipelines that fail gracefully instead of silently producing wrong outputs.
R for sports data extraction using nflfastR and hoopR, then handing off to Python for modeling.
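
For the walk-forward validation item above, here is a small sketch of an expanding-window splitter. It assumes the rows are already sorted by date, and the function name and parameters are my own for illustration. Each fold trains only on rows that come before its test rows, which is the property that keeps future games out of the training data. scikit-learn's `TimeSeriesSplit` provides a similar splitter.

```python
def walk_forward_splits(n_samples, initial_train, test_size, step=None):
    """Yield (train_indices, test_indices) pairs for walk-forward validation.

    Rows are assumed to be in chronological order. The training window
    starts at `initial_train` rows and grows by `step` rows per fold
    (default: `test_size`, so test windows do not overlap).
    """
    if initial_train < 1 or test_size < 1:
        raise ValueError("initial_train and test_size must be at least 1")
    step = test_size if step is None else step
    if step < 1:
        raise ValueError("step must be at least 1")

    start = initial_train
    while start + test_size <= n_samples:
        train_idx = list(range(0, start))
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx
        start += step


if __name__ == "__main__":
    for train_idx, test_idx in walk_forward_splits(10, initial_train=4, test_size=2):
        print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

With 10 rows, an initial window of 4, and a test size of 2, this produces three folds that test on rows 4-5, 6-7, and 8-9 in turn.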