Inspiration

Montréal drivers face the same frustration every week: gas prices change daily, with no way to know whether to fill up now or wait. After seeing prices swing 10+ cents in a single week, I wanted to build something that could actually answer the question: should I fill up today, or will it be cheaper on Thursday?

What it does

GasPrix MTL is a bilingual (EN/FR) ML web app that forecasts Montréal retail gas prices up to 4 weeks ahead. It has three features: a 4-week price range forecast with confidence bands that widen further into the future, a best-day-to-fill-up recommendation showing which day this week is cheapest and how much you'd save, and a budget estimator that projects next month's fuel cost based on last month's spend.
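To make the last two features concrete, here is a minimal sketch of how a fill-up recommendation and a budget projection could be computed from a forecast. The function names, the 50-litre tank, and the proportional scaling rule are assumptions for illustration and not necessarily what the app does; the prices are synthetic.

```python
def best_fill_day(forecast_cents_per_litre, tank_litres):
    """Return the cheapest forecast day and the saving (in dollars)
    versus filling up on the first day of the forecast."""
    days = list(forecast_cents_per_litre)
    cheapest = min(days, key=forecast_cents_per_litre.get)
    first_day = days[0]
    saving_cents = (
        forecast_cents_per_litre[first_day] - forecast_cents_per_litre[cheapest]
    ) * tank_litres
    return cheapest, saving_cents / 100


def project_budget(last_month_spend, last_month_avg_price, forecast_avg_price):
    """Scale last month's spend by the ratio of forecast to past average price.
    Assumes driving habits (litres bought) stay the same."""
    return last_month_spend * forecast_avg_price / last_month_avg_price


# Synthetic forecast in cents per litre (illustration only)
forecast = {"Mon": 158.9, "Tue": 157.4, "Wed": 155.9, "Thu": 154.9, "Fri": 159.9}
day, saving = best_fill_day(forecast, tank_litres=50)
budget = project_budget(last_month_spend=200.0,
                        last_month_avg_price=160.0,
                        forecast_avg_price=164.0)
print(f"Cheapest day: {day}, saving ${saving:.2f}; projected budget ${budget:.2f}")
```

The budget rule only rescales past spending by price, so it says nothing about changes in how much you drive.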

How we built it

We pulled daily Montréal gas prices from Kalibrate and WTI crude oil and CAD/USD exchange-rate data from FRED, then engineered 23 features including price lags, rolling statistics, and momentum indicators. We trained Ridge Regression, XGBoost, and LightGBM on 7.5 years of data and evaluated on a fully out-of-sample 2+ year test set. The app is built with Flask and hosted on Render, with a GitHub Actions workflow that retrains the models daily at 8 AM EST and auto-deploys on commit. The whole pipeline is automated and uses no paid services.
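As an illustration of the three feature families mentioned above, the following pandas sketch builds a few lag, rolling, and momentum features from a daily price series. The specific lags and the 7-day window are assumptions, not the project's actual 23 features, and the data is synthetic.

```python
import pandas as pd


def build_features(prices: pd.Series) -> pd.DataFrame:
    """Build lag, rolling, and momentum features from a daily price series.

    Every feature uses only prices up to the previous day, so the row
    for day t never sees the target price of day t.
    """
    feats = pd.DataFrame(index=prices.index)
    past = prices.shift(1)  # most recent price known before day t

    # Price lags (hypothetical choice of 1, 2, and 7 days)
    for lag in (1, 2, 7):
        feats[f"lag_{lag}"] = prices.shift(lag)

    # Rolling statistics over the 7 days before day t
    feats["roll_mean_7"] = past.rolling(7).mean()
    feats["roll_std_7"] = past.rolling(7).std()

    # Momentum: change in price over the 7 days ending yesterday
    feats["momentum_7"] = past - prices.shift(8)

    feats["target"] = prices
    return feats.dropna()  # drop warm-up rows without full history


# Synthetic series rising 1 cent per day (illustration only)
idx = pd.date_range("2024-01-01", periods=30, freq="D")
prices = pd.Series([150.0 + i for i in range(30)], index=idx)
features = build_features(prices)
print(features.head())
```

Shifting before rolling is the detail that matters here: computing a rolling mean on the unshifted series would include the day being predicted.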

Challenges we ran into

Hyperparameter tuning backfired: Optuna actually worsened LightGBM's MAE from 1.895 to 2.238 ¢/L due to a regime shift between the volatile 2016–2023 training era and the calmer 2024–2026 test period, so we kept the default parameters. We also had to handle Kalibrate's inconsistent Excel layouts, FRED holiday gaps that required forward-filling for up to 7 days, and future price projections that needed clipping to prevent data leakage.
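The FRED gap handling can be sketched with pandas as follows. This is a minimal example assuming the series is first reindexed to a daily calendar and then forward-filled with a 7-day cap; the dates and values are synthetic.

```python
import pandas as pd

# Synthetic FRED-style series with missing days (illustration only)
fred = pd.Series(
    [70.0, 71.0, 72.0],
    index=pd.to_datetime(["2024-01-05", "2024-01-08", "2024-01-20"]),
)

# Align to a full daily calendar, which exposes weekends and holidays as NaN
daily_index = pd.date_range("2024-01-05", "2024-01-20", freq="D")
daily = fred.reindex(daily_index)

# Carry the last known value forward, but for at most 7 consecutive days
filled = daily.ffill(limit=7)
print(filled)
```

With the cap in place, a gap longer than 7 days stays missing after the seventh filled day instead of silently repeating a stale value.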

Accomplishments that we're proud of

Ridge Regression, a simple linear model, outperformed both XGBoost and LightGBM, achieving 1.760 ¢/L MAE on a 2+ year out-of-sample test. Gas price autocorrelation is so strong that the relationship with lag features is essentially linear, so added model complexity contributes noise rather than signal. LightGBM achieved 63.8% directional accuracy, correctly predicting the direction of the price move on more than 6 out of 10 days.
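For reference, here is one way the two metrics quoted above can be computed, alongside a persistence baseline that predicts "tomorrow equals today". The exact definition of directional accuracy used in the project is not stated, so the version below (matching sign of the move relative to the previous day's price) is an assumption, and the numbers are synthetic.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def directional_accuracy(previous, actual, predicted):
    """Share of days where the predicted move (up, down, or flat) relative
    to the previous day's price matches the actual move."""
    def sign(x):
        return (x > 0) - (x < 0)

    hits = sum(
        sign(a - prev) == sign(p - prev)
        for prev, a, p in zip(previous, actual, predicted)
    )
    return hits / len(actual)


# Synthetic prices in cents per litre (illustration only)
previous = [150.0, 152.0, 151.0, 153.0, 155.0]   # price on day t-1
actual = [152.0, 151.0, 153.0, 155.0, 154.0]     # price on day t
predicted = [151.0, 153.0, 152.0, 154.0, 153.0]  # model forecast for day t

model_mae = mean_absolute_error(actual, predicted)
baseline_mae = mean_absolute_error(actual, previous)  # persistence baseline
hit_rate = directional_accuracy(previous, actual, predicted)
print(model_mae, baseline_mae, hit_rate)
```

Note that the persistence baseline always predicts a flat move, so it only makes sense to compare it on MAE, not on directional accuracy.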

What we learned

  • A strong naive baseline is essential — the persistence model is deceptively hard to beat
  • Never shuffle time series data — always use a hard date cutoff
  • Feature engineering matters more than model complexity for this type of problem
  • Hyperparameter tuning can hurt when the test distribution differs from training
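The hard date cutoff from the second point can be sketched like this. The cutoff date and column name are hypothetical, and the data is synthetic.

```python
import pandas as pd

# Synthetic daily data spanning the cutoff (illustration only)
idx = pd.date_range("2023-12-28", periods=8, freq="D")
df = pd.DataFrame({"price": [150.0 + i for i in range(8)]}, index=idx)

# Hypothetical cutoff: everything before it trains, everything after it tests
cutoff = pd.Timestamp("2024-01-01")
train = df.loc[df.index < cutoff]
test = df.loc[df.index >= cutoff]

# No shuffling: every training row is strictly earlier than every test row
assert train.index.max() < test.index.min()
print(len(train), len(test))
```

A random split would scatter future days into the training set, letting lag features leak information about the test period.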

What's next for GasPrix MTL

Next steps include adding WTI futures and news sentiment to anticipate large price moves, experimenting with a rolling retrain window to reduce sensitivity to regime shifts, expanding to other Canadian cities (Vancouver, Toronto, Calgary), and exploring LSTM or Transformer architectures for longer-range forecasting.
