
Virgin River Flow Forecasting and Analysis

The Virgin River creates a stunning backdrop as it winds through Zion National Park, and the hiking trails around it attract thousands of hikers every year. Unfortunately, the Narrows through Zion also become a dangerous area when heavy rain rolls in. There have been several fatalities and many near misses in flash floods in Zion over the last few years. Can machine learning help provide early and accurate alerts to hikers?
Created on December 31|Last edited on May 23

Background

Virgin River Zion

The Virgin River is a tributary of the Colorado River in the U.S. states of Utah, Nevada, and Arizona. The river is about 162 miles long. It was designated Utah's first wild and scenic river in 2009, during the centennial celebration of Zion National Park. - Wikipedia

Stretches of the Virgin River through Zion National Park in Utah are particularly susceptible to flash floods, and some of the more famous hiking trails at Zion traverse those stretches. Additionally, the flow in many canyons and other creeks in the area is likely to be at least somewhat correlated with the Virgin River's flow. Currently, flash flood warnings on the river are provided by the National Weather Service (NWS). However, the models behind these warnings are prone to errors, particularly during large precipitation events, and the NWS isn't exactly transparent about how they work. More accurate river flow forecasts could provide more timely alerts that give hikers additional time to reach safety.

Flash Flood Zion

Current Warning Systems

The NWS currently publishes forecasts; however, these are largely based on hydrological models that have difficulty forecasting accurately far into the future. Additionally, the NWS provides little information on how these models actually work. You can see an overview of their general methodology at this link. We are currently working on a scraper to download NWS forecast data (unfortunately, they don't seem to store their historical forecasts) so that we can directly compare our models against theirs.

River Gages

image.png

There are multiple USGS gages located on the Virgin River; unfortunately, all three of them are located downstream of Zion N.P. In this W&B report we will look at four major gages:

  • St. George Gage: This is the furthest gage from Zion N.P. However, it is located in a fairly populated area with several bridges, so forecasting its flow might also be beneficial.
  • Virgin River Near Virgin, Utah: This gage is on the section of river directly downstream of where the North Fork and East Fork converge.
  • North Fork of the Virgin River: This is the actual tributary of the Virgin River that flows through Zion National Park.
  • Coal Creek: This river gage is not along the Virgin River at all, but it is one of the few river gages north of Zion National Park. Therefore, its flow might be somewhat correlated with the North Fork's flow.
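
One way to check whether one gage's flow is correlated with another's is a lagged correlation. The sketch below is only an illustration on synthetic data: the series names `coal_creek` and `north_fork`, the 3-hour delay, and the helper `lagged_correlation` are all assumptions made for the example, not values or functions from FlowDB or Flow Forecast.

```python
import numpy as np
import pandas as pd


def lagged_correlation(upstream: pd.Series, downstream: pd.Series, max_lag: int) -> pd.Series:
    """Pearson correlation between `downstream` and `upstream` delayed by 0..max_lag steps."""
    corrs = {}
    for lag in range(max_lag + 1):
        # shift(lag) delays the upstream series, so row t holds upstream[t - lag].
        corrs[lag] = downstream.corr(upstream.shift(lag))
    return pd.Series(corrs, name="correlation")


# Synthetic hourly data (hypothetical): the "North Fork" follows "Coal Creek" 3 hours later.
rng = np.random.default_rng(0)
index = pd.date_range("2020-01-01", periods=500, freq=pd.Timedelta(hours=1))
coal_creek = pd.Series(rng.gamma(2.0, 10.0, size=500), index=index)
north_fork = coal_creek.shift(3).bfill() * 1.5 + rng.normal(0.0, 1.0, size=500)

correlations = lagged_correlation(coal_creek, north_fork, max_lag=6)
best_lag = int(correlations.idxmax())
print(correlations.round(3))
print("lag with highest correlation:", best_lag)
```

On real gage data the peak would likely be much less sharp, and a high correlation at some lag would only suggest, not prove, that the second gage is a useful input feature.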

Another thing that we might try later is forecasting all four river gages at once to leverage the power of multitask learning.

Data and Methodology

We use river flow data from the existing FlowDB river flow library to train several river flow forecasting models. Specifically, we use data from 2010-2019 to train the models and data from 2020 to test them. In this case we specifically aim to forecast
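
The year-based split described above can be sketched with pandas. This is a minimal illustration on synthetic daily data; the column names `cfs` and `precip` and the helper `split_train_test` are hypothetical and do not reflect the actual FlowDB schema. Scaling statistics are computed on the training years only, so that no information from 2020 leaks into training.

```python
import numpy as np
import pandas as pd


def split_train_test(df, train_years=("2010", "2019"), test_year="2020"):
    """Split a DataFrame with a DatetimeIndex into train (2010-2019) and test (2020)."""
    df = df.sort_index()
    # Partial-string slicing on a DatetimeIndex is inclusive of the whole end year.
    train = df.loc[train_years[0]:train_years[1]]
    test = df.loc[test_year:test_year]
    return train, test


# Synthetic stand-in for gage data (hypothetical columns).
index = pd.date_range("2009-06-01", "2021-06-01", freq="D")
rng = np.random.default_rng(0)
flow = pd.DataFrame(
    {
        "cfs": rng.gamma(2.0, 50.0, len(index)),
        "precip": rng.exponential(0.1, len(index)),
    },
    index=index,
)

train, test = split_train_test(flow)

# Fit the scaler on the training period only, then apply it to both splits.
mean, std = train.mean(), train.std()
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train.index.min(), train.index.max(), len(test))
```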

Model Training

We will use Flow Forecast (don't be fooled by the name: this is an all-purpose deep learning framework for time series forecasting that can be used to forecast any time series data) to forecast the flow of the Virgin River and Coal Creek with several different models. When all is said and done, we plan on examining the performance of five main models (and their probabilistic variants):

  • TransformerL
  • Convolutional Transformer
  • DA-RNN
  • Informer
  • Temporal Fusion Transformer
  • TFT Normalizing Flows

In time, we will add links to reports that cover each model. In the rest of this report, we will primarily look at the TransformerL. If you need a refresher, this is what the model looks like. We will also investigate model robustness through different interpretability plots and different forecast periods.



Training a transformer to forecast the North Fork of the Virgin River

The gage on the North Fork is the closest stream flow gage to Zion N.P. and the only gage that is on the North Fork of the Virgin River before its confluence with the East Fork. Therefore, this gage is likely the most relevant one for flow conditions inside the park. Below we see that a lower learning rate seems to help reduce MSE and that a larger number of encoder layers works well. Surprisingly, a longer forecast history does not seem to improve the performance of the model.
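
For readers unfamiliar with hyperparameter sweeps, the sketch below shows what a search space over the three hyperparameters discussed above might look like. The dictionary follows the general shape of a W&B sweep configuration (`method`, `metric`, `parameters`), but the parameter names, the metric name, and every value are assumptions chosen for illustration; they are not the settings used for the runs in this report. The helper `expand_grid` is likewise hypothetical and only enumerates the combinations locally.

```python
from itertools import product

# Hypothetical search space; names and values are illustrative only.
sweep_config = {
    "method": "grid",
    "metric": {"name": "test_mse", "goal": "minimize"},
    "parameters": {
        "lr": {"values": [1e-4, 1e-3, 1e-2]},
        "number_encoder_layers": {"values": [1, 2, 4, 6]},
        "forecast_history": {"values": [24, 48, 100, 200]},
    },
}


def expand_grid(config):
    """Enumerate every hyperparameter combination in a grid-style config."""
    names = list(config["parameters"])
    value_lists = [config["parameters"][name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


runs = expand_grid(sweep_config)
print(len(runs), "combinations; first:", runs[0])
```

Each dictionary in `runs` corresponds to one training run, which is what makes it possible to compare learning rate, encoder depth, and forecast history against MSE across the run set.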




Run set: 1912 runs


Investigating Best Performing Models

So let's look into what the models are looking at when they predict the forecasted week in December. Interestingly, most of these models do not seem to weigh precipitation heavily when making predictions. Instead, they seem to rely heavily on temperature and some of our seasonality features. As we will see when we look at how they perform on extreme precipitation events, this could be a problem.

For reference:

honest-sweep-11 is the best sweep in terms of (rolling) MSELoss on the entire test set (stretching from Jan-2020 to Jan-2021), forecasting a window of 100 hours into the future.

autumn-sweep-260 is the best sweep in terms of (rolling) DilateLoss on the entire test set (same parameters as above).

different-sweep-370 is the best sweep in terms of DilateLoss on just the plotted forecasted week in May/June (likely one of the reasons it looks so good).

divine-sweep-184 is the best sweep in terms of MSELoss on the plotted week in May/June.
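
To make the "(rolling)" metric above concrete, here is a minimal sketch of one plausible reading: slide over the test series in consecutive 100-hour windows, forecast each window from the history before it, and average the per-window MSE. This interpretation, the function names, and the persistence baseline standing in for a trained model are all assumptions for illustration, not Flow Forecast's actual evaluation code.

```python
import numpy as np


def rolling_mse(series, forecast_fn, history_len, horizon):
    """Average MSE over consecutive, non-overlapping forecast windows."""
    errors = []
    start = history_len
    while start + horizon <= len(series):
        history = series[start - history_len:start]
        target = series[start:start + horizon]
        prediction = forecast_fn(history, horizon)
        errors.append(np.mean((prediction - target) ** 2))
        start += horizon  # move to the next window
    if not errors:
        raise ValueError("series is too short for one history + horizon window")
    return float(np.mean(errors))


def persistence_forecast(history, horizon):
    """Naive baseline: repeat the last observed value for the whole horizon."""
    return np.full(horizon, history[-1])


# Synthetic hourly flow (hypothetical), roughly one year long.
rng = np.random.default_rng(0)
hours = np.arange(8760)
flow = 100.0 + 20.0 * np.sin(2 * np.pi * hours / 24.0) + rng.normal(0.0, 5.0, hours.size)

score = rolling_mse(flow, persistence_forecast, history_len=48, horizon=100)
print("rolling MSE of persistence baseline:", round(score, 2))
```

A score computed this way covers the whole test year, which is why a run can rank first on it and still look worse than another run on a single plotted week.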




Run set: 4 runs


How well do models forecast flow during intense precipitation?

To answer this question we will look at what the models predict for periods when there was severe rain in the training/test set. To do this we will use Flow Forecast's inference mode. In addition, we simulate very large precipitation events with Flow Forecast's synthetic data generator.

image.png image.png

Based on the above graphics, we can see that a major precipitation event took place during February of 2020 and another large one in March of 2020. We will check whether the models really use precipitation as a feature in these weeks and produce a reasonable forecast.
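
The idea of simulating a very large precipitation event can be illustrated with a small pandas sketch that adds a synthetic storm pulse to a precipitation column. This does not use Flow Forecast's synthetic data generator; the helper `inject_storm`, the column name `precip`, the triangular storm shape, and the peak value are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def inject_storm(df, start, hours, peak, precip_col="precip"):
    """Return a copy of `df` with a triangular storm pulse added to `precip_col`."""
    out = df.copy()
    rising = np.linspace(0.0, peak, hours // 2, endpoint=False)
    falling = np.linspace(peak, 0.0, hours - hours // 2)
    pulse = np.concatenate([rising, falling])
    # Select up to `hours` timestamps starting at `start`.
    window = out.index[out.index >= pd.Timestamp(start)][:hours]
    out.loc[window, precip_col] += pulse[: len(window)]
    return out


# Dry synthetic period (hypothetical): ten days of hourly data with no rain.
index = pd.date_range("2020-02-01", periods=240, freq=pd.Timedelta(hours=1))
weather = pd.DataFrame({"precip": np.zeros(240), "temp": np.full(240, 45.0)}, index=index)

stormy = inject_storm(weather, start="2020-02-03", hours=12, peak=50.0)
print(stormy.loc["2020-02-03 00:00":"2020-02-03 11:00", "precip"].tolist())
```

Feeding the original and the modified inputs to the same trained model and comparing the two forecasts would show directly whether the model reacts to precipitation at all.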

First we will start with the February precipitation event:




Run set: 4 runs



Run set: 4 runs