
Time Series Visualization

Sample workflows and comparisons of model predictions for time series data.
Created on March 7|Last edited on March 7
This use case exploration follows an excellent TensorFlow Colab to prototype the basics of working with time series data and models in Weights & Biases. I use a dataset of weather measurements (temperature, wind speed, humidity, etc.) and compare model predictions across some simple neural networks (linear, convolutional, and recurrent).

Try the interactive colab →

Workflow setup and recommendations

Time series data is easier to visualize, organize, and understand in W&B with a few modifications to the standard workflow:

Save cleanest data as Artifact; load with custom function

Starting with the initial raw data, explore and clean up the features—e.g. change the format, fill in missing values, normalize—as much as possible before versioning the cleanest possible data as an Artifact.
You might log the whole dataframe as one Table:
wandb.log({"raw_data": wandb.Table(dataframe=data_df)})

Alternatively, you can fix a split into train/val/test and version each split separately.
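A minimal sketch of such a fixed split, under stated assumptions: the 70/20/10 proportions, the helper names, the project name "timeseries-demo", and the Artifact name "weather_clean" are all hypothetical and not taken from the original Colab. The split is chronological (no shuffling), so validation and test data always come after the training data in time.

```python
def split_chronologically(rows, train_frac=0.7, val_frac=0.2):
    """Split time-ordered rows into train/val/test without shuffling.

    The fractions are illustrative defaults, not values from the report.
    """
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return {
        "train": rows[:train_end],
        "val": rows[train_end:val_end],
        "test": rows[val_end:],
    }


def log_splits_as_artifact(splits, columns, project="timeseries-demo"):
    """Version each split as a wandb.Table inside one dataset Artifact.

    `project` and the Artifact name are hypothetical placeholders.
    """
    import wandb  # imported lazily so the split helper works without wandb

    with wandb.init(project=project, job_type="data_prep") as run:
        artifact = wandb.Artifact("weather_clean", type="dataset")
        for name, rows in splits.items():
            artifact.add(wandb.Table(columns=columns, data=rows), name)
        run.log_artifact(artifact)


# Possible usage, with `clean_rows` as a list of [temperature, time] rows:
#   splits = split_chronologically(clean_rows)
#   log_splits_as_artifact(splits, columns=["temperature", "time"])
```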

Set up fixed, named windows of validation data

To precisely compare model performance, designate a few specific windows—timestamped sequences of consecutive data points—as validation data. Log these as named wandb.Tables—e.g. I use "val_samples_0", "val_samples_1", "val_samples_2"... as my Table names. By keeping the name constant, I can guarantee that I am comparing the forecasts of different models on the same input data.
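One way to make the windows reproducible is to derive them from fixed start offsets, so every run cuts exactly the same slices. The sketch below assumes the validation data is a time-ordered list of rows. The function name, the offsets, and the window length are hypothetical; only the "val_samples_N" naming comes from the text above.

```python
def make_named_windows(rows, starts, length, prefix="val_samples"):
    """Cut fixed windows of consecutive rows and give each a stable name.

    Returns a dict like {"val_samples_0": rows[s0:s0 + length], ...}.
    """
    windows = {}
    for i, start in enumerate(starts):
        window = rows[start:start + length]
        if len(window) != length:
            raise ValueError(f"window {i} at offset {start} runs past the data")
        windows[f"{prefix}_{i}"] = window
    return windows


# Example offsets (arbitrary): three 24-step windows of the validation data.
#   windows = make_named_windows(val_rows, starts=[0, 100, 200], length=24)
```

Because the names depend only on the position in `starts`, reusing the same offsets in every training script keeps "val_samples_0" pointing at the same input data.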

Log multiple formats: step index, timestamp, and human-readable

I found three formats for managing time stamps useful for different purposes, and I recommend logging all of them to each Table as distinct columns:
  • time step index: literally log the index of the input time step—this will be offset relative to any data window you've defined, so if your model reads in a sliding window of 3 data points, the first time step would be at index 3 and not 0.
  • timestamp: log the numerical timestamp as Python datetime.timestamp()—this can be converted to a human-readable timestamp from the Table UI with .toTimestamp()
  • human-readable timestamp: log the string version of the timestamp which you would like readers to see (the Table-supplied timestamp only has day-level resolution and its format cannot be customized)
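The three columns above can be built in one pass. This is a sketch, assuming the time stamps are available as timezone-aware Python datetime objects. The function name, the column order, and the format string are my own choices for illustration.

```python
from datetime import datetime, timedelta, timezone


def time_columns(datetimes, input_width, fmt="%Y-%m-%d %H:%M"):
    """Build [step index, numeric timestamp, readable string] rows.

    Rows start at `input_width`, since a model that reads a sliding window
    of that many points produces its first prediction at that index.
    """
    rows = []
    for i in range(input_width, len(datetimes)):
        dt = datetimes[i]
        rows.append([i, dt.timestamp(), dt.strftime(fmt)])
    return rows


# Six hourly readings starting at midnight UTC; the model reads 3 steps.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
hourly = [start + timedelta(hours=h) for h in range(6)]
rows = time_columns(hourly, input_width=3)
# rows[0] -> [3, 1577847600.0, "2020-01-01 03:00"]
```

These rows can then be passed to wandb.Table with columns such as ["step", "timestamp", "time"], alongside the measured or predicted values.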

Log ground truth as a wandb.Table in a distinct run

To compare model performance against the ground truth, log the ground truth values for each named window as a separate wandb.Table—e.g. I use a distinct run named "ground_truth" to log all the labels (five Tables of validation data).
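A sketch of this pattern, assuming the labels and each model's predictions are already grouped by window name. The helper names, the column names, and the project name are hypothetical. The key point is that the ground-truth run and every model run log Tables under the same keys and with the same columns.

```python
def tables_for_run(windows, first_step):
    """Turn {"val_samples_0": [v0, v1, ...]} into {name: [[step, value], ...]}.

    `first_step` is the index of the first value in each window.
    """
    return {
        name: [[first_step + i, value] for i, value in enumerate(values)]
        for name, values in windows.items()
    }


def log_window_tables(run_name, tables, columns, project="timeseries-demo"):
    """Log one wandb.Table per named window in a run called `run_name`."""
    import wandb  # imported lazily so the helper above works without wandb

    with wandb.init(project=project, name=run_name) as run:
        run.log({
            name: wandb.Table(columns=columns, data=rows)
            for name, rows in tables.items()
        })


# Possible usage: labels go to a dedicated run, predictions to model runs.
#   truth = tables_for_run(label_windows, first_step=3)
#   log_window_tables("ground_truth", truth, ["step", "temperature"])
#   preds = tables_for_run(lstm_prediction_windows, first_step=3)
#   log_window_tables("LSTM 32", preds, ["step", "temperature"])
```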

Toggle true and predicted line series via PanelPlot

Create a PanelPlot from every named Table / window of validation data. On each PanelPlot, you can visually compare model predictions to each other and to the ground truth by toggling each run on/off independently. This run visibility is controlled through the left sidebar in the Workspace or the run set below a panel grid in a Report.
Below, each line can be shown or hidden by checking the box for each run name.

Run set (8 runs): Ground truth, Baseline, Linear, Dense 64-64, Dense 128-64, LSTM 32, LSTM 64, LSTM 128


Detailed view with loss curves


Run set: Time series models (8 runs)


Flexible evaluation across data windows

This section compares a few hyperparameter settings for a toy CNN. In these training and evaluation runs, the model reads in three time steps at a time.

Run set: Convolutional models (5 runs)


Logging time series data

Please see the detailed Colab here.
To visualize time series data via wandb.Table, there are several options:

Log a Pandas DataFrame directly

This is the easiest path—use it when exploring your dataset and understanding features.
wandb.log({"timeseries_data": wandb.Table(dataframe=data_df)})

Log rows of data and explicit columns

This format gives you more control: you log only the columns you care about, with time stamps in exactly the format you want.
# pair each measurement with its time stamp, one row per time step
data_with_time = [[data[i], timestamps[i]] for i in range(len(data))]
wandb.log({"timeseries_data": wandb.Table(data=data_with_time,
                                          columns=["temperature", "time"])})