
New version of timestamps

Internal-only exploration of time series use cases in W&B
Created on May 7 | Last edited on October 21
This use case exploration follows an excellent TensorFlow Colab to prototype the basics of working with time series data and models in Weights & Biases. I use a dataset of weather measurements (temperature, wind speed, humidity, etc.) and compare model predictions across some simple neural networks (linear, convolutional, and recurrent). I summarize workflow recommendations, product challenges & opportunities, and potential next steps on the ProductML side.



Workflow setup and recommendations

Time series data is easier to visualize, organize, and understand in W&B with a few modifications to the standard workflow:

Save cleanest data as Artifact; load with custom function

Starting with the initial raw data, explore and clean up the features—e.g. change the format, fill in missing values, normalize—as much as possible before versioning the cleanest possible data as an Artifact.
You might log the whole dataframe as one Table:
wandb.log({"raw_data": wandb.Table(dataframe=data_df)})

or choose to fix a split into train/val/test and log each split separately.
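A minimal sketch of fixing the split before versioning, assuming wandb is installed and a run is active. The helper names, artifact name, and split fractions are illustrative, not from the original Colab; note the split is chronological (no shuffling), so the model never trains on the future of its validation data.

```python
from typing import Dict, List, Sequence


def chronological_split(rows: Sequence, train_frac: float = 0.7,
                        val_frac: float = 0.2) -> Dict[str, List]:
    """Split time-ordered rows into contiguous train/val/test chunks."""
    n = len(rows)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return {
        "train": list(rows[:train_end]),
        "val": list(rows[train_end:val_end]),
        "test": list(rows[val_end:]),
    }


def log_splits_as_artifact(splits: Dict[str, List], columns: List[str]):
    """Version each split as a Table inside one dataset Artifact."""
    import wandb  # assumes wandb is installed and a run is active
    artifact = wandb.Artifact("clean_weather_data", type="dataset")
    for name, rows in splits.items():
        artifact.add(wandb.Table(data=rows, columns=columns), name)
    wandb.log_artifact(artifact)
```

A custom loading function can then pull the Artifact by name and reassemble whichever splits a given experiment needs.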

Set up fixed, named windows of validation data

To precisely compare model performance, designate a few specific windows—timestamped sequences of consecutive data points—as validation data. Log these as named wandb.Tables—e.g. I use "val_samples_0", "val_samples_1", "val_samples_2"... as my Table names. By keeping the name constant, I can guarantee that I am comparing the forecasts of different models on the same input data.
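One way to sketch this, keeping the "val_samples_<i>" naming convention above so every model is evaluated on exactly the same inputs (the start indices and window size below are illustrative):

```python
def make_val_windows(rows, window_size, starts):
    """Slice fixed validation windows of consecutive rows at the given
    start indices, keyed by the constant names used across all runs."""
    return {
        f"val_samples_{i}": list(rows[s:s + window_size])
        for i, s in enumerate(starts)
    }


def log_val_windows(windows, columns):
    """Log each named window as a wandb.Table (assumes an active run)."""
    import wandb  # assumes wandb is installed
    for name, window in windows.items():
        wandb.log({name: wandb.Table(data=window, columns=columns)})
```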

Log multiple formats: step index, timestamp, and human-readable

I found three timestamp formats useful for different purposes, and I recommend logging all of them to each Table as distinct columns:
  • time step index: literally log the index of the input time step—this will be offset relative to any data window you've defined, so if your model reads in a sliding window of 3 data points, the first predicted time step would be at index 3 and not 0.
  • timestamp: log the numerical timestamp as Python datetime.timestamp()—this can be converted to a human-readable timestamp from the Table UI with .toTimestamp()
  • human-readable timestamp: log the string version of the timestamp which you would like readers to see (the Table-supplied timestamp only has day-level resolution and its format cannot be customized)
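The three columns above could be built per row with something like the following, assuming Python's standard datetime (the strftime format string is just an example of a reader-facing choice):

```python
from datetime import datetime


def timestamp_columns(step_index: int, dt: datetime, window_size: int):
    """Build the three timestamp columns for one row of a wandb.Table:
    - step index, offset by the input window size
    - numeric timestamp, convertible in the Table UI via .toTimestamp()
    - human-readable string at full (minute-level here) resolution
    """
    return [
        step_index + window_size,
        dt.timestamp(),
        dt.strftime("%Y-%m-%d %H:%M"),
    ]
```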

Log ground truth as a wandb.Table in a distinct run

To compare model performance against the ground truth, log the ground truth values for each named window as a separate wandb.Table—e.g. I use a distinct run named "ground_truth" to log all the labels (five Tables of validation data).
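A sketch of what the dedicated run could look like, assuming wandb is installed; the project name is illustrative, while the run name "ground_truth" matches the convention above so its line can be toggled like any model's:

```python
def label_windows(windows, label_index):
    """Strip each named window down to its label column (e.g. temperature)."""
    return {name: [[row[label_index]] for row in w]
            for name, w in windows.items()}


def log_ground_truth(windows, columns):
    """Log all ground-truth windows from one dedicated run."""
    import wandb  # assumes wandb is installed
    with wandb.init(project="timeseries-demo", name="ground_truth") as run:
        for name, window in windows.items():
            run.log({name: wandb.Table(data=window, columns=columns)})
```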

Toggle true and predicted line series via PanelPlot

Create a PanelPlot from every named Table / window of validation data. On each PanelPlot, you can visually compare model predictions to each other and to the ground truth by toggling each run on/off independently. This run visibility is controlled through the left sidebar in the Workspace or the run set below a panel grid in a Report.
Below, each line can be shown or hidden by checking the box for each run name.

[PanelPlot legend — toggleable runs: Ground truth, Baseline, Linear, Dense 64-64, Dense 128-64, LSTM 32, LSTM 64, LSTM 128]


Visual examples

Detailed view with loss curves


[Panel grid: Time series models (8 runs)]


Flexible evaluation across data windows

Comparing a few hyperparameter settings for a toy CNN. For these training and evaluation runs, the model reads in three time steps at a time.

[Panel grid: Convolutional models (5 runs)]


Potential next steps

Prototype flexible & efficient data versioning

For simplicity, the current example assumes one fixed version of the dataset and one possible path from raw data to the training batches. In practice, we need to support branching decisions and track meaningful hyperparameters and states at several levels. This can be accomplished via more detailed customization of Artifacts, Tables, run grouping/tagging, and specific recent prototypes like Dylan's workflow for versioned tabular data.

Feature engineering from preprocessing & format choices

This example converts the raw numerical wind speed, time of year, and day to sine/cosine waveforms to account for periodicity before creating the training data. It normalizes the distinct data splits before training. We may want to clean up, reformat, parameterize, or otherwise preprocess individual features or full splits differently before training.
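The periodic encoding amounts to mapping a raw value onto a circle so that, e.g., 23:59 and 00:01 end up close together instead of a full day apart. A minimal sketch (function name is mine, not from the Colab):

```python
import math


def periodic_features(seconds: float, period_seconds: float):
    """Encode a periodic quantity (time of day, time of year) as a
    sin/cos pair, so values near the period boundary stay adjacent."""
    angle = 2 * math.pi * (seconds / period_seconds)
    return math.sin(angle), math.cos(angle)
```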

Selection of input features X and predicted features Y

This example relies on 19 input weather features (wind speed, humidity, etc) to predict one output feature (temperature). We could use the same approach—same workflow, essentially identical code—to (1) explore and evaluate the significance/relevance of different input features, e.g. train on only 3 features, add in 3 more features to denote the geolocation, rank the features by their relative contribution and (2) predict additional labels, e.g. humidity in addition to temperature.

Sampling & sliding window configuration

We might be interested in varying:
  • the subsampling frequency—one per day, one per hour, one per minute—possibly with different choices across features
  • the size of the sliding window for model input: does the model look at 1 time step, 3 time steps, 24 time steps, etc in order to predict the next time step?
  • the size of the sliding window for model output: does the model predict one time step or a longer sequence of future steps?
  • time offset between input and prediction: we might read a full 24 hours' worth of data to predict the data for the next day, or a week's worth of data to predict the next hour, etc.
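The last three knobs above can be captured in one small generator. This is my own sketch, not the Colab's windowing code; here `offset` is the gap between the end of the inputs and the start of the labels (0 means the labels immediately follow the inputs):

```python
def sliding_windows(series, input_width, label_width, offset):
    """Yield (inputs, labels) pairs from a 1-D series.

    - input_width: how many steps the model reads
    - label_width: how many steps it predicts
    - offset: gap between the last input step and the first label step
    """
    total = input_width + offset + label_width
    for start in range(len(series) - total + 1):
        inputs = series[start:start + input_width]
        label_start = start + input_width + offset
        labels = series[label_start:label_start + label_width]
        yield inputs, labels
```

Varying the subsampling frequency then just means striding or resampling `series` before windowing.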

Train / val / test splits

Easily choose different subsets of the full raw data for training vs validation vs testing.

Integrating predictions and observations

In production, we will receive new real observations alongside our model's predictions. We may want to explore active/online learning or otherwise dynamically integrate observed values to update our model.

Increase model complexity: compare to traditional, SOTA, or multi-modal (e.g. images) approaches

The specific models are toy examples: tiny networks with simple dense, convolutional, and LSTM architectures. A more in-depth report could explore:
  • comparison to traditional/statistical approaches: LightGBM/XGBoost, ARIMA, etc.
  • SOTA approaches with transformers, RL, latest publications on other NN variants
  • more complex or multimodal: e.g. satellite images, heatmaps, or videos at each time step, instead of or in addition to tabular features

Prototype active/online learning workflows

Once a model is deployed, how do we combine its predictions with actual observations? How do we retrain/improve the model as we accumulate ground truth over time, or perhaps dynamically integrate the real-time data to adjust predictions? How do we test larger hypotheses on or orchestrate larger updates to the full model registry—e.g. if we discover that we can safely lower sampling frequency by 20% and save a bunch of storage/compute costs, or if we build a new weather monitoring station and want to incorporate its data?

More complex & real-world use cases

Collaborations and in-depth case studies with existing projects—like forecasting energy demand for the grid, output from existing solar farms, changes in ground or cloud cover from satellite over time—would make for more compelling reports, help us better understand and address actual customer needs, and provide an exciting opportunity to support the climate/environmental sustainability space.

P.S. Logging time series data

Please see the detailed Colab here.
To visualize time series data via wandb.Table, there are several options:

Log a Pandas DataFrame directly

This is the easiest path—use it when exploring your dataset and understanding features.
wandb.log({"timeseries_data": wandb.Table(dataframe=data_df)})

Log rows of data and explicit columns

This format gives more precise control: select only the columns you care about and log time stamps in the correct format.
data_with_time = [[data[i], timestamps[i]] for i in range(len(data))]
wandb.log({"timeseries_data": wandb.Table(data=data_with_time,
                                          columns=["temperature", "time"])})