
Time Series Forecasting in W&B

How to visualize time series data and analyze relevant models in W&B
Created on October 11 | Last edited on October 18
Given a time series dataset of weather measurements (temperature, wind speed, humidity, etc.), I follow this excellent TensorFlow tutorial to forecast future measurements with some basic neural networks and compare their performance. The networks are fully-connected single-layer ("linear") and multi-layer ("dense"), convolutional ("conv"), and recurrent ("lstm").

Visualize metrics over time on the x-axis

In addition to the standard W&B charts where the x-axis is numerical (scatter plot or bar chart, left) or is time-based relative to your main process (steps/epochs/raw time of training, right), you can now log and visualize arbitrary timestamps on the x-axis. The charts below compare the predictions of different model variants on a few time windows from the validation data.

[Chart: Loss (y-axis) vs. Epochs (x-axis) for runs dense_128_64, lstm_128, lstm_64, lstm_32, dense_64_64, linear, and baseline. Run set: Time series models, 8 runs]


Logging time series
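
A minimal sketch of how timestamped predictions might be logged, assuming one table row per (timestamp, model) pair. The project name, column names, and sample values are illustrative assumptions, not taken from the original code. `build_rows` is plain Python; `log_rows` calls `wandb.init`, `wandb.Table`, and `run.log`, and only runs if you invoke it with `wandb` installed.

```python
from datetime import datetime, timedelta


def build_rows(start, step_minutes, series_by_model):
    """Flatten {model_name: [values]} into [timestamp, model, value] rows."""
    rows = []
    for model, values in series_by_model.items():
        for i, value in enumerate(values):
            ts = start + timedelta(minutes=i * step_minutes)
            rows.append([ts.isoformat(), model, float(value)])
    return rows


def log_rows(rows, project="timeseries-demo"):
    """Log the rows to W&B as a table (hypothetical project name)."""
    import wandb  # imported lazily so build_rows works without wandb

    run = wandb.init(project=project)
    table = wandb.Table(columns=["timestamp", "model", "temperature"], data=rows)
    run.log({"predictions": table})
    run.finish()


# Hypothetical hourly values, for illustration only.
rows = build_rows(
    datetime(2015, 6, 1, 0, 0),
    60,
    {"ground_truth": [14.2, 13.9, 13.5], "linear": [14.0, 14.1, 13.8]},
)
```

Keeping the timestamp as its own column leaves the choice of x-axis to the chart configuration instead of tying it to the training step.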

Precise analysis of most relevant variables

Below are three highly correlated metrics (variants of vapor pressure):
  • VPact = actual vapor pressure, solid line
  • VPmax = maximum vapor pressure, dotted line
  • VPdef = deficit, or the difference between these two, dashed line
We can zoom into different regions to see detailed comparisons. Notice how the predictions of models of intermediate complexity (blues, especially violet) are closer to the black ground truth line than the simple models (red linear, green smallest dense model) and the overly large/complex models (magenta, largest layer sizes).

[Run sets: Baseline (1), Ground truth (1), Linear (1), Dense small (3), Dense large (2)]


Advanced: Customize points & labels

Our time series models can track multiple weather indicators:
  • temperature (T deg C)
  • atmospheric pressure (p mbar)
  • relative humidity (rh %)
  • wind velocity, horizontal component (Wx)
  • wind velocity, vertical component (Wy)
  • water vapor concentration (H2OC mmol)
In the charts below, we can compare how well our models predict all these different (and likely correlated) indicators, with the following legend:
  • ground truth: black line
  • baseline: light orange line, literally repeating the previous value for each timestep
  • linear model: red line, simplest one-layer approach
  • dense models: lines in blue, multiple layers of different sizes, closer to violet hue as model complexity increases
Observations:
  • we predict more stable/less variable metrics more closely
  • the more complex model (violet) is often—though not always—closer to the ground truth than the simpler models (red, light blue)
  • some lines are tricky to distinguish—we can swap out a symbol for the line of interest to make the chart easier to read
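
The baseline in the legend above repeats the previous value at each timestep. A minimal sketch of that rule, with a simple error metric for comparison, might look like the following. The function names and the sample readings are assumptions made for illustration.

```python
def persistence_baseline(values):
    """Predict each timestep as the previous observed value."""
    # No previous value exists for the first timestep, so predictions
    # cover values[1:] and are simply values[:-1].
    return list(values[:-1])


def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical temperature readings (deg C), for illustration only.
temps = [10.0, 11.0, 13.0, 12.0]
preds = persistence_baseline(temps)          # aligned with temps[1:]
mae = mean_absolute_error(temps[1:], preds)  # absolute errors: 1, 2, 1
```

Because the rule has no trainable parameters, its error depends only on how variable the metric is, which is consistent with the observation that more stable metrics are predicted more closely.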

[Run set: Multi-metric models, 14 runs]


Artifacts for repeatable workflows & flexible evaluation

Version subsequences or time slices to simplify data management

  • save & version all time-sliced data as Artifacts
  • set the Artifact alias(es) to the timestamp, year, month, etc.
  • easily configure which slice(s) are used for training and evaluation by referring to the correct alias in the relevant jobs
Organize data slices with aliases for training, validation, and test partitions
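
As a rough sketch of the workflow above, the snippet below groups rows by month and logs each group as a new Artifact version whose alias is the month. The artifact name, project name, CSV layout, and sample rows are assumptions. `slice_by_month` is plain Python; `log_slices` uses `wandb.Artifact`, `Artifact.add_file`, and `run.log_artifact(..., aliases=[...])` and only runs if you call it with `wandb` installed.

```python
from collections import defaultdict
from datetime import datetime


def slice_by_month(rows):
    """Group (iso_timestamp, value) rows under aliases like '2015-06'."""
    slices = defaultdict(list)
    for iso_ts, value in rows:
        ts = datetime.fromisoformat(iso_ts)
        slices[ts.strftime("%Y-%m")].append((iso_ts, value))
    return dict(slices)


def log_slices(slices, project="timeseries-demo", name="weather-slices"):
    """Log one Artifact version per slice, aliased by month."""
    import csv
    import os
    import tempfile

    import wandb  # imported lazily so slice_by_month works without wandb

    run = wandb.init(project=project, job_type="upload-data")
    for alias, slice_rows in sorted(slices.items()):
        path = os.path.join(tempfile.mkdtemp(), "data.csv")
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(slice_rows)
        artifact = wandb.Artifact(name, type="dataset")
        artifact.add_file(path)
        run.log_artifact(artifact, aliases=[alias])
    run.finish()


# Hypothetical readings, for illustration only.
rows = [
    ("2015-05-31T23:00:00", 12.1),
    ("2015-06-01T00:00:00", 11.8),
    ("2015-06-01T01:00:00", 11.5),
]
slices = slice_by_month(rows)
```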

Evaluate dynamically on relevant subsets

Evaluate model performance on different time slices by specifying the relevant Artifact alias (any tag formatted as a string, e.g. the full date, month, year, or precise timestamp).
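
One way to select a slice in an evaluation job is to address the Artifact as `name:alias`. The sketch below assumes a hypothetical artifact called `weather-slices` with month aliases; `download_slice` expects an active W&B run and uses `run.use_artifact` and `Artifact.download`.

```python
def artifact_ref(name, alias):
    """Build the 'name:alias' string that addresses one Artifact version."""
    return f"{name}:{alias}"


def download_slice(run, name, alias):
    """Fetch one slice inside an active W&B run; returns the local directory."""
    artifact = run.use_artifact(artifact_ref(name, alias))
    return artifact.download()


# Hypothetical artifact name and month aliases, for illustration only.
eval_refs = [
    artifact_ref("weather-slices", month)
    for month in ("2015-05", "2015-06", "2015-07")
]
```

Switching the evaluation window then only requires changing the alias string passed to the job.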

Predictions across models for May, June & July (M5 - M7)

The first row of charts shows the ground truth in green, baseline in yellow, and model variants' predictions in blues. These are useful to view but somewhat tricky to distinguish, as all the predictions are fairly close together. For such cases, we can customize the PanelPlot configuration to view derived expressions.
For June / M6, we compute the difference between the models' predictions and the actual temperature values observed during the corresponding time window. Each line on this plot shows the error in degrees Celsius. The blue baseline has the highest error, and the orange dense model is generally closest to the true values (the green y=0 line).
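
The derived expression described here is a per-timestep difference between prediction and observation. The report computes it in the chart configuration; the plain-Python sketch below only illustrates the arithmetic, and the model names and temperatures are made up.

```python
def error_series(actual, predictions_by_model):
    """Return {model: [prediction - actual, ...]} for aligned timesteps."""
    errors = {}
    for model, preds in predictions_by_model.items():
        if len(preds) != len(actual):
            raise ValueError(f"{model}: length mismatch with ground truth")
        errors[model] = [p - a for p, a in zip(preds, actual)]
    return errors


# Hypothetical temperatures (deg C), for illustration only.
actual = [15.0, 16.0, 18.0]
errors = error_series(
    actual,
    {"baseline": [14.0, 15.0, 16.0], "dense": [15.5, 16.0, 17.5]},
)
```

With this sign convention, a perfect model sits on the y=0 line, negative values mean the model predicted too low, and positive values mean it predicted too high.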



Dynamically compare, group & query models and predictions

Evaluate performance by dimensions of interest: model type, year, data split, & more

Group by alias: General improvement with more data

If we group training runs by the last year included in the training data, we confirm the intuitive hypothesis that increasing training data improves model performance (decreases loss and validation error).
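
The grouping itself is done in the W&B UI. As an illustration of what it computes, the sketch below averages a metric over runs that share the same last training year. The field names (`last_train_year`, `val_loss`) and the values are hypothetical.

```python
from collections import defaultdict


def mean_by_group(runs, group_key, metric):
    """Average `metric` over runs that share the same `group_key` value."""
    buckets = defaultdict(list)
    for run in runs:
        buckets[run[group_key]].append(run[metric])
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}


# Hypothetical run summaries, for illustration only.
runs = [
    {"model": "linear", "last_train_year": 2012, "val_loss": 0.10},
    {"model": "dense", "last_train_year": 2012, "val_loss": 0.08},
    {"model": "linear", "last_train_year": 2014, "val_loss": 0.07},
    {"model": "dense", "last_train_year": 2014, "val_loss": 0.05},
]
loss_by_year = mean_by_group(runs, "last_train_year", "val_loss")
```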

[Run set: Models with increasing training data, 48 runs]


Group by model type: Specific effect of increasing data

With our flexible PanelPlot configuration, we can place the training dataset on the x-axis and group the model predictions by model type to see how increasing training data affects our model variants differently:
  • linear model (aqua): biggest improvement from more data
  • large and small dense models (orange and red): modest improvement, no significant difference between the two model types
  • baseline model: no change (since it is rule-based)
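
To make the comparison above concrete, the sketch below builds one loss curve per model type (training year on the x-axis) and measures how much each curve drops from the first to the last year. The field names and loss values are invented for illustration and do not reproduce the report's results.

```python
def curves_by_model(runs):
    """Return {model: [(year, val_loss), ...]} sorted by year."""
    curves = {}
    for run in runs:
        point = (run["last_train_year"], run["val_loss"])
        curves.setdefault(run["model"], []).append(point)
    return {model: sorted(points) for model, points in curves.items()}


def improvement(points):
    """Loss reduction from the earliest to the latest training year."""
    return points[0][1] - points[-1][1]


# Hypothetical run summaries, for illustration only.
runs = [
    {"model": "linear", "last_train_year": 2012, "val_loss": 0.10},
    {"model": "linear", "last_train_year": 2014, "val_loss": 0.06},
    {"model": "dense", "last_train_year": 2012, "val_loss": 0.08},
    {"model": "dense", "last_train_year": 2014, "val_loss": 0.07},
    {"model": "baseline", "last_train_year": 2012, "val_loss": 0.12},
    {"model": "baseline", "last_train_year": 2014, "val_loss": 0.12},
]
gains = {m: improvement(p) for m, p in curves_by_model(runs).items()}
```

A rule-based baseline has nothing to learn from extra data, so its gain is zero by construction.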

[Run set: Time series models by type, 48 runs]


Bonus: Advanced grouping via Weave

Write powerful queries to flexibly slice and group all the data, experiments, and predictions logged to W&B. Below, we average temperature predictions across model type. We show the average as a solid line and the full range from min to max as a shaded region around it. The "large" models are shown in orange and the "small" models in aqua. Performance is very similar on this time window—perhaps the large models are slightly closer to the actual observed values (green line).
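
The aggregation behind such a chart can be sketched in plain Python. This is not a Weave expression, only an illustration of the mean and min/max band per model type. The run names, the prefix-based grouping, and the values are assumptions.

```python
def band_by_group(series_by_run, group_of):
    """Per group, return a list of (mean, min, max) for each timestep."""
    grouped = {}
    for run_name, values in series_by_run.items():
        grouped.setdefault(group_of(run_name), []).append(values)
    bands = {}
    for group, series in grouped.items():
        # zip(*series) walks the runs in lockstep, one timestep at a time.
        bands[group] = [
            (sum(step) / len(step), min(step), max(step))
            for step in zip(*series)
        ]
    return bands


# Hypothetical predictions (deg C); the group is the run-name prefix.
preds = {
    "large_1": [15.0, 16.0],
    "large_2": [17.0, 18.0],
    "small_1": [14.0, 15.0],
}
bands = band_by_group(preds, lambda name: name.split("_")[0])
```

The first element of each tuple corresponds to the solid line, and the last two bound the shaded region.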




Bonus: Compute the diff between line series

Abhishek Khoyani
@stacey I have 336 samples of temporal data, each with 3000 features and no absolute timestamps, so the training data has shape (336, 3000). It is a binary classification problem, so the labels have shape (336, 1). How can I plot a line chart for this? In Python, we can do plt.plot(range(3000), Xtrain), but how can I set the x-axis here to just [0-3000]? I would also appreciate a pointer to documentation on how to write Weave expressions, as I could not find any. Thanks.