
Python Time Series Forecasting: A Practical Approach

In this article, we'll dive into the world of time series data and learn to perform time series forecasts using various tools and techniques available in Python.
Time series forecasting is the process of using past data to make predictions about future outcomes. It has applications in industries like healthcare, finance, economics, retail, weather forecasting, and many other domains.
Time series data is often used to predict a wide range of phenomena, such as retail demand, stock prices, and weather patterns. It is quite different from regular tabular data because of its unique characteristics, such as temporal ordering, trends, and recurring patterns.
In this article, we'll start by understanding what differentiates time series from static data and then go through a step-by-step implementation of a Bitcoin price forecast.
Here's what we'll cover:

Table of Contents

  • What Is a Time Series?
  • How To Split Train-Test Sets?
  • Time Series Components
  • Time Series Decomposition
  • Stationarity
  • Single Step (One-Step) and Multi Step
  • Univariate and Multivariate
  • BTC Price Forecast With Python
  • Summary

Let's dig in!

What Is a Time Series?

A time series is a dataset in which we observe one or more events or variables over a sequence of time intervals. These intervals are usually equally spaced (hourly, daily, weekly, monthly, quarterly, etc.). We use time series to analyze trends and patterns over time.
This includes changes in stock prices, temperature fluctuations, or website traffic. Time series data is most often used in forecasting and prediction, as it can help identify trends and patterns that may be used to predict future outcomes.
Note: In cases where the time series is unevenly spaced, we can transform the data using interpolation methods and then apply existing models for equally spaced data. We can use the Traces library to transform unevenly spaced time series data into evenly spaced representations.
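For illustration, here is a minimal sketch of re-gridding an unevenly spaced series with pandas (shown with pandas rather than Traces; the timestamps and values are made up):
# Minimal sketch: resampling an unevenly spaced series onto a daily grid with pandas
# (toy timestamps/values; the Traces library offers similar, more specialized tools)
import pandas as pd

timestamps = pd.to_datetime(['2023-01-01 03:15', '2023-01-02 18:40', '2023-01-05 09:05'])
values = [10.2, 11.0, 12.3]

irregular = pd.Series(values, index=timestamps)

# Resample to daily frequency, then fill gaps with time-weighted interpolation
regular = irregular.resample('D').mean().interpolate(method='time')
print(regular)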
Since each data point in a time series corresponds to a chronologically ordered time variable, the data points are almost always correlated with each other. We can measure correlation within the same series (autocorrelation: correlation with past observations) using the autocorrelation function or partial autocorrelation function (ACF, PACF), and the cross-correlation function to check the correlation between two different series. These functions assume the time series is stationary, so it's important to check for stationarity before our autocorrelation analysis.
Autocorrelation analysis is a crucial step in building classical time series models that require AR and MA parameters, such as ARMA. Check out this Kaggle notebook by Leonie to learn how to use autocorrelation analysis to find the order of AR and MA models.
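As a quick, minimal sketch (assuming a pandas Series named series holding your observations), the ACF and PACF can be plotted with statsmodels like this:
# Minimal sketch: ACF/PACF plots with statsmodels (assumes a pandas Series named `series`)
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

fig, axes = plt.subplots(2, 1, figsize=(10, 6))
plot_acf(series, lags=40, ax=axes[0])   # correlation with lagged copies of the series
plot_pacf(series, lags=40, ax=axes[1])  # correlation at each lag after removing shorter lags
plt.tight_layout()
plt.show()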
In most machine learning use cases, it's common to randomly split the dataset into train and test splits, as this improves model performance and makes sure the model isn't biased. But in time series, because the data points are related through the time component and are not independent of each other (temporal context), we can't just split them randomly.
In fact, random splits can cause "look-ahead bias," a type of bias that occurs when a study or simulation relies on data that was not yet known or available during the time period being studied.

How To Split Train-Test Sets?

There are several ways to split time series data, depending on the specific use case and the available resources, such as the quantity of data and compute. Here are three common methods to split time series data:
  • Single train-test split
  • Multiple train-test split
  • Walk-forward validation
We'll discuss each in more detail below, starting with single train-test split.

Single Train-Test Split

This is a simple approach that uses the first part of the dataset for training and the latter part for testing.
For example, if we have observations from temperature sensors for 10 hours, we can use the first 8 hours for training and the last 2 hours for validation. It's important that this split respects the temporal context of the data.
# Split train-test set manually (df_close holds the closing prices used in the Colab notebook)
from matplotlib import pyplot as plt

train_data, test_data = df_close[:int(len(df_close)*0.9)], df_close[int(len(df_close)*0.9):]

# Plot the train-test split using matplotlib.pyplot
plt.figure(figsize=(10, 6))
plt.grid(True)
plt.xlabel('Dates')
plt.ylabel('Closing Prices')
plt.plot(train_data, 'green', label='Train data')
plt.plot(test_data, 'blue', label='Test data')
plt.legend()
plt.show()
Image from author | Colab

Multiple Train-Test Split

Creating multiple train-test splits increases the number of models trained and evaluated, which generally gives a more reliable picture of model performance.
We can manually create multiple train-test splits by repeating the process from the previous method with different split points. Alternatively, scikit-learn provides the TimeSeriesSplit object for time series cross-validation (an alternative to K-fold cross-validation for time series), which takes care of this process.
# Extract closing values and create TimeSeriesSplit splits
from sklearn.model_selection import TimeSeriesSplit

X = df_close.values
splits = TimeSeriesSplit(n_splits=3)

plt.figure(1)
index = 1
# Loop over splits to update train-test splits and models
for train_index, test_index in splits.split(X):
    # Create and update train-test splits
    train = X[train_index]
    test = X[test_index]
    print('Observations: %d' % (len(train) + len(test)))
    print('Training Observations: %d' % (len(train)))
    print('Testing Observations: %d' % (len(test)))
    # Enter model and evaluation technique here

    # Subplot for each train-test split
    plt.subplot(310 + index)
    plt.plot(train, 'green')
    plt.plot([None for i in train] + [x for x in test], 'blue')
    index += 1
plt.show()

Image from author | Colab

Walk-Forward Validation

One limitation of the train-test split approach is that the models are trained on a fixed set of data and then evaluated on a fixed set of test data. This means the models do not change as they are evaluated on each iteration of the test set.
A potential issue with a fixed train-test split is that it may not accurately reflect the real-world situation, where models are updated with new data as it becomes available. This is where walk-forward validation comes in: we continually retrain the forecast model on new data as it becomes available. There are two main variants of this method:

Sliding Window

In the sliding window method, data is divided into a series of overlapping training and validation sets. For example, the first training set might include the first 10 data points, while the first validation set might include the next 5 data points.
The process is then repeated, using the most recent training set as the starting point for the next iteration. This method allows the model to continually learn from new data as it becomes available, while still providing a way to evaluate its performance on out-of-sample data.
# Walk Forward Validation - Sliding Window
close_values = df_close.values
window = 200      # number of observations used for training
horizon = 20      # number of observations used for validation
step_size = 20    # how far the window slides on each iteration
start_index = 0
anchor_index = start_index + window - 1
end_index = anchor_index + horizon

index = 1
plt.figure(1)
while end_index < len(close_values):
    # Create and update train-test splits
    train = close_values[start_index:anchor_index]
    test = close_values[(anchor_index+1):end_index]

    # Use your model and validation technique here

    print('train=%d, test=%d' % (len(train), len(test)))

    # Subplots for the first few train-test splits
    if index <= 5:
        plt.subplot(5, 1, index)
        plt.grid(True)
        plt.xlabel('Dates')
        plt.ylabel('Closing Prices')
        plt.plot(close_values[:start_index], 'red')
        plt.plot([None for _ in close_values[:start_index]] + [x for x in train], 'green', label='Train data')
        plt.plot([None for _ in close_values[:start_index]] + [None for _ in train] + [x for x in test], 'blue', label='Test data')
    index += 1

    # Update train-test indices
    start_index = start_index + step_size
    anchor_index = start_index + window - 1
    end_index = anchor_index + horizon

plt.legend()
plt.show()
Image from author | Colab

Expanding Window

In the expanding window method, the training set is expanded on every iteration to include more data as it becomes available. For example, the first training set might include the first 10 data points, while the first validation set might include the next 5 data points. The second training set would then include the first 15 data points, and so on.
This method takes advantage of new data by training the model on it, but the training and evaluation time increases as the training set grows.
# Walk Forward Validation - Expanding Window
X = df_close.values
n_train = 300
n_records = len(X)
index = 1

plt.figure(1)
for i in range(n_train, n_records):

    # Create and update train-test splits: the training set grows on each iteration
    train, test = X[0:i], X[i:i+5]
    print('train=%d, test=%d' % (len(train), len(test)))

    # Use your model and validation technique here

    # Subplots for the first few train-test splits
    if index <= 5:
        plt.subplot(5, 1, index)
        plt.grid(True)
        plt.xlabel('Dates')
        plt.ylabel('Closing Prices')
        plt.plot(train, 'green', label='Train data')
        plt.plot([None for _ in train] + [x for x in test], 'blue', label='Test data')
    index += 1

plt.legend()
plt.show()

Image from author | Colab

Time Series Components

This step is part of exploratory time series analysis, as we will be finding the underlying patterns and their behaviors in the time series by splitting the series into various components. This, in turn, will help us decide how to approach the problem and develop a more accurate forecast.
By using time series decomposition, we split the data into four main components:
Trend: Trend represents a long-term increase or decrease in the time series. In the example below, we can see that the number of passengers travelling has increased over the years, which shows a positive trend.
Seasonality: Seasonality shows a uniform repeating pattern at a constant frequency in the time series. In the example below, we have seasonality with a frequency of 12 months, indicating that there's an increase in the number of passengers traveling during vacation months.
Cycles: In simple terms, rises and falls that repeat without a fixed frequency are cyclic patterns.
Remainder: The remainder (residual or error) is what we get when we remove the trend, seasonal, and cyclic components from a time series. It's the irregular variation in the time series, which can be helpful for finding anomalies (outliers, discordant observations, exceptions, aberrations) within the data.
Air Passenger Data | Source


Time Series Decomposition

Let's see how to decompose time series data into its components using the statsmodels library in Python. We will be using the air passengers dataset from Kaggle for this example, and you can follow along with these steps using this Colab notebook.




Additive and Multiplicative Decomposition Method

Simply put, this method assumes the components of the time series are additive/multiplicative, meaning they can be added/multiplied together to produce the original time series. The additive decomposition method is typically used when the trend and seasonal components stay roughly constant over time, and multiplicative decomposition is used when they increase or decrease over time.
Additive Decomposition: Y = T + S + R
Multiplicative Decomposition: Y = T × S × R
# out-of-the-box decomposition using statsmodel seasonal_decompose function
from statsmodels.tsa.seasonal import seasonal_decompose

results = seasonal_decompose(df.Passengers, model='multiplicative') # Try using model='additive'
results.plot();
Image by author | Colab
Although this method works fine for some datasets, it has its own limitations and may underperform in certain cases:
  1. It assumes fixed seasonal periods.
  2. It assumes the seasonal component is constant.
  3. It assumes the trend is linear.
  4. If the dataset has multiple seasonal patterns, the additive/multiplicative method may not produce accurate results.

Seasonal-Trend Decomposition Method

To overcome the shortcomings of the classical additive/multiplicative decomposition method, "STL: A Seasonal-Trend Decomposition Procedure Based on Loess" was introduced. This method can handle any type of seasonality, and we can also control the rate of change of the seasonal component to better match the seasonal characteristics of our data.
from statsmodels.tsa.seasonal import STL

df = df.asfreq('MS')  # Set monthly-start frequency
# Set robust to True to reduce the influence of outliers
res = STL(df['Passengers'], robust=True).fit()
res.plot()
plt.show()
Image by author | Colab

Stationarity

A stationary time series is a series whose statistical properties do not change over time: it has a constant mean and variance, and no trend or seasonality. Why should we even bother whether a time series is stationary or not?
It is often important to determine whether a time series is stationary or not because the statistical properties of a time series can change over time. A stationary time series is one in which the statistical properties, such as the mean and variance, are constant over time. This means that the patterns and trends in the time series are consistent over time, and are not influenced by external factors.
On the other hand, a non-stationary time series is one in which the statistical properties are not constant over time, and may be influenced by external factors such as seasonality or trends. Non-stationary time series can be more difficult to analyze and forecast, as the patterns and trends may change over time.
There are several reasons why it is important to determine whether a time series is stationary or not:
Statistical analysis: Many statistical techniques and models, such as linear regression and time series forecasting, assume that the data is stationary. If the data is non-stationary, these techniques may not be reliable or may produce misleading results.
Forecasting: Forecasting techniques often rely on the assumption that the patterns and trends in the data are consistent over time. If the data is non-stationary, it may be difficult to make accurate predictions about future outcomes.
Data preprocessing: Many time series analysis techniques, such as differencing and detrending, are used to transform non-stationary time series data into stationary data. Knowing whether a time series is stationary or not can inform the choice of data preprocessing techniques to use.
Overall, determining whether a time series is stationary or not is important because it can affect the reliability and accuracy of statistical analysis and forecasting techniques, and can inform the choice of data preprocessing techniques to use.

Check for Stationarity in Time Series

There are several techniques to check for stationarity in time series data:
  1. Visual check: This is a simple method where we plot the time series data and look for trend, seasonality, and varying mean and variance. If any of these are observed, the data is likely non-stationary.
  2. Statistical tests: This approach is more rigorous, as we use statistical tests to check for the presence of a unit root in the data. A unit root is a characteristic of non-stationary time series, and its presence means that the data has a trend or long-term dependency.
Common statistical tests: Augmented Dickey-Fuller test (ADF), Kwiatkowski-Phillips-Schmidt-Shin test (KPSS), Phillips-Perron test (PP), Zivot-Andrews test.

Augmented Dickey-Fuller Test for Stationarity

This is a type of unit root test which determines how strongly a time series is defined by a trend. The null hypothesis of the ADF test is that there is a unit root, and a low p-value (usually below 0.05) indicates that the null hypothesis can be rejected and the data can be considered stationary. Check out this section of the Colab for implementing this test in Python.
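For reference, a minimal sketch of the ADF test with statsmodels (assuming a pandas Series named series) looks roughly like this:
# Minimal sketch: Augmented Dickey-Fuller test (assumes a pandas Series named `series`)
from statsmodels.tsa.stattools import adfuller

result = adfuller(series.dropna())
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])

# Null hypothesis: the series has a unit root (is non-stationary).
# A p-value below 0.05 suggests rejecting the null and treating the series as stationary.
if result[1] < 0.05:
    print('Likely stationary')
else:
    print('Likely non-stationary')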

What if the Data Is Non-Stationary?

Non-stationary time series data can still be used to build forecasting models, but it requires additional preprocessing steps to remove the trend or seasonality present in the data. Here are some common methods to convert a non-stationary time series into a stationary one (a short sketch of the first three follows the list):
  1. Differencing: This method involves subtracting the value of the time series at a previous point in time from the current value, in order to remove trends in the data.
  2. Log transformation: This method involves taking the logarithm of the values of the time series, which dampens the exponential growth present in the data.
  3. Moving average/rolling mean: This method involves calculating the average of the time series over a certain window of time and subtracting it from the original series, to remove trends in the data.
  4. Decomposition: This method involves breaking down the time series into its trend, seasonal, and residual components. The trend and seasonal components are removed, and the residuals are used as the stationary time series.
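As a rough sketch of the first three methods (assuming a pandas Series named series with a DatetimeIndex):
# Minimal sketch: common stationarity transforms (assumes a pandas Series named `series`)
import numpy as np

# 1. Differencing: subtract the previous observation to remove trend
diff = series.diff().dropna()

# 2. Log transformation: dampen exponential growth (assumes strictly positive values)
log_series = np.log(series)

# 3. Rolling mean subtraction: remove the local trend over a 12-period window
detrended = (series - series.rolling(window=12).mean()).dropna()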

Single Step (One-Step) and Multi Step

When it comes to forecasting, we can either forecast just the next occurrence or a series of occurrences. Single-step models are used to predict the next step, i.e., only one time step is predicted. Likewise, when we use a model to predict multiple steps at once, it's a multi-step model.
For example, given temperature readings for 8 hours, a single-step forecasting model will only predict the temperature of the 9th hour, i.e., one time step into the future, whereas a multi-step forecasting model can predict the temperature for multiple time steps into the future. Some models are good at predicting a single value, whereas others are good at predicting a series of occurrences.
We can use single-step models to predict multiple occurrences with methods like the recursive multi-step forecast, but this can reduce the accuracy of the prediction, since we will be predicting the next step using an already-predicted value, which causes errors to accumulate at each step. Therefore, it is better to use multi-step models for predicting multiple steps at once, and it is important to consider our use case and decide how many steps we need to predict before starting the model-building process.
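To make the recursive idea concrete, here's a minimal sketch (model is a hypothetical one-step regressor fitted on the last window observations as its features):
# Minimal sketch: recursive multi-step forecasting with a one-step model
# (`model` is a hypothetical fitted regressor whose features are the last `window` values)
import numpy as np

def recursive_forecast(model, history, window, n_steps):
    history = list(history)
    preds = []
    for _ in range(n_steps):
        x = np.array(history[-window:]).reshape(1, -1)  # last `window` observations as features
        next_value = float(model.predict(x)[0])         # one-step-ahead prediction
        preds.append(next_value)
        history.append(next_value)                      # feed the prediction back in
    return preds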

Univariate and Multivariate

A univariate time series consists of a single variable observed consecutively over (often equal) time intervals, while a multivariate time series consists of a set of variables observed consecutively over (often equal) time intervals.
In simple terms, when there's only one time-dependent variable in our time series data, it's a univariate time series, and if there's more than one time-dependent variable, it's a multivariate time series. Think of multivariate time series models as univariate models extended with external variables that have the potential to influence the accuracy of our predictions. You can find examples of univariate and multivariate time series data below.

Univariate Time Series Data

Daily Minimum Temperatures in Melbourne Dataset: This dataset consists of daily minimum temperature measurements from weather stations in Melbourne, Australia. It can be used to study temperature trends and patterns over time.
Monthly Sunspots Dataset: This dataset consists of monthly sunspot counts from 1749 to 1983. It can be used to study solar activity and its impact on Earth.

Multivariate Time Series Data

Air Quality Dataset: This dataset consists of hourly air quality measurements at multiple monitoring stations in the United States. It can be used to study the relationships between different air pollutants and their impacts on public health. The dataset is available from the UCI Machine Learning Repository.
Traffic Flow Dataset: This dataset consists of hourly traffic flow measurements for multiple roads and intersections in the United States. It can be used to study traffic patterns and predict future traffic flow. The dataset is available from the UCI Machine Learning Repository.
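To make the distinction concrete, here's a toy sketch of how the two shapes look as pandas objects (all values are made up):
# Minimal sketch: univariate vs. multivariate series as pandas objects (toy values)
import pandas as pd

dates = pd.date_range('2023-01-01', periods=4, freq='D')

# Univariate: a single time-dependent variable
univariate = pd.Series([21.3, 20.8, 22.1, 21.7], index=dates, name='min_temp')

# Multivariate: several time-dependent variables observed over the same timestamps
multivariate = pd.DataFrame({
    'pm25': [12.0, 15.4, 11.2, 13.8],
    'no2':  [23.1, 25.0, 22.4, 24.2],
    'temp': [21.3, 20.8, 22.1, 21.7],
}, index=dates)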
Now that we have covered the key topics, let's start our Bitcoin price forecasting example!
Note: I've used a separate notebook for the example below; please use this Colab to follow along!

BTC Price Forecast With Python

Please note that the following section is for educational purposes only and should not be considered financial or investment advice. It is important to conduct your own research and seek advice from a financial professional before making any investment decisions.

Step 1: Problem Statement and Hypothesis

The goal is to develop a model that can predict the future price of BTC using historical data. To be more specific, we're going to predict the closing price of BTC using historical BTC price data collected with the yfinance API. We will be creating a basic ML pipeline that includes data engineering (data collection, storage, and transformation), feature engineering, model development, and evaluation.
The historical price of Bitcoin is a complex phenomenon, as it's affected by a wide range of factors such as human emotions, news, and regulations. Also, the highly speculative nature of the market makes it harder to find trends and seasonality. Since the data is complex and we lack domain expertise, we will start with a simple model and iterate and improve it as we gather more insights and knowledge.

Step 2: Importing Libraries and W&B Set-Up

As always, we will start our script by importing all the necessary packages. This includes yfinance for data collection, scikit-learn and xgboost for model building, matplotlib and plotly for visualization, and finally W&B for experiment tracking.

# install yfinance and plotly using pip
!pip install yfinance
!pip install plotly

# install wandb, xgboost, sklearn
!pip install wandb
!pip install xgboost
!pip install scikit-learn

# Core packages
import numpy as np
import pandas as pd
from pathlib import Path
import os

#Data Eng
import yfinance as yf

#Data viz
import plotly.graph_objs as go
from matplotlib import pyplot as plt

#Model
from sklearn import model_selection
import xgboost as xgb
from xgboost import plot_importance
from xgboost.callback import EarlyStopping
from sklearn.metrics import mean_squared_error,mean_absolute_error,r2_score

#Wandb
import wandb
from wandb.xgboost import WandbCallback

# os.environ['WANDB_NOTEBOOK_NAME'] = 'Notebook_Path'

In this example, we will also learn how to integrate W&B with our time series forecasting pipeline. W&B can help us track our experiments, version our datasets and models after each iteration, and save a lot of time! I will be showcasing how to log our data, models, and plots; perform hyperparameter optimization; reproduce models; and finally analyze our experiments using the W&B dashboard.
If you're completely new to Weights & Biases, you'll have to create a W&B account first. I recommend going through this Quickstart guide; all it takes is 5 minutes and you're golden!
After setting up our account, we will have to log in to the W&B library on our machine. This step requires your API key, which you can find here.
PS: If you're using Colab, you can skip this step, as W&B automatically authenticates your runtime (when we call wandb.init) if you're currently logged in to W&B in your browser. You get a few more perks for using Jupyter with Weights & Biases, which you can see here.
# Weights and Biases Log-in
wandb.login()

ENTITY = None # ENTER Your team name here if you have one.

Step 3: Data Engineering

Now, let's create our data frame using the yf.download function. We pass the 'BTC-USD' ticker to the function and set the start and end dates for our data.
# Download historical Crypto/Share price using yfinance
df = yf.download(tickers='BTC-USD', start="2022-10-01", end="2022-12-20")
Now let's use Plotly to make an interactive candlestick (OHLC) display. Here we pass our data frame columns Open, High, Low, and Close to the add_trace function.
So, here's the thing with W&B: every time you need to initialize a run, you call wandb.init(), and you can start tracking metrics using wandb.log(). When you're done, call run.finish() to end the run.
Apart from W&B custom charts, you can also log matplotlib and plotly plots. As you can see, I've used wandb.log({"chart":fig}) to log the Plotly plot. We can also share this logged data (metrics and charts) using Reports by following these steps:
  1. Go to your run's dashboard
  2. Click more actions
  3. Then add to your reports

# Initialize wandb run
run = wandb.init(project="Time_Series")

# Declare figure
fig = go.Figure()

# Candlestick
fig.add_trace(go.Candlestick(x=df.index,
                             open=df['Open'],
                             high=df['High'],
                             low=df['Low'],
                             close=df['Close'], name='market data'))

# Add titles
fig.update_layout(
    title='Bitcoin Live Price',
    yaxis_title='Bitcoin Price (USD)')

# X-Axes
fig.update_xaxes(
    rangeslider_visible=True,
    rangeselector=dict(
        buttons=list([
            dict(count=1, label="1d", step="day", stepmode="todate"),
            dict(count=7, label="7d", step="day", stepmode="backward"),
            dict(count=1, label="1m", step="month", stepmode="todate"),
            dict(count=6, label="6m", step="month", stepmode="backward"),
            dict(count=1, label="1y", step="year", stepmode="backward"),
            dict(step="all")
        ])
    )
)

# Show
fig.show()

# Log plotly chart
wandb.log({"chart": fig})

# End run
run.finish()



3.1 Log Our Data as a W&B Artifact

Since we will be making multiple attempts at finding the best workflow for our problem, we will also be generating multiple transformed datasets. Being a clumsy person myself, I always tend to lose track of all these datasets. Therefore, it's really handy that we can save our datasets using Artifacts and worry less about dataset versioning.
# Save our BTC price data locally
data_dir = Path('.')
data_path = data_dir/'df.csv'
df.to_csv(data_path,index=False)
Use the wandb.Artifact() function to create an artifact, then use .add_file() to add our dataset, and finally log it using run.log_artifact(artifact).
run = wandb.init(project="Time_Series")
# Create a new artifact for the data
artifact = wandb.Artifact('BTC', type='dataset')
# Attach our data to the Artifact
artifact.add_file(data_path)
# Log this Artifact to the current wandb run
run.log_artifact(artifact);

run.finish()

Now let's do a single train-test split and log the splits as a W&B Artifact.
# Train and Validation Data Splits
train_df, valid_df = df[:int(len(df)*0.9)], df[int(len(df)*0.9):]
print("Number of samples in train_df: ", len(train_df))
print("Number of samples in valid_df: ", len(valid_df))

with wandb.init(project='Time_Series') as run:

    # Save split datasets
    train_path = data_dir/'train.csv'
    valid_path = data_dir/'val.csv'
    train_df.to_csv(train_path, index=False)
    valid_df.to_csv(valid_path, index=False)

    # Create a new artifact for the data
    artifact = wandb.Artifact(name='BTC_train_valid', type='dataset')

    # Attach our data to the Artifact
    artifact.add_file(train_path)
    artifact.add_file(valid_path)

    # Log this Artifact to the current wandb run
    run.log_artifact(artifact)
Now let's log a sample of our train dataset using W&B Tables. This saves us the trouble of re-running all the previous data transformation steps and calling train_df.head() just to have a look at our data.
# Create a wandb run
run = wandb.init(project='Time_Series')

# Create a W&B Table and log 10 random rows of the dataset to explore
table = wandb.Table(dataframe=train_df.sample(10))

# Log the Table to your W&B workspace
wandb.log({'train_dataset': table})

# Close the wandb run
wandb.finish()




Step 4: Feature Engineering

I assume that most of the learners here already have some experience with machine learning and, in turn, are more comfortable with supervised models such as XGBoost than with classical time series models such as ARIMA and SARIMAX. Hence, I'm selecting XGBoost as my baseline model.
Since we're using XGBoost, it'd be nice if we could generate more features. Let's create a function, create_features, for this. We'll use the date column to make new features such as day-of-month and day-of-year, which might correlate with our target value, the closing price.
def create_features(df, label=None):
    """
    Creates time series features from the date index
    """
    df['date'] = df.index
    df['dayofweek'] = df['date'].dt.dayofweek
    df['quarter'] = df['date'].dt.quarter
    df['month'] = df['date'].dt.month
    df['dayofyear'] = df['date'].dt.dayofyear
    df['dayofmonth'] = df['date'].dt.day
    X = df[['dayofweek', 'quarter', 'month',
            'dayofyear', 'dayofmonth']]
    if label:
        y = df[label]
        return X, y
    return X
Use the create_features function to make X_train, y_train, X_valid and y_valid datasets.
X_train, y_train = create_features(train_df, label='Close')
X_valid, y_valid = create_features(valid_df, label='Close')
Since we are going to make predictions, we need a way to evaluate their accuracy. This is where evaluation metrics come in. Although the evaluation step comes later, we need to define these metrics now for our hyperparameter optimization step, and since the mean absolute percentage error (one of our evaluation metrics) isn't available in every version of sklearn's metrics package, we will create a function for it.
def mean_absolute_percentage_error(y_valid, preds):
    '''
    Calculate the mean absolute percentage error as an evaluation metric
    '''
    y_valid, preds = np.array(y_valid), np.array(preds)
    return np.mean(np.abs((y_valid - preds) / y_valid)) * 100


Step 5: Model Development

Then we define the function that will train our model using the bst_params parameters. You might notice run.config within our bst_params dictionary; that's used to perform hyperparameter sweeps with the sweep_config and wandb.sweep function below. We've also used our evaluation metrics and logged the RMSE and MAPE.
def train():
    with wandb.init() as run:
        bst_params = {
            'objective': 'reg:squarederror',
            'n_estimators': 1000,
            'booster': run.config.booster,
            'learning_rate': run.config.learning_rate,
            'gamma': run.config.gamma,
            'max_depth': run.config.max_depth,
            'min_child_weight': run.config.min_child_weight,
            'eval_metric': ['rmse'],
            'tree_method': 'gpu_hist',
        }
        # Initialize the XGBRegressor
        xgbmodel = xgb.XGBRegressor(**bst_params)

        # Train the model, using WandbCallback for logging
        xgbmodel.fit(X_train, y_train,
                     eval_set=[(X_valid, y_valid)],
                     callbacks=[
                         WandbCallback(log_model=True,
                                       log_feature_importance=True,
                                       define_metric=True)
                     ],
                     verbose=False)

        # Evaluate on the validation set and log the metrics
        preds = xgbmodel.predict(X_valid)
        mse = mean_squared_error(y_valid, preds)
        mae = mean_absolute_error(y_valid, preds)
        rmse = np.sqrt(mean_squared_error(y_valid, preds))
        r2 = r2_score(y_valid, preds)
        mape = mean_absolute_percentage_error(y_valid, preds)
        print("RMSE: %f" % (rmse))
        print("MAPE: %f" % (mape))
        wandb.log({"Valid_RMSE": rmse})
        wandb.log({"Valid_MAPE": mape})

Step 6: Hyperparameter Optimization Using W&B Sweeps

Now we define our sweep_config, where we specify the search method (grid, random, or bayes) and the ranges for our hyperparameter search.
# Define the sweep
sweep_config = {
    "name": "Time_Series",
    "method": "random",
    "parameters": {
        "booster": {
            "values": ["gbtree", "gblinear"]
        },
        "learning_rate": {
            "min": 0.001,
            "max": 1.0
        },
        "gamma": {
            "min": 0.001,
            "max": 1.0
        },
        "max_depth": {
            "values": [3, 5, 7]
        },
        "min_child_weight": {
            "min": 1,
            "max": 150
        },
        "early_stopping_rounds": {
            "values": [10, 20, 30, 40]
        },
    }
}

sweep_id = wandb.sweep(sweep_config, project='Time_Series')
Now, run the sweeps agent to start our hyperparameter sweeps.
count = 50 # number of runs to execute
wandb.agent(sweep_id, project='Time_Series', function=train, count=count)
Once the sweep is done, we can go to our sweep dashboard page and select the best run (best model performance). We can make use of W&B custom charts to find our best run. For example, I've used an RMSE vs. MAPE chart to find splendid-sweep-12, which has the lowest RMSE and MAPE.


We can also add a feature importance table using W&B custom charts. We can see that the day-of-year feature has the highest feature importance in our model; this may be because day-of-year better captures the seasonality of our data.




Step 7: Model Validation

Now that we have our best-performing run (out of 50 sweep runs), let's see how it performs against our held-out dataset, i.e., valid_df.
Using Artifacts, we can download the JSON file of our model.
run = wandb.init()
artifact = run.use_artifact('madhana/Time_Series/698vdxq9_model.json:v0', type='model')
artifact_dir = artifact.download()
wandb.finish()
Let's initialize an XGBRegressor and load our downloaded model.
model_xgb_2 = xgb.XGBRegressor()
model_xgb_2.load_model("/content/artifacts/698vdxq9_model.json:v0/698vdxq9_model.json")
Using the downloaded model, let's make a forecast on our validation data, plot the results, log the plot using wandb.log({"chart":plot}), and log the evaluation metrics.
# Init wandb run
run = wandb.init(project="Time_Series")

# Predict and plot using the loaded xgboost model
valid_df['Prediction'] = model_xgb_2.predict(X_valid)
valid_preds = valid_df['Prediction']
data_all = pd.concat([valid_df, train_df], sort=False)
plot = data_all[['Close','Prediction']].plot(figsize=(15, 5))

#Log Plot
wandb.log({"chart":plot})

# Model Evaluation Metrics, Log em too!
mse = mean_squared_error(y_valid, valid_preds)
mae = mean_absolute_error(y_valid, valid_preds)
rmse = np.sqrt(mean_squared_error(y_valid, valid_preds))
r2 = r2_score(y_valid, valid_preds)
mape = mean_absolute_percentage_error(y_valid, valid_preds)
print("RMSE: %f" % (rmse))
print("MAPE: %f" % (mape))
wandb.log({"XGB_Valid_RMSE": rmse})
wandb.log({"XGB_Valid_MAPE": mape})

# Stop the run
wandb.finish()



As we can see in the previous chart, the predicted closing prices and the actual values are quite far apart. I'm pretty sure we can make better predictions as we fine-tune our pipeline by trying out different methods at each step.
For example, try using the tsfresh library to extract more features from the data, and add lag and rolling features (a small sketch follows) to get better results with a supervised learning approach. You can also try deep learning models such as LSTMs and GRUs, which can handle long-term temporal dependencies present in the data. The data preprocessing steps for these models are different, and you can learn how to use them in this Colab and this Machine Learning Mastery blog.
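For instance, a minimal sketch of adding lag and rolling-window features with pandas (assuming a DataFrame with a 'Close' column, as in the example above) could look like this:
# Minimal sketch: lag and rolling-window features with pandas (assumes a 'Close' column)
def add_lag_rolling_features(df, lags=(1, 7, 14), windows=(7, 14)):
    df = df.copy()
    for lag in lags:
        df[f'close_lag_{lag}'] = df['Close'].shift(lag)  # closing price `lag` days ago
    for window in windows:
        # Shift by one day first so the rolling mean only uses past information
        df[f'close_roll_mean_{window}'] = df['Close'].shift(1).rolling(window).mean()
    return df.dropna()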
There are also more advanced options such as Prophet, NeuralProphet, and PyTorch Forecasting, which provide high-level APIs for models like the Temporal Fusion Transformer and N-BEATS, as well as AutoML tools like Darts, H2O, and TPOT, which can possibly produce better results. We must note that no one can predict the future with 100 percent accuracy, and all predictions are uncertain.

Summary

In this blog, we explored what a time series is and what differentiates it from static data, how to create train-test splits for time series, how to check for stationarity, single-step vs. multi-step forecasting, and univariate vs. multivariate data, and finally we implemented a BTC price forecast with Python, using Weights & Biases for experiment tracking. We now know how to use W&B not just for time series forecasts but for any machine learning workflow!
Image from W&B docs

We hope this article has been helpful for understanding the various methods and techniques used in time series forecasting. If you feel that you didn't quite get any section of this report, feel free to reach out in the comments section. We'll be more than happy to help you. Happy learning!