How to use Azure AutoML for Timeseries Forecasting, with an example

Introduction

Jason Xiao
5 min read · Mar 17, 2021

Timeseries data is one of the most common types of data collected in the modern world, and forecasting it has become one of the most common, and most difficult, problems to solve. Timeseries forecasting applies to financial technology, where stock and equity prices are predicted with respect to time and other features, and to manufacturing and e-commerce, where it drives supply chain optimization. If a producer or seller can forecast sales and demand for a product, it can allocate resources and supply more effectively while keeping customer satisfaction high and operational expenses low. In the example below, if we can forecast the sales at each FDC (front distribution center), then we can estimate the consumption from the RDCs (regional distribution centers) and, ultimately, better solve the distribution optimization problem.

Example of a Supply Chain Network

Advantages of using Azure AutoML in the Timeseries Forecasting process

Since timeseries forecasting is already such a common problem among statisticians and computer scientists, what is the advantage of using Azure AutoML for it? The two most important things the AutoML process offers here are model selection and explainability.

Azure AutoML can run hundreds of different machine learning models at the same time, scoring and comparing them against one another to determine the best type of model for the situation. As an example, suppose you want to predict the demand for a type of chocolate but don’t know which model is best. What could you do? You might gather and try a few common models: ARIMA, Ridge Regression, and so on. With AutoML, however, you can see the best-performing models right off the bat and try them from top to bottom with your own fine-tuning.

Explainability is another advantage of Azure AutoML: you can see the importance of each feature, the weight each model decided to give it, and the components of the models themselves are transparent. This allows users to engineer their features for better model performance.

How-to’s and Tutorials

Data Processing:

In most timeseries problems, it is a good idea to generate a lag within your data so that your machine learning model trains on input features matched with the correct output values. So how should you determine the lag for your problem? The first step is understanding your forecasting window; in other words, ask yourself how far into the future you are trying to see with the information you have in hand. If we are always trying to predict a month ahead, then we should consider generating a month-long lag in the data, e.g. December’s inputs matched up with January’s outputs. To use Azure AutoML, you will also have to make sure the data you feed into the service is clean: remove stray symbols and null values.
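As a minimal sketch of this lagging step, assuming a pandas DataFrame df with hypothetical "date" and "sales" columns and daily rows, it could look something like this:

import pandas as pd

# Hypothetical daily data with a 'date' column and a 'sales' column.
df = df.sort_values("date")

# Shift the target roughly one month (about 30 daily rows) into the past,
# so December's input features line up with January's output values.
df["sales_next_month"] = df["sales"].shift(-30)

# The last rows no longer have a future value to learn from; drop them,
# along with any remaining nulls, before handing the data to AutoML.
df = df.dropna(subset=["sales_next_month"])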

Two ways to use Azure AutoML:

  1. From the Azure Portal:
Open up Azure ML Studio in the Portal and create a new Datastore; upload your training data here.
Then make a new AutoML run.
From there you can link up your datastore with your AutoML run.
Configure your run after selecting your dataset; the target column is the column of values you want to forecast. You will also need compute power; this example uses an Azure Virtual Machine.
Then tell the run what task to perform, in this case time series forecasting, and make sure to specify the column that indicates time.
It’s also smart to click additional configurations and select “Explain best model” to use the built-in explainability features of Azure AutoML.
Then you can let it run, and soon an output like this will appear. All the models are juxtaposed against each other and compared using the metric selected above (normalized RMSE). You can select the model you want to use and deploy it when needed.
This is an example of what it could look like if you click “View explanation”; this view shows the model’s predictions.
You can also examine feature importance if you’d like. This can help engineers better tune their models.
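The same feature-importance information shown in the portal can also be pulled programmatically with the azureml-interpret package. This is a rough sketch, not the article’s original code, and the experiment name and run id below are placeholders for the explained child run of your own AutoML experiment:

from azureml.core import Workspace, Experiment, Run
from azureml.interpret import ExplanationClient

ws = Workspace.from_config()
# Hypothetical experiment name and run id of the explained best run.
run = Run(Experiment(ws, "timeseries-forecast"), run_id="<best-child-run-id>")

client = ExplanationClient.from_run(run)
explanation = client.download_model_explanation()
print(explanation.get_feature_importance_dict())  # feature name -> importance score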

2. Azure Machine Learning SDK

Check github link here:
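While the full notebook lives in that repo, a minimal sketch of a forecasting run with the v1 azureml SDK might look like the following; the workspace, registered dataset, compute target, and column names are placeholders for whatever you set up in your own subscription:

from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig
from azureml.automl.core.forecasting_parameters import ForecastingParameters

ws = Workspace.from_config()                      # reads your own config.json
dataset = Dataset.get_by_name(ws, "sales-train")  # hypothetical registered dataset

forecasting_params = ForecastingParameters(
    time_column_name="date",   # the column that indicates time
    forecast_horizon=30,       # how far ahead you want to predict
)

automl_config = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    training_data=dataset,
    label_column_name="sales",     # the target column you want to forecast
    compute_target="cpu-cluster",  # your own compute
    n_cross_validations=5,
    experiment_timeout_hours=1,
    forecasting_parameters=forecasting_params,
)

run = Experiment(ws, "timeseries-forecast").submit(automl_config, show_output=True)
best_run, fitted_model = run.get_output()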

3. Using a deployed model’s REST endpoint:

Get the endpoint from the Azure ML Studio -> Endpoints -> “Your model”

After retrieving the endpoint, you can use either the requests package in Python or Postman to POST to the endpoint with your data structured as JSON. An example of the raw input is provided below.

Structure for json data input when posting to model REST endpoint
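As a rough sketch of the POST call, assuming the scoring script generated for the deployed model accepts the standard {"data": [...]} payload, and with a placeholder URL, key, and columns:

import json
import requests

scoring_uri = "https://<your-endpoint>.azurecontainer.io/score"  # from Endpoints in ML Studio
headers = {
    "Content-Type": "application/json",
    # "Authorization": "Bearer <your-key>",  # only if key auth is enabled on the endpoint
}

# Hypothetical payload: one record per timestamp you want a forecast for.
payload = {
    "data": [
        {"date": "2021-04-01", "store": "FDC-01"},
        {"date": "2021-04-02", "store": "FDC-01"},
    ]
}

response = requests.post(scoring_uri, data=json.dumps(payload), headers=headers)
print(response.status_code, response.json())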

Completed example: Azure AutoML applied to Stock Price Timeseries Forecasting

Overall Architecture:

Architectural Diagram

We are going to use AlphaVantage on a virtual machine to pull live stock price data, store it in a datastore (blob storage), and then run it through Azure ML Studio. Ultimately, we’ll generate some visualizations.

A GitHub link with the code associated with the project will be provided; it has essentially everything completed, but you will need your own Azure subscription and so on.

The code goes through the following steps (a sketch of the first two steps follows the list):

  1. Mine and preprocess stock price data using AlphaVantage.
  2. Upload data to blob storage as a CSV file.
  3. In Azure Portal, set up AutoML by linking up blob storage as a data store.
  4. In the Azure Portal, start the time series experiment, then evaluate and deploy models.
  5. Test/visualize models using test set data.
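As a minimal sketch of steps 1 and 2, assuming the alpha_vantage Python package and the azure-storage-blob package; the API key, connection string, container, ticker, and file names are placeholders:

from alpha_vantage.timeseries import TimeSeries
from azure.storage.blob import BlobServiceClient

# Step 1: pull daily prices for a ticker as a pandas DataFrame.
ts = TimeSeries(key="<ALPHAVANTAGE_API_KEY>", output_format="pandas")
prices, _ = ts.get_daily(symbol="MSFT", outputsize="full")
prices = prices.reset_index()  # move the date index into a regular column

# Step 2: write the data to a CSV and upload it to blob storage,
# which the AutoML datastore will later point at.
prices.to_csv("stock_prices.csv", index=False)
blob_service = BlobServiceClient.from_connection_string("<STORAGE_CONNECTION_STRING>")
blob_client = blob_service.get_blob_client(container="training-data", blob="stock_prices.csv")
with open("stock_prices.csv", "rb") as f:
    blob_client.upload_blob(f, overwrite=True)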
