Introduction:
Time series forecasting in predictive planning has been part of SAP Analytics Cloud for several years now, and many organizations rely on it to streamline their planning processes. It generates estimated scenarios quickly from historical data and can take potential influencing factors into account to improve the accuracy of these scenarios.
The 2024.Q1 update of SAP Analytics Cloud introduced a new functionality that lets users analyze individual projected values and see how each value is composed. This means being able to identify the weight of historical data, trends, and influencers in each forecasted value. The purpose of this article is to explore this new functionality.
Scenario:
In this exercise, the idea is to estimate the daily number of bicycle trips along one important corridor in the city of Montreal for the year 2023 (January to December). For this scenario, two sets of historical data from 2018 to 2022 will be combined. The first set contains the daily bicycle trips through the Lajeunesse-Berri-Saint-Denis corridor (part of Montreal’s Express Bike Network); this data is collected daily by a bike counter located at the Saint-Laurent/Bellechasse intersection. The second set consists of historical data on maximum and minimum daily temperatures in degrees Celsius, and on rainfall precipitation and snowfall amount in millimeters. Additionally, estimated data for 2023 is available for this second dataset and will be used as influencers to enhance the simulation. Once the forecast is ready, the idea is to zoom in and analyze the composition of an individual forecasted value.
Modeling:
To handle the datasets, a simple model named Bikes was created consisting of 2 dimensions and 5 measures:
Dimensions:
Date: Daily granularity
Version: Actual and Forecast
Measures:
PRCP: Rain Precipitation (mm)
SNOW: Snow on the ground (mm)
T_MIN: Minimum Temperature (°C)
T_MAX: Maximum Temperature (°C)
BIKES: Quantity of Bikes per day
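To make the shape of the combined data concrete, here is a minimal Python sketch of how the two sources could be joined by date before loading them into the model. It is only an illustration outside SAP Analytics Cloud: the class name BikesRow, the function combine, and the input shapes are hypothetical, and the trip count for 2022 is invented sample data. The weather values for January 20, 2023 are the ones discussed later in this article.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class BikesRow:
    """One daily record; fields mirror the measures of the Bikes model."""
    day: date
    prcp: float            # rain precipitation (mm)
    snow: float            # snow on the ground (mm)
    t_min: float           # minimum temperature (°C)
    t_max: float           # maximum temperature (°C)
    bikes: Optional[int]   # trips per day; None where only weather is known


def combine(
    counts: Dict[date, int],
    weather: Dict[date, Tuple[float, float, float, float]],
) -> List[BikesRow]:
    """Join counter readings and weather by date (hypothetical input shapes).

    counts:  {day: trips} from the bike counter
    weather: {day: (prcp, snow, t_min, t_max)}, including 2023 estimates
    """
    rows = []
    for day in sorted(weather):
        prcp, snow, t_min, t_max = weather[day]
        # Days without a counter reading (the 2023 horizon) keep bikes=None.
        rows.append(BikesRow(day, prcp, snow, t_min, t_max, counts.get(day)))
    return rows


# Invented historical sample plus the January 20, 2023 weather estimate.
counts = {date(2022, 6, 1): 5000}
weather = {
    date(2022, 6, 1): (0.0, 0.0, 14.0, 24.0),
    date(2023, 1, 20): (6.6, 22.0, -6.6, -1.4),
}
rows = combine(counts, weather)
```

Rows from 2018 to 2022 carry both the trip count and the weather, while rows for 2023 carry only the estimated weather, which is what allows those measures to act as influencers over the forecast horizon.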
Predictive Scenario:
Assuming that the datasets have already been loaded into the model, the next step is to configure the predictive scenario (time series type):
To achieve the scenario’s objective, the target is the BIKES measure (total bikes per day), the date dimension is Date at daily granularity, and the number of forecast periods is 365:
To be able to see the impact of the influencers, the model should be trained with an observation window of 1 year up to the last observation. In addition, negative forecast values are converted to zero, since a negative bike count makes no sense:
Finally, the scenario uses maximum and minimum temperature, rain precipitation, and snow on the ground as influencers:
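Two of the settings in this section, the 1-year observation window and the conversion of negative forecasts to zero, can be pictured with a short Python sketch. This is not how SAP Analytics Cloud implements them internally, and it does not use any SAP API; the function names training_window and clamp_negative are hypothetical and only illustrate the effect of each option.

```python
from datetime import date, timedelta
from typing import Dict


def training_window(history: Dict[date, float], window_days: int = 365) -> Dict[date, float]:
    """Keep only the last `window_days` days, ending at the last observation."""
    last_day = max(history)
    first_day = last_day - timedelta(days=window_days - 1)
    return {d: v for d, v in history.items() if d >= first_day}


def clamp_negative(forecast: Dict[date, float]) -> Dict[date, float]:
    """Replace negative forecast values with zero (bike counts cannot be negative)."""
    return {d: max(0.0, v) for d, v in forecast.items()}


# Placeholder daily history from 2018-01-01 to 2022-12-31 (values are dummies).
history = {date(2018, 1, 1) + timedelta(days=i): float(i) for i in range(1826)}
window = training_window(history)

# Dummy raw forecast with one negative value.
raw_forecast = {date(2023, 1, 1): -12.5, date(2023, 1, 2): 40.0}
clean_forecast = clamp_negative(raw_forecast)
```

With five years of history, the window keeps only the 365 observations of 2022, so the training set matches the single year that precedes the forecast horizon.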
Initial Result:
After running the Train and Forecast function, the results look like this:
At first glance, the result is consistent with the trend of the previous year: there are significant day-to-day variations, but overall the forecast is in line with the historical data used. From here it is possible to analyze the forecast value by value and see how each one is composed, thanks to the new functionality released in 2024.Q1. Let’s take, for example, January 20, 2023:
As we can see, the total value suggested by the system for that day is 1,086 trips, whereas the historical trend alone suggests 2,604 trips. This trend value is strongly offset by the 4 influencers included in the forecast. The snow on the ground for that day is expected to be 22 mm, resulting in a decrease of 734 trips. The maximum temperature for that day (-1.4 °C) is the second most important factor in projecting fewer trips, followed by the minimum temperature (-6.6 °C). Finally, in fourth place, the forecasted rain precipitation for that day (6.6 mm) also has a decreasing impact, although not a very significant one.
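The figures for January 20, 2023 can be checked with a few lines of arithmetic. The Python sketch below assumes that the forecast is simply the trend plus the sum of the four influencer contributions, with no other component involved; that additive reading is an assumption made for illustration, and the function name remaining_effect is hypothetical. Only the trend, the final forecast, and the snow contribution are taken from the article; the individual contributions of the two temperatures and the rain are not stated, so only their combined effect is derived.

```python
def remaining_effect(trend: int, forecast: int, known_effects: list) -> int:
    """Combined effect of the influencers whose contribution is not listed.

    Assumes forecast = trend + sum(all influencer effects).
    """
    total_effect = forecast - trend
    return total_effect - sum(known_effects)


trend = 2604          # trips suggested by the historical trend
forecast = 1086       # final forecast for January 20, 2023
snow_effect = -734    # contribution of 22 mm of snow on the ground

total_effect = forecast - trend                                   # -1518
other_effects = remaining_effect(trend, forecast, [snow_effect])  # -784

print(f"All influencers together: {total_effect} trips")
print(f"T_MAX, T_MIN and PRCP combined: {other_effects} trips")
```

Under that assumption, the influencers remove 1,518 trips from the trend value, of which 734 are attributed to snow and the remaining 784 to the two temperatures and the rain together.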
In a more general view, we can also observe how the influencers act on a percentage basis on the final projection result:
This is important because, by running multiple simulations with different variables, the business can identify which influencers are most relevant for the scenario it wants to forecast and the results it wants to achieve.
Conclusion:
This is an important improvement for predictive scenarios and predictive planning. Organizations can now run multiple scenarios with different influencers, analyze how the data reacts to the different parametrizations, and understand more clearly the weight and correlation of each variable in the result of each simulation. This helps them learn how best to run a forecast and make better-informed decisions.
Authors: Andres Romero, Camille Chenon