26.03.2019, reading time: ~4 min
Visualising data is like painting a meaningful picture whose colours are a well-thought-out mixture of zeros and ones. Before the paint has even dried, such a picture usually gives the viewer a light bulb moment. And that is exactly our goal!
In the first two articles of our blog series "Customer behavior predictions", we used a project example to show in detail the theoretical and analytical considerations behind such visual presentations. In this article, we would like to give you an insight into the visualisation itself, using a versatile and comprehensive tool: the visualisation software Tableau.
In our example of an online food shop, we used logistic regression to calculate the probability that a customer, based on their buying behaviour, belongs to the category "meat" or "vegan". No sooner said than done: with the features we had derived, the classification was on the table. The people responsible for the webshop now had only one wish: to present this insight to the Management Board in a meaningful way!
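To make the classification step a little more tangible, here is a minimal sketch of how a logistic regression turns purchase features into a probability. The feature names, weights and bias are hypothetical and chosen for illustration only; they are not the fitted model from the project, where the weights would be learned from the data.

```python
import math

# Hypothetical weights for illustration only; a real model learns them from data.
WEIGHTS = {"potatoes": 0.6, "tofu": -1.2}
BIAS = 0.0


def meat_probability(basket):
    """Return P(category == "meat") for a basket of 0/1 purchase features."""
    # Linear score: bias plus the weighted sum of the purchase indicators.
    z = BIAS + sum(WEIGHTS[item] * bought for item, bought in basket.items())
    # The logistic function maps the score to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


def classify(basket, threshold=0.5):
    """Assign "meat" if the probability reaches the threshold, else "vegan"."""
    return "meat" if meat_probability(basket) >= threshold else "vegan"


print(classify({"potatoes": 1, "tofu": 0}))
print(classify({"potatoes": 0, "tofu": 1}))
```

A positive weight pushes a customer towards "meat", a negative one towards "vegan"; the threshold of 0.5 is likewise an assumption.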
This called for a visualisation that presents the analytical results clearly and simply. But which visualisation was the right one? How should we package the findings so that they lead to a real light bulb moment on the executive floor?
As in many other areas, visualisation leaves one spoilt for choice. The best way to narrow down the options is to ask a few questions:
As soon as we had answered these questions, the options became clearer in our project example. For us, there was only one diagram that could meet our needs: a Sankey diagram!
With a Sankey diagram we were able to show exactly what we wanted to show: on the one hand, the features we had derived that led to a categorisation of "meat" or "vegan" and, on the other hand, their connection with the respective classification.
What can this diagram do? A Sankey diagram can be created in Tableau with a certain amount of extra effort and provides very meaningful visualisations. It can visualise streams of any kind, with the width of each stream proportional to its quantity. The diagram gives a bird's eye view of the entire system being visualised. It also lets the viewer see the relationship between certain features and categorisations at a glance.
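The width rule can be illustrated with a small sketch. The flows below are made-up counts between generic features and the two categories, and `stream_widths` is a hypothetical helper, not a Tableau function.

```python
def stream_widths(flows, total_width=100.0):
    """Scale each flow so that its width is proportional to its quantity."""
    total = sum(flows.values())
    if total <= 0:
        raise ValueError("flows must sum to a positive quantity")
    return {link: total_width * quantity / total for link, quantity in flows.items()}


# Made-up counts per (feature, category) link, for illustration only.
flows = {
    ("feature_a", "meat"): 30,
    ("feature_a", "vegan"): 10,
    ("feature_b", "vegan"): 60,
}
widths = stream_widths(flows)
for link, width in widths.items():
    print(link, width)
```

Because every width is a share of the same total, the streams can later be stacked without gaps or overlaps.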
Describing every step we had to take in our example to get to this result would go beyond the scope of this article. What we do not want to withhold from you, however, are the most important signposts that ultimately led us to the "finished picture".
It is important to know that our base data was the derived dataset familiar from the second post of our blog series. Polygon data was a welcome addition and extension. To obtain smooth and meaningful curves, we created a so-called sigmoid (an S-shaped curve, also called a gooseneck) based on a calculated field.
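How a sigmoid produces the characteristic S-shaped stream can be sketched outside of Tableau. The following is a minimal illustration, assuming a parameter `t` that runs from -6 to 6 in 49 steps; these values and the function name are assumptions, not the calculated fields from the project.

```python
import math


def sigmoid_curve(y_start, y_end, steps=49, t_min=-6.0, t_max=6.0):
    """Return (t, y) points of an S-shaped curve from y_start to y_end."""
    points = []
    for i in range(steps):
        # Spread t evenly across [t_min, t_max].
        t = t_min + (t_max - t_min) * i / (steps - 1)
        # The sigmoid rises smoothly from almost 0 to almost 1.
        s = 1.0 / (1.0 + math.exp(-t))
        # Interpolate between the left and the right vertical position.
        points.append((t, y_start + (y_end - y_start) * s))
    return points


curve = sigmoid_curve(y_start=0.0, y_end=1.0)
print(curve[0], curve[len(curve) // 2], curve[-1])
```

The curve leaves the left side almost horizontally, bends in the middle and arrives almost horizontally on the right, which is what gives the streams their flowing look.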
A handful of table calculations gave us the size, the order, the start and the end of the sigmoid streams, which we could finally combine with all the other components in a dashboard. And with that, the picture in the form of the Sankey diagram was finished!
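The idea behind size, order, start and end can be sketched as a running total: each stream begins where the previous one ends. This is a simplified illustration with made-up sizes and a hypothetical helper name, not the table calculations used in the project.

```python
def stack_positions(sizes):
    """Return (start, end) shares for streams stacked in the given order."""
    total = sum(sizes)
    if total <= 0:
        raise ValueError("sizes must sum to a positive quantity")
    positions = []
    offset = 0.0
    for size in sizes:
        share = size / total
        # A stream starts at the running total and ends one share further on.
        positions.append((offset, offset + share))
        offset += share
    return positions


# Made-up stream sizes, already sorted into the desired order.
positions = stack_positions([1, 1, 2])
print(positions)
```

Computing such positions once for the left side and once for the right side yields the two end points that each sigmoid stream connects.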
In our case, the light bulb moment could be: in 65.20% of cases, the purchase of potatoes is an indicator of buyers who fall predominantly into the category "meat".
We hope our explanations have given you a small insight into the world of visualised data.
Look forward to the next article in our blog series, in which the BI marketing team will present its view on this project and focus on one essential factor: the benefit for the customer!