Data visualization

A journey into Data visualization



Innovation deployment Dashboard
Figure 1. Innovation deployment of a car factory.

What is data visualization

Files store values. Values are our data. A collection of data is information. Reading raw data stored in a file is possible, but interpreting it by reading alone is rarely practical. That is why we visualize it in an interpretable way.

“The greatest value of a picture is when it forces us to notice what we never expected to see.” - John Tukey

Today, we will drill down into data visualization.

How did it start

From a historical perspective, we can say data visualization has existed for a long time. In fact [1], in 1644, Michael Florent Van Langren, a Flemish astronomer, is believed to have provided the first visual representation of statistical data, used for estimating distances between cities.

From a cognitive perspective, the roots of data visualization go back to the 20th century. [2] Max Wertheimer (1880–1943), Kurt Koffka (1886–1941), and Wolfgang Köhler (1887–1967) founded Gestalt psychology in the early 20th century. Gestalt psychology is a field of psychology emphasizing that organisms (including us humans!) perceive things in the form of patterns or configurations rather than as individual components. In other words, we tend to “group by” things when we visualize them. We might check out Gestalt principles for further information about this interesting subject.

The perspectives mentioned above tell us that visualization seems to “ease” our understanding of data and of what is going on in a scenario that involves data. Let us see how this principle applies to visualizing industrial data.

Data visualization in Industries

As we might know, data is ubiquitous in industry. We have different kinds of data files, usually storing voluminous data for different purposes.

Purpose

In data visualization, we have one purpose: making decisions based on our visualized data.

Let’s set things up!

It is crucial to clearly define and understand the basics of data visualization. We need to know what to visualize and why. Then we need to prepare our data. If we skip this step or don’t invest enough effort in it, our visualization might become meaningless and hard to interpret, which prevents us from reaching the purpose mentioned above.


“To find signals in data, we must learn to reduce the noise — not just the noise that resides in the data, but also the noise that resides in us. It is nearly impossible for noisy minds to perceive anything but noise in data.” — Stephen Few


In order to visualize, we need to understand what content and context are.

Content

Content is the quantity of something we want to measure, for example the energy produced by a battery.

Context

Content without context isn’t meaningful. We might have an in-depth look at what context in Industrial data visualization is. For example, the type of car we produce is our context.


Content and context are represented by columns and rows in a data source.

Rows and columns

Dashboard data source display example
Figure 2. Columns of a data source

Data in our data files is labeled with column names in Dashboards. Even data held in other kinds of sources, such as JSON objects, can be represented as columns and rows.
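
As a minimal sketch of that idea, the snippet below flattens a list of JSON objects into column names and rows. The payload, the City field and the to_table helper are illustrative assumptions, not part of any Dashboard tool; only the standard json module is used.

```python
import json

# Hypothetical JSON payload; the field names and values are illustrative only.
raw = '''
[
  {"BuildingType": "Office", "UnitPrice": 120.0},
  {"BuildingType": "Warehouse", "UnitPrice": 80.0, "City": "Lyon"}
]
'''

def to_table(records):
    """Flatten a list of JSON objects into (columns, rows)."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)  # keep first-seen column order
    # A field missing from a record becomes None in that row.
    rows = [[record.get(col) for col in columns] for record in records]
    return columns, rows

columns, rows = to_table(json.loads(raw))
print(columns)
for row in rows:
    print(row)
```

Each distinct key becomes a column, and each object becomes a row, which is the shape a widget typically expects.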

In fact, columns can represent context or content. Once again, content is quantity whereas context is used to categorize quantity.

In our example above, the context could be the building type and the content could be UnitPrice.
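
To make the distinction concrete, here is a small sketch in which BuildingType plays the role of context and UnitPrice the role of content. The rows and the average_by_context helper are made up for illustration; a real Dashboard tool would perform this aggregation itself.

```python
from collections import defaultdict

# Illustrative rows; the values are invented for this sketch.
rows = [
    {"BuildingType": "Office", "UnitPrice": 120.0},
    {"BuildingType": "Warehouse", "UnitPrice": 80.0},
    {"BuildingType": "Office", "UnitPrice": 100.0},
]

def average_by_context(rows, context, content):
    """Average the content column for each distinct value of the context column."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[context]] += row[content]
        counts[row[context]] += 1
    return {value: sums[value] / counts[value] for value in sums}

averages = average_by_context(rows, "BuildingType", "UnitPrice")
print(averages)
```

The context column decides the categories, and the content column supplies the numbers measured inside each category.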

Visualizing data

As we might guess, Dashboards are what first come to mind when we talk about visualizing industrial data. So what is a Dashboard?

[3] “A visual display of the most important information needed to achieve one or more objectives that has been consolidated on a single computer screen so it can be monitored and understood at a glance.”

Let’s stick to the definition above. Moreover, a Dashboard is a set of widgets.

So what are widgets?

In dashboards we use widgets

Widgets are “parts” of a Dashboard, each containing a tool that visualizes data. For example, a bar chart is a tool used in a widget.

Etymology of widget

The word is believed to date from the 1930s and to derive from the English word gadget, which is a small ingenious mechanical or electronic device or tool.

Why widget(s) in a Dashboard?

Presenting multiple widgets emphasizes the message we want to deliver to our audience. Indeed, a single widget might not be enough to make our decision. Also, every widget can highlight a specific behavior of our system.

Their importance (Big Data)

Widgets are essential because they give us a global view of the “behavior” of data stored in a data source. This is a lifesaver for our industrial sector, which handles Big Data.

How to prepare Widgets

In order to prepare a widget using a tool to display data, we need data!

So, how to prepare our data for widgets?

We need data sources

Join tables in a data source
Figure 3. Joining two tables in a data source to form a single representation of data

In dashboarding, a data source is a representation of data received from tables that belong to data files accessed via Connections. Mainly, we join data from these different Connections to obtain a main table that will be used by our widget.

In Figure 3, we are joining two tables in a data source to form a single representation of the extracted data.

A bad use of table joining
Figure 4. Result after joining the two tables.

When joining on our context columns, we need to choose them wisely; otherwise we might create meaningless data sources, as shown in Figure 4. Joining columns is a topic we can discuss in a separate article.
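
The effect of the join column can be sketched in a few lines. The two tables below are hypothetical and are not the tables shown in Figures 3 and 4; inner_join is a plain-Python stand-in for what a Dashboard tool does when we join tables, written only to show why the choice of column matters.

```python
def inner_join(left, right, key):
    """Pair every left row with every right row that has the same key value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            # On a column name clash, the value from the right table wins.
            joined.append({**left_row, **right_row})
    return joined

# Hypothetical tables; names and values are assumptions for this sketch.
prices = [
    {"BuildingId": 1, "Year": 2023, "UnitPrice": 120.0},
    {"BuildingId": 2, "Year": 2023, "UnitPrice": 80.0},
]
buildings = [
    {"BuildingId": 1, "Year": 2023, "BuildingType": "Office"},
    {"BuildingId": 2, "Year": 2023, "BuildingType": "Warehouse"},
]

good = inner_join(prices, buildings, "BuildingId")  # one row per building
bad = inner_join(prices, buildings, "Year")         # every price meets every building

print(len(good), len(bad))
```

Joining on BuildingId keeps each price attached to its own building. Joining on Year, which every row shares, pairs each price with every building, so the resulting data source contains combinations that never existed in the data.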


An important fact is that a widget uses only one data source, but a data source can be used by different widgets. So how do we prepare our connections?

We need Connections

data files for dashboarding
Figure 5. Connection types (non-exhaustive)

Figure 5 lists possible Connections we can use to import data.

How to prepare our Connections

In order to prepare our connections, we need to import them into our Dashboard design software and use them to create data sources.

So, which widget for which scenario?

Here we are, at one of the most delicate points of our topic. How do we choose our tools based on our data and our goal?

We should note that our goal is to communicate a message to our audience. A way of doing this effectively is [4] choosing an effective visual. For more insight into what context is, we might have a look at Context in Industrial data.

Concerning our choice of widgets, we have our recommendation below:

Figure 6. Decision flowchart for visualization component

Data visualization examples

We could briefly check out some widget examples and analyze how productive they are.

Good examples with valuable outcome

Sankey diagram
Figure 7. Widget illustrating heat and power distribution in a factory.

In the figure above, colors are well chosen, labels are properly named and the layout is clear. Our data is properly visualized and draws our attention directly to heat, which is a context. Assuming the reader knows what a Sankey diagram is, the highlighted message is: “Our company mostly uses natural gas, converting it into heat that mainly supplies our main building.”

Bad examples with unproductive outcome

Sankey diagram and bar chart
Figure 8. Two widgets using Sankey and bar chart.

In Figure 8, we have some data illustrated with our widgets. Our colors are clear. Yet compared to Figure 7, even though we have more widgets, the message is unclear. The flows in our Sankey diagram are too similar to each other in width, and in both widgets our context is unknown: apparently our data was poorly named from the beginning of our visualization process! In our bar chart, the X and Y axes are unnamed, and our widgets have poor titles. The only positive part of this example is that our bar chart shows content increasing along the chosen context. But as our context is unknown, it remains meaningless.

To summarize, we should pay attention to every detail of our widgets.
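
Some of these checks can be sketched as a small review function. The widget description below is a hypothetical dictionary layout invented for this example, not the configuration format of any Dashboard software, and the function only covers titles and axis names.

```python
def review_widget(spec):
    """Return a list of presentation problems found in a widget description."""
    problems = []
    if not spec.get("title", "").strip():
        problems.append("missing title")
    # Axis names are only checked for bar charts in this sketch.
    if spec.get("tool") == "bar_chart":
        if not spec.get("x_label", "").strip():
            problems.append("unnamed X axis")
        if not spec.get("y_label", "").strip():
            problems.append("unnamed Y axis")
    return problems

# Hypothetical widget descriptions; keys and values are assumptions.
unclear = {"tool": "bar_chart", "title": ""}
clear = {
    "tool": "bar_chart",
    "title": "Heat supplied per building",
    "x_label": "Building",
    "y_label": "Heat (kWh)",
}

print(review_widget(unclear))
print(review_widget(clear))
```

A check like this cannot judge whether a title is meaningful, but it catches the unnamed axes and empty titles discussed above before a widget reaches its audience.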

“There is no such thing as information overload. There is only bad design.” — Edward Tufte

Coming to an end

Data visualization involves multiple fields, from science to psychology, and serious research is being conducted on this topic.

We need to prepare our data before visualizing it.

As for our final words, we can say that data visualization is essential in industry to understand our industrial systems and how to improve them.

References

[1] Few, Stephen. Information Dashboard Design: The Effective Visual Communication of Data. Sebastopol, CA: O’Reilly, 2006.

[2] Sternberg, Robert J., and Karin Sternberg. Cognitive Psychology. 6th ed. Belmont, CA: Cengage Learning, 2012, pp. 113–116. ISBN 978-1-133-31391-5.

[3] Few, Stephen. Information Dashboard Design: The Effective Visual Communication of Data. Sebastopol, CA: O’Reilly, 2006.

[4] Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for Business Professionals. Hoboken, NJ: Wiley, 2015.











