Thursday, July 15, 2021

How Sankey Diagrams Are Used In Data Visualization

Sankey diagrams are a type of flow diagram that highlights the flows between the variables in a system. These flows could be materials, costs, energy, or anything else you need to measure. By drawing each flow with a width proportional to its quantity, Sankeys help explain the relationships between categories. They direct viewers’ attention to the most important parts of the system and help them make informed decisions.

In this article, we’ll discuss what a Sankey diagram is and how it is used in data visualization. So, let’s get started!

What is a Sankey Diagram?

A Sankey diagram is a data visualization method that shows the flows of energy, money, or materials in a system. The thickness of each arrow or line is proportional to the quantity of the flow. Thus, the thicker the arrow, the larger the flow.

The arrows or lines can combine or split along their path at any stage of a process. You can use colors to divide the diagram into categories or to illustrate the transition from one phase of the process to the next.

Sankey Diagram For Data Visualization

The Sankey diagram is a commonly used data visualization tool that gives you an overview of the flows in a system, such as materials, energy, or, in advertising, the customer journey. Because each flow has a different width, data analysts can easily identify the most important areas to focus on.

This diagram was first drawn by an Irish engineer, Matthew Sankey, to depict the energy flow of a steam engine. Today, however, it is an important data visualization tool with a wide range of applications. People use Sankey diagrams across several industries, including the following:


  • In finance, Sankey diagrams are used to track cash flows and keep track of inflows and outflows, such as expenses.
  • Healthcare facilities use Sankey diagrams to visualize the patient journey, from consultation and surgery to emergency situations.
  • In industry, Sankey diagrams are used to view the inputs of materials, energy, and cost, and to understand how these elements relate to each other.


Modern marketers use Sankey diagrams as a tool for advertising and data analysis. Data analytics plays an important role in marketing success. However, just collecting data on customers’ behavior isn’t enough. You also need to analyze it properly to generate actionable results.

With a Sankey flow diagram, marketers can conduct analysis much more easily, which helps them make educated decisions. Below are some of the benefits of Sankeys for PPC advertisers:

You can draw a Sankey chart in Excel or Google Sheets with minimal data requirements: one metric and two dimensions. Thus, you don’t have to gather a lot of data.

Sankey diagrams show clearly how the values shift from one category to the other. In other words, you get to know where your traffic is coming from.

With Sankey charts, you can see how users flow from one buying phase to the next, which gives you a good overview of how your ads perform.

You can easily build this kind of visualization from your Google Ads data.

Using A Sankey Diagram In Data Visualization

A Sankey diagram has two main components: nodes and links. Nodes are like the bars in a bar chart: their height denotes the value of the flow passing through them. The connection between two nodes (variables) is known as a link, and links indicate the flow.

Each flow connects at least two nodes and represents the transfer of energy, costs, materials, or another measurable quantity. The lines or arrows that link the nodes have a width proportional to the quantity of the flow. Along with the flow values, Sankey diagrams also show how the total is distributed across the system.
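To make nodes and links concrete, here is a minimal Python sketch. The node names and quantities are made up for illustration, and the rule that a node's value is the larger of its total inflow and outflow is an assumption borrowed from common Sankey layouts, not something every tool guarantees.

```python
from collections import defaultdict

# Hypothetical flows: (source node, target node, quantity).
links = [
    ("Budget", "Search Ads", 600),
    ("Budget", "Display Ads", 400),
    ("Search Ads", "Conversions", 150),
    ("Display Ads", "Conversions", 50),
]

def node_values(links):
    """Return each node's value: the larger of its total inflow and outflow."""
    inflow, outflow = defaultdict(float), defaultdict(float)
    for source, target, quantity in links:
        outflow[source] += quantity
        inflow[target] += quantity
    nodes = set(inflow) | set(outflow)
    return {name: max(inflow[name], outflow[name]) for name in nodes}

def link_widths(links, max_width=40.0):
    """Scale every link so its width is proportional to its quantity."""
    largest = max(quantity for _, _, quantity in links)
    return {(s, t): max_width * q / largest for s, t, q in links}

values = node_values(links)
widths = link_widths(links)
```

Here the widest link (Budget to Search Ads) gets the full `max_width`, and every other link is drawn narrower in proportion to its quantity.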

However, before you can visualize data with a Sankey diagram, you first need to draw it.

How To Draw A Sankey Diagram?

To draw a Sankey diagram, you need multi-categorical data. The variables in the dataset become the nodes in the diagram. You can create a Sankey diagram in a number of ways, such as online applications, R packages, and JavaScript libraries, and the way you prepare the data depends on the tool you use. All you need is a dataset, a target field, and the metric you want to measure.
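As a rough illustration of this preparation step, the sketch below turns multi-categorical records into source-target-value links, which is the shape many Sankey tools expect. The field names (`channel`, `device`, `outcome`) and the sample rows are hypothetical, and counting one unit per record is an assumption for cases where you have no separate metric column.

```python
from collections import Counter

# Hypothetical multi-categorical records: one row per customer journey.
records = [
    {"channel": "Search", "device": "Mobile", "outcome": "Purchase"},
    {"channel": "Search", "device": "Desktop", "outcome": "Purchase"},
    {"channel": "Social", "device": "Mobile", "outcome": "Bounce"},
    {"channel": "Search", "device": "Mobile", "outcome": "Bounce"},
]

def build_links(records, stages, metric=None):
    """Aggregate records into {(source, target): value} between consecutive stages.

    If `metric` names a numeric field, its values are summed;
    otherwise each record counts as 1.
    """
    totals = Counter()
    for row in records:
        weight = row[metric] if metric else 1
        for left, right in zip(stages, stages[1:]):
            totals[(row[left], row[right])] += weight
    return dict(totals)

links = build_links(records, ["channel", "device", "outcome"])
```

Note that this sketch treats identical labels in different stages as the same node, so you may want to prefix labels with the stage name if your categories overlap.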

One online application you can use to create a Sankey diagram is Google Charts. You can create a variety of graphics with Google Charts, including Sankey diagrams. Google Charts’ Sankey layout is derived from the D3.js Sankey layout. Generally, Google Charts is used for embedding data visualizations on a website.

RawGraphs is another online tool for drawing Sankey diagrams. It offers a drag-and-drop interface for those who aren’t comfortable with coding. As with Google Charts, you can draw a variety of graphics in RawGraphs, including Sankey diagrams.
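If you prepare your data in Python before embedding a chart, a small helper like the one below can produce the rows. It assumes the three-column From/To/Weight layout used in Google Charts' Sankey examples; the function name and the sample links are hypothetical, so check the output against the Google Charts documentation before relying on it.

```python
import json

def to_google_charts_rows(links):
    """Turn {(source, target): value} into a header row plus [From, To, Weight] rows."""
    rows = [["From", "To", "Weight"]]
    for (source, target), value in sorted(links.items()):
        rows.append([source, target, value])
    return rows

# Hypothetical links for illustration.
sample_links = {("Search", "Mobile"): 2, ("Search", "Desktop"): 1}
rows = to_google_charts_rows(sample_links)

# JSON text that can be pasted into the page's JavaScript as the chart data.
rows_json = json.dumps(rows)
print(rows_json)
```

Sorting the links keeps the output stable between runs, which makes it easier to compare versions of the embedded data.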

Draw A Sankey Diagram In Excel

Besides the online applications, you can also use Microsoft Excel to create a Sankey diagram. Below are the steps to do so:

  • Install an add-in, such as Power User. In Excel, open the File menu, click Excel Options, go to Add-Ins, and enable Power User.
  • Paste the data you want to turn into a Sankey chart.
  • Go to the Power User tab and click the Create Sankey Chart option.
  • Select the data in your worksheet and press ‘OK.’ Your Sankey chart is then created.


In this article, we discussed what a Sankey diagram is and how it is used in data visualization. It is an incredibly useful way to visualize complex data and turn it into something that is easy to understand. Regardless of the industry, a Sankey diagram helps you understand the relationships between key components and identify any inconsistencies. All you have to do is pick the right data visualization tool and choose a dataset, and you can start creating Sankey diagrams.


