Skip to main content

Sankey tutorial

The visuals allows the creation of Sankey charts. This is creating a sample chart and preparing data in Power BI Desktop for rendering Sankey chart by PlotlyJS Visual.

Snakey chart view

In this tutorial SankeyMicrosoftSample report was used for data imported from Microsoft Sankey visual sample report.

Prepare data

To set node labels on the chart correctly, you have to transform your table to the next state:

SourceDestinationValueSourceIndexDestinationIndexNodeLabel
AngolaChina5017Angola
AngolaIndia1018Brazil
AngolaJapan3019Canada
BrazilEngland113England
BrazilFrance115England South
BrazilPortugal5110France
BrazilSpain1114France South
CanadaEngland123Mali
CanadaFrance525Mexico
CanadaPortugal1210Morocco
EnglandAngola130Portugal
EnglandMorocco239Portugal South
EnglandSenegal1312Senegal
England SouthSouth Africa5413South Africa
FranceAngola150Spain
FranceMali357Spain South
FranceMorocco359USA
FranceSenegal3512China
France SouthSouth Africa2613India
MaliChina5717Japan
MaliIndia1718France1
MaliJapan3719Africa China
MexicoEngland183Africa India
MexicoFrance185Africa Japan
MexicoPortugal1810
MexicoSpain5814
MoroccoChina5917
MoroccoFrance111920
MoroccoIndia1918
MoroccoJapan3919
PortugalAngola2100
PortugalMorocco1109
PortugalSenegal11012
Portugal SouthSouth Africa41113
SenegalChina51217
SenegalIndia11218
SenegalJapan31219
South AfricaAfrica China11321
South AfricaAfrica India21322
South AfricaAfrica Japan31323
SpainMorocco3149
SpainSenegal11412
Spain SouthSouth Africa41513
USAEngland5163
USAFrance1165
USAPortugal11610
USASpain11614

There is 'NodeLabel' column with labels for a node with unique values on top (it’s crucial to build Sankey properly).

We need to add an Index column to the source table 'SankeyMicrosoftSample'; the index will use as a foreign key to join with the source column.

To add node names, we need to apply a query to 'SankeyMicrosoftSample'. Let’s overview each row step by step to understand it.

let
// We use “SankeyMicrosoftSample” table as a source
Source = SankeyMicrosoftSample,
// Then we create two tables, one for source nodes, and another for destination nodes
// Remove all columns except the “Source” column
DestinationNodes = Table.RemoveColumns(Source,{"Source", "Value", "NodeLink"}),
// Then rename it to the “Nodes” column.
DestinationNodesRanmed = Table.RenameColumns(DestinationNodes, {{"Destination", "Nodes"}}),

// Do the same for Destination nodes.
// Remove all columns except the “Destination” columns.
SourceNodes = Table.RemoveColumns(Source,{"Destination", "Value", "NodeLink"}),
// Then also, rename it to the “Nodes” column.
SourceNodesRenamed = Table.RenameColumns(SourceNodes,{{"Source", "Nodes"}}),

// Because we need only one column with all node labels, combine those two tables with “Node” columns
Combination = Table.Combine({SourceNodesRenamed, DestinationNodesRanmed}),
// Then get distinct values by applying Table.Distinct function
DistinctNodes = Table.Distinct(Combination),
// To join with the source column,
// Add an Index column to our Table with distinct “Nodes” table
NodesList = Table.AddIndexColumn(DistinctNodes, "Index", 0, 1, Int64.Type),

// The last step is to join two tables (SankeyMicrosoftSample, NodesList)
LeftJoin = Table.AddJoinColumn(SankeyMicrosoftSample, "NodeLink", NodesList, "Index", "NodeLabel"),
// Expand the NodeLabel column. It doesn’t need an index column from the NodesList table.
Expand = Table.ExpandTableColumn(LeftJoin, "NodeLabel", {"Nodes"}, {"NodeLabel"}),

// Add IDs for source and destination node because PlotlyJS expects numbers instead of string
// Add SourceID
SourceID = Table.AddJoinColumn(Expand, "Source", NodesList, "Nodes", "SourceID"),
// Expand the table to get SourceID column only
ExpandedSourceID = Table.ExpandTableColumn(SourceID, "SourceID", {"Index"}, {"SourceID"}),

// Add DestinationID
DestinationID = Table.AddJoinColumn(ExpandedSourceID, "Destination", NodesList, "Nodes", "DestinationID"),
// Expand the table to get DestinationID column only
ExpandedDestinationID = Table.ExpandTableColumn(DestinationID, "DestinationID", {"Index"}, {"DestinationID"})

// Return result
in
ExpandedDestinationID

Configure Sankey chart in PlotlyJS chart

When data is ready, create the visual instance on a report page and assign NodeLink, NodeLabel, SourceID, DestinationID, and Value columns to the visual instance:

PlotlyJS Visual with data columns for Sankey

Enter edit mode by clicking on Edit in the visual menu and click +Trace:

Add trace

Then switch trace type Sankey by selecting trace on the list.

The last step is assigning columns to properties:

NodeLabel to Labels

SourceID to Sources

DestinationID to Target

Value to Values

and save the chart in the Save/Load tab:

Save/Load chart

The final result is:

The final result of Sankey chart