I have some dataframes I want to plot with a variable number of Fruits. When plotting, each dataframe will always have the same Fruit types, even though some may have a value of 0 in their respective row.
df1
Day Value Fruit
1 5 Banana
1 3 Pear
2 4 Banana
2 2 Pear
3 3 Banana
3 3 Pear
df2 # note all pears 0'd
Day Value Fruit
1 5 Banana
1 0 Pear
2 4 Banana
2 0 Pear
3 3 Banana
3 0 Pear
# AKA
df1.Fruit.unique() == df2.Fruit.unique()
Plotting code:
import dash_bootstrap_components as dbc
from plotly import express as px
bar_charts = []
for df in dfs:
# to get a stacked bar chart
fruit_data = df.groupby(["Day", "Fruit"]).Value.sum().to_frame().reset_index()
bar_charts.append(
dcc.Graph(
figure=px.bar(
fruit_data,
x="Day",
y="Value",
color="Fruit",
),
)
)
return dbc.Col(bar_charts)
Plotly seems to be assigning differnt colors to the Fruit between dataframes based on the value of the Fruit. I want the Fruits to always be the same color across graphs regardless of value (e.g. Banana always yellow, Pear always green). HOWEVER, I don't know what Fruits are available until runtime, so I can't just hard code a color map ahead of time.
How do I tell plotly to always color the Fruit the same?
Im sure this is some silly little option im missing.
CodePudding user response:
To make sure that the same colors are assigned to the same Fruit
, you can make a color map using fruit_data['Fruit'].unique()
and any list of colors like ['yellow', 'green']
or a longer list px.colors.qualitative.Alphabet
for color_discrete_map
in px.bar
like this:
colordict = {f:px.colors.qualitative.Alphabet[i] for i, f in enumerate(fruit_data.Fruit.unique())}
colordict = {f:['yellow', 'green'][i] for i, f in enumerate(fruit_data.Fruit.unique())}
Plot:
And if you're working with a large number of variables and worry about running out of colors, just incorporate the approach described in Plotly: How to increase the number of colors to assure unique colors for all lines like this:
n_colors = len(fruit_data.Fruit.unique())
colorscale = colors = px.colors.sample_colorscale("viridis", [n/(n_colors -1) for n in range(n_colors)])
colordict = {f:colorscale[i] for i, f in enumerate(fruit_data.Fruit.unique())}
Complete code:
import io
import plotly.express as px
import pandas as pd
# data
fruit_data = pd.read_csv(io.StringIO("""Day Value Fruit
1 5 Banana
1 3 Pear
2 4 Banana
2 2 Pear
3 3 Banana
3 3 Pear"""),sep= '\\s ')
# color assignment:
# colordict = {f:px.colors.qualitative.Alphabet[i] for i, f in enumerate(fruit_data.Fruit.unique())}
colordict = {f:['yellow', 'green'][i] for i, f in enumerate(fruit_data.Fruit.unique())}
# color assignment for a large number of variables:
n_colors = len(fruit_data.Fruit.unique())
colorscale = colors = px.colors.sample_colorscale("viridis", [n/(n_colors -1) for n in range(n_colors)])
colordict = {f:colorscale[i] for i, f in enumerate(fruit_data.Fruit.unique())}
# plotly figure
fig = px.bar(
fruit_data,
x="Day",
y="Value",
color = 'Fruit',
color_discrete_map = colordict
)
fig.show()