Stacked data modification (matplotlib)

Time:04-19

I have a dataset and want to visualize it as a horizontal stacked bar chart.

The problem is that the values of promoters, neutrals, and detractors for each year (first, second, and so on) are meant to add up to 100 (e.g., 21 + 46.5 + 32.5 = 100). However, the bars in my plot do not stack up to 100.

Any advice? Thanks!

from matplotlib import pyplot as plt

plt.rcParams["figure.figsize"] = [10, 8]
       
year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8,27,24,20.5]
neutrals = [46.5, 56.0, 54.3,47.8,50.0,52.5]
detractors = [32.5, 23.3, 24.0,25.3,26.0,27.0]

b1 = plt.barh(year, promoters, color="darkseagreen")
b2 = plt.barh(year, neutrals, left=promoters, color="lightyellow")
b3 = plt.barh(year, detractors, left=neutrals, color="coral")


plt.legend([b1, b2, b3], ["promoters", "neutrals", "detractors"], loc="upper right")
plt.xlim([0, 100])

plt.show()

[Image: failed version of stacked data]

CodePudding user response:

Sorry, my English is not good, so I can't explain it well, but here is my code; hopefully you can understand it.

from matplotlib import pyplot as plt
import numpy as np


plt.rcParams["figure.figsize"] = [10, 8]

year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8, 27, 24, 20.5]
neutrals = [46.5, 56.0, 54.3, 47.8, 50.0, 52.5]
detractors = [32.5, 23.3, 24.0, 25.3, 26.0, 27.0]

"""
This is my code
"""
# The starting point of b3 
detractors_left_arry = np.sum([promoters, neutrals], axis=0).tolist()
print(detractors_left_arry)

b1 = plt.barh(year, promoters, color="darkseagreen")
b2 = plt.barh(year, neutrals, left=promoters, color="lightyellow")
# b3 = plt.barh(year, detractors, left=neutrals, color="coral")
b3 = plt.barh(year, detractors, left=detractors_left_arry, color="coral")

plt.legend([b1, b2, b3], ["promoters", "neutrals", "detractors"], loc="upper right")
plt.xlim([0, 100])

plt.show()
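
To see why the original left=neutrals falls short, here is a small sketch that computes where the third bar ends in each case. The names wrong_end and right_end are only for illustration and do not appear in the question.

```python
promoters = [21, 20.8, 21.8, 27, 24, 20.5]
neutrals = [46.5, 56.0, 54.3, 47.8, 50.0, 52.5]
detractors = [32.5, 23.3, 24.0, 25.3, 26.0, 27.0]

# Original code: the third bar starts at the neutrals value alone,
# so it ends at neutrals + detractors.
wrong_end = [n + d for n, d in zip(neutrals, detractors)]

# Fixed code: the third bar starts where the second bar ends,
# so it ends at promoters + neutrals + detractors.
right_end = [p + n + d for p, n, d in zip(promoters, neutrals, detractors)]

print([round(x, 1) for x in wrong_end])
print([round(x, 1) for x in right_end])
```

With left=neutrals, the first row ends at 46.5 + 32.5 = 79 instead of 100, and the coral bar also overlaps part of the yellow one. Note that a few rows add up to 100.1 because of rounding in the data, so plt.xlim([0, 100]) cuts off a very small part of those bars.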

CodePudding user response:

The problem is that the last plt.barh call uses the wrong left reference. The third bar must start where the second one ends, so left has to be the sum of the first and second values, not the second values alone:

from matplotlib import pyplot as plt
       
year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8,27,24,20.5]
neutrals = [46.5, 56.0, 54.3,47.8,50.0,52.5]
detractors = [32.5, 23.3, 24.0,25.3,26.0,27.0]

plt.figure()
b1 = plt.barh(year, promoters, color="darkseagreen")
b2 = plt.barh(year, neutrals, left=promoters, color="lightyellow")
b3 = plt.barh(year, detractors, left=[t1 + t2 for t1, t2 in zip(promoters, neutrals)], color="coral")

plt.legend([b1, b2, b3], ["promoters", "neutrals", "detractors"], loc="upper right")
plt.xlim([0, 100])

plt.show()
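
The same idea extends to any number of segments: each bar's left is the running total of all the segments before it. Below is a minimal sketch of that pattern. The helper names stack_offsets and plot_stacked are my own (not part of matplotlib), and the function assumes all series have the same length.

```python
def stack_offsets(series):
    """Return, for each series, the left offsets: the running sum of all earlier series."""
    running = [0.0] * len(series[0])
    offsets = []
    for values in series:
        offsets.append(list(running))  # this series starts at the current running total
        running = [r + v for r, v in zip(running, values)]
    return offsets


def plot_stacked(labels, series, names, colors):
    """Draw one horizontal stacked bar per label (hypothetical helper)."""
    # Imported here so stack_offsets can be used without matplotlib installed.
    from matplotlib import pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 8))
    for values, left, name, color in zip(series, stack_offsets(series), names, colors):
        ax.barh(labels, values, left=left, color=color, label=name)
    ax.legend(loc="upper right")
    ax.set_xlim(0, 100)
    return fig, ax
```

With the data above, this could be called as plot_stacked(year, [promoters, neutrals, detractors], ["promoters", "neutrals", "detractors"], ["darkseagreen", "lightyellow", "coral"]), followed by plt.show().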
