Python pandas: how to merge multiple columns into one column and plot a pie chart

Time:01-03

import pandas as pd
df = pd.DataFrame([["Stress", "NaN"], ["NaN", "Pregnancy"], ["Alcohol", "Pregnancy"]], columns=['causes', 'causes.2'])

Above is a sample of my dataset. These columns should have been merged into one, but for some reason they weren't. I have been asked to make a pie chart, and I know how to do that from a single column, so I want to merge these columns into one column with a distinct name.

I tried df.stack().reset_index(), but that returns a long-format DataFrame that I do not know how to work with:

   level_0   level_1          0
0        0    causes     Stress
1        0  causes.2        NaN
2        1    causes        NaN
3        1  causes.2  Pregnancy
4        2    causes    Alcohol
5        2  causes.2  Pregnancy

Anyone know how I could achieve this?

This is what I plan to use for the pie chart:

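import matplotlib.pyplot as plt

# Count how often each cause occurs, then draw the counts as a pie chart.
values = df["Cause of...."].value_counts()
ax = values.plot(kind="pie", autopct='%1.1f%%', shadow=True, legend=True, title="", ylabel='', labeldistance=None)
# Place the legend outside the pie, to the right.
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')
plt.show()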

CodePudding user response:

You can flatten using the underlying numpy array and create a new Series:

pd.Series(df.to_numpy().ravel(), name='causes')

Output:

0       Stress
1          NaN
2          NaN
3    Pregnancy
4      Alcohol
5    Pregnancy
Name: causes, dtype: object
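
The flattened Series still contains the NaN entries, and they should go before counting. A minimal sketch, assuming the "NaN" values are literal strings as in the question's constructor (with real missing values, dropna() alone would be enough):

```python
import pandas as pd

df = pd.DataFrame(
    [["Stress", "NaN"], ["NaN", "Pregnancy"], ["Alcohol", "Pregnancy"]],
    columns=["causes", "causes.2"],
)

# Flatten both columns, row by row, into a single Series.
causes = pd.Series(df.to_numpy().ravel(), name="causes")

# Assumption: "NaN" is a literal string here, so mask it into a real
# missing value first, then drop all missing entries.
causes = causes.mask(causes == "NaN").dropna()

# Frequency of each remaining cause, ready for plotting.
counts = causes.value_counts()
print(counts)
```

The resulting counts Series can be passed to .plot(kind="pie", ...) in place of the single-column value_counts() from the question.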

If the DataFrame has other columns as well, select only the ones you want to flatten first, for example by name:

pd.Series(df.filter(like='causes').to_numpy().ravel(), name='causes')
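
Putting the pieces together, here is a rough end-to-end sketch. The "age" column and its values are hypothetical, added only to show why the filter step matters, and the missing entries are assumed to be real missing values (None) rather than "NaN" strings:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical frame: "age" stands in for any unrelated column.
df = pd.DataFrame(
    {
        "age": [31, 27, 40],
        "causes": ["Stress", None, "Alcohol"],
        "causes.2": [None, "Pregnancy", "Pregnancy"],
    }
)

# Keep only columns whose name contains "causes", then flatten row by row.
flat = pd.Series(df.filter(like="causes").to_numpy().ravel(), name="causes")

# value_counts() skips missing values by default.
counts = flat.value_counts()

# Same plotting calls as in the question, applied to the merged column.
ax = counts.plot(kind="pie", autopct="%1.1f%%", legend=True, ylabel="", labeldistance=None)
ax.legend(bbox_to_anchor=(1, 1.02), loc="upper left")
plt.show()
```

Note that filter(like="causes") matches any column name containing that substring, so it is worth checking the selected columns before flattening.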