i have two dataframes Region_education_0 and Region_education_1
Region_education_0
index | Region | ConvertedComp |
---|---|---|
1 | Australia/New Zealand | 122573.834171 |
2 | Caribbean | 53562.111111 |
3 | Central Asia | 134422.000000 |
4 | East Asia | 112492.507042 |
5 | Melanesia | 605 |
Region_education_1
index | Region | ConvertedComp |
---|---|---|
1 | Australia/New Zealand | 122573.834171 |
2 | Caribbean | 53562.111111 |
3 | Central Asia | 134422.000000 |
4 | East Asia | 112492.507042 |
Index 5, Melanesia is not present in Region_education_1 because of a condition, i want to compare them and plot so i tried this
from matplotlib.pyplot import *
Region_education_combined=Region_education_0.merge(Region_education_1,left_on="Region",right_on="Region")
Region_education_combined.columns=["Region","Max of Bachelors Higher Ed","Higher Formal Education"]
Region_education_combined['Diff_HigherEd_Vals'] = Region_education_combined['Higher Formal Education'] - Region_education_combined['Max of Bachelors Higher Ed']
print(Region_education_combined)
comp_df.style.bar(subset=['Diff_HigherEd_Vals'], align='mid', color=['#d65f5f', '#5fba7d'])
index | Max of Bachelors Higher Ed | Higher Formal Education | Diff_HigherEd_Vals |
---|---|---|---|
1 | 151698.500659 | 122573.834171 | -29124.666488 |
2 | 28413.753425 | 53562.111111 | 53562.111111 |
3 | 3944.750000 | 5883.000000 | 1938.250000 |
4 | 45091.041667 | 27052.384615 | -18038.657051 |
Region column is missing from the output,to include I region tried
comp_df.style.bar(subset=['Diff_HigherEd_Vals','Region'], align='mid', color=['#d65f5f', '#5fba7d'])
and
comp_df.style.bar(Region_education_combined, align='mid', color=['#d65f5f', '#5fba7d'])
Is there any way to include region in the final output? and i left out "Index 5, Melanesia" from 'Region_education_0' dataframe is there any way to include that too in the output ?
CodePudding user response:
You can maintain the missing Region by using how="outer"
when you call merge
like this
Region_education_combined=Region_education_0.merge(Region_education_1,left_on="Region",right_on="Region")
Pay attention that in this case you will have a table which contains Nan
where it is not possible to merge, in your case Melanesia
will have a Nan
in the Higher Formal Education
column. In order to avoid problem you can set a default value with this
Region_education_combined["Higher Formal Education"]fillna(0, inplace=True)