So iam trying to make a bar chart with plotly and pandas, to plot the mean rating for different categories of a data frame I have. Each category has a rating of 1-5, and categories are the columns that have "rating" at the end of them. I can get it to work all fine. However, something I want to account for is that in my categories (different columns in the dataframe) they have a value of -1 if that category was not rated. I want to know when I calculate the mean and go to plot the chart, how can I ensure that the -1 one values are not being accounted during the mean calculations?
my code
# Plot to find mean rating for different categories
# Take columns we are interested in and stack them into column named 'category'
# This will allow us to group by category and calculate mean rating
dfm = pd.melt(df, id_vars=["id", "course", "date", "overall_rating", "job_prospects_desc", "course_lecturer_desc", "facilities_desc", "student_support_desc", "local_life_desc"],
value_vars=["job_prospects_rating", "course_lecturer_rating", "facilities_rating", "student_support_rating", "local_life_rating"],
var_name ='Category')
# Group by category and calculate mean rating
dfg = dfm.groupby(['Category']).mean().reset_index()
print(df)
fig2 = px.bar(dfg, x = 'Category', y = 'value', color = 'Category',
category_orders = {'Category':['job_prospects_rating','course_lecturer_rating','facilities_rating','student_support_rating','local_life_rating']},
color_discrete_map = {
'job_prospects_rating' : 'light blue',
'course_lecturer_rating' : 'blue',
'facilities_rating' : 'pink',
'student_support_rating' : 'purple',
'local_life_rating' : 'violet'},
title="Mean Rating For Different Student Categories At The University of Lincoln")
fig2.update_yaxes(title = 'Mean rating (1-5)')
fig2.show()
CodePudding user response:
df = pd.DataFrame(data= np.array([[0,0,0,1,1,1],[1,2,-1,4,5,6],[7,8,9,10,-1,12]]).T, columns = ['Category', 'A', 'B'])
df1 = df.applymap(lambda x: x if x!= -1 else np.NaN)
df1.groupby(['Category']).mean()
The logic is extremely simple: substitute those '-1' with NaN and forget about them