This is a time table, columns=hour, rows=weekday, data=subject [weekday x hour]
1 2 3 4 5 6 7
Name
Monday Project Project Project Data Science Embedded Systems Data Mining Industrial Psychology
Tuesday Project Project Project Project Data Science Industrial Psychology Embedded Systems
Wednesday Data Science Project Project Project Project Project Project
Thursday Data Mining Industrial Psychology Embedded Systems Data Mining Project Project Project
Friday Industrial Psychology Embedded Systems Data Science Data Mining Project Project Project
How do you generate a pandas.Dataframe
where, rows=weekday, columns=subject, data = subject frequency in the corresponding weekday?
Required table: [weekday x subject]
Data Mining, Data Science, Embedded Systems, Industrial Psychology, Project
Name
Monday 1 1 1 1 3
Tuesday ...
Wednesday
Thursday
Friday
self.file = 'timetable.csv'
self.sdf = pd.read_csv(self.file, header=0, index_col="Name")
print(self.sdf.to_string())
self.subject_frequency = self.sdf.apply(pd.value_counts)
print(self.subject_frequency.to_string())
self.subject_frequency["sum"] = self.subject_frequency.sum(axis=1)
CodePudding user response:
Use melt
to flatten your dataframe then pivot_table
to reshape your dataframe:
out = (
df.melt(var_name='Freq', value_name='Data', ignore_index=False).assign(variable=1)
.pivot_table('Freq', 'Name', 'Data', fill_value=0, aggfunc='count')
.loc[df.index] # sort by original index: Monday > Thuesday > ...
)
Output:
>>> out
Data Data Mining Data Science Embedded Systems Industrial Psychology Project
Name
Monday 1 1 1 1 3
Tuesday 0 1 1 1 4
Wednesday 0 1 0 0 6
Thursday 2 0 1 1 3
Friday 1 1 1 1 3