code_presentation code_module score id_student id_assessment date_submitted
0 2013J AAA 78.0 11391 1752 18
1 2013J AAA 70.0 11391 1800 22
2 2013J AAA 72.0 31604 1752 17
3 2013J AAA 69.0 31604 1800 26
.....
I need to count the submission days. How do I use groupby correctly to get a result such as:
id_student id_assessment date_submitted
11391 1752 1
1800 1
31604 1752 1
1800 1
... etc
I tried:
analasys_grouped = analasys.groupby('id_student', as_index=False)\
    .agg({'id_assessment': 'count', 'date_submitted': 'count'})
analasys_grouped
but it does not give the result I want.
CodePudding user response:
If I understand you correctly, you want to apply value_counts() on id_assessment grouped by id_student. Try:
assessment_count_per_student = df.groupby('id_student')['id_assessment'].value_counts()
print(assessment_count_per_student)
id_student id_assessment
11391 1752 1
1800 1
31604 1752 1
1800 1
Name: id_assessment, dtype: int64
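If a flat table is easier to work with than the MultiIndex Series above, the following is a minimal sketch of one way to get it. It assumes a frame rebuilt from only the four rows shown in the question, and the column name n_submissions is a hypothetical label, not something from the original data.

```python
import pandas as pd

# Sample frame rebuilt from the four rows shown in the question (assumption:
# the real data has more rows, elided in the post).
df = pd.DataFrame({
    "code_presentation": ["2013J"] * 4,
    "code_module": ["AAA"] * 4,
    "score": [78.0, 70.0, 72.0, 69.0],
    "id_student": [11391, 11391, 31604, 31604],
    "id_assessment": [1752, 1800, 1752, 1800],
    "date_submitted": [18, 22, 17, 26],
})

# Count how often each assessment appears for each student.
counts = df.groupby("id_student")["id_assessment"].value_counts()

# reset_index(name=...) turns the MultiIndex Series into a flat DataFrame.
# "n_submissions" is a hypothetical column name chosen for illustration.
flat = counts.reset_index(name="n_submissions")
print(flat)
```

Passing name= explicitly avoids relying on the Series name, which differs between pandas versions.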
CodePudding user response:
You need to pass id_assessment into the groupby statement.
df.groupby(['id_student', 'id_assessment'])['date_submitted'].count()
id_student id_assessment
11391 1752 1
1800 1
31604 1752 1
1800 1
In your attempt, you're only grouping by id_student and then counting the assessments and submission dates, so you get one row per student instead of one row per student and assessment.
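To make the difference concrete, here is a small sketch that runs both groupings side by side. It assumes a frame containing only the four rows shown in the question; the names by_student and by_pair are hypothetical and used just for illustration.

```python
import pandas as pd

# Only the columns needed here, rebuilt from the four rows in the question.
df = pd.DataFrame({
    "id_student": [11391, 11391, 31604, 31604],
    "id_assessment": [1752, 1800, 1752, 1800],
    "date_submitted": [18, 22, 17, 26],
})

# Grouping by id_student alone: one row per student, each count is 2
# because every student has two assessments in this sample.
by_student = df.groupby("id_student", as_index=False).agg(
    {"id_assessment": "count", "date_submitted": "count"}
)
print(by_student)

# Grouping by both keys: one row per (student, assessment) pair.
by_pair = df.groupby(
    ["id_student", "id_assessment"], as_index=False
)["date_submitted"].count()
print(by_pair)
```

With as_index=False the group keys stay as ordinary columns, which matches the layout of the attempt in the question.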