I have the following data frame.
Names Counts Year
0 Jordan 1043 2000
1 Steve 204 2000
2 Brock 3 2000
3 Steve 33 2000
4 Mike 88 2000
... ... ... ...
20001 Bryce 2 2015
20002 Steve 11 2015
20003 Penny 24 2015
20004 Steve 15 2015
20005 Ryan 5 2015
I want to output the information about the name "Steve" over all years. The output should combine the "Counts" for the name "Steve" if the name appears multiple times within the same year.
Example output might look like:
Names Counts Year
0 Steve 237 2000
1 Steve 400 2001
2 Steve 35 2002
... ... ... ...
15 Steve 26 2015
CodePudding user response:
do you want something like this ?
#first
cols=['Counts','Year']
df[cols]=df[cols].astype('int32')
df=df[df['Names']=='Steve']
df=df.groupby('Year')['Counts'].agg({'sum'})
CodePudding user response:
Filter records for Steve
then groupby Year
, and finally calculate the aggregates i.e. first
for Names
, and sums
for Counts
(df[df['Names'].eq('Steve')]
.groupby('Year')
.agg({'Names': 'first', 'Counts': sum})
.reset_index())
Year Names Counts
0 2000 Steve 237
1 2015 Steve 26