Hi, I'm looking for a way to use pandas to select, for each group (Username), the data at the earliest (pre) and latest (pos) date, and reshape the result into columns. I can get the min/max of each group with groupby, but I can't figure out how to put them back into columns.
Thanks for any advice.
Here is the example data:
Username log_date Score
AA 20211227020024 8
BB 20211227020024 26
CC 20211227020024 78
DD 20220122153004 12
AA 20220122153004 13
CC 20220122153004 0
AA 20220122153004 12
BB 20220124002736 10
CC 20220124002736 17
And here is the expected output:
USERNAME Pre_Date Pos_Date Pre Pos
AA 20211227020024 20220122153004 8 25
BB 20211227020024 20220124002736 26 10
CC 20211227020024 20220124002736 78 17
DD 20220122153004 - 12 0
CodePudding user response:
Try:
df.groupby('Username').agg(['min', 'max'])
Result:
                log_date                 Score
                     min             max   min max
Username
AA        20211227020024  20220122153004     8  13
BB        20211227020024  20220124002736    10  26
CC        20211227020024  20220124002736     0  78
DD        20220122153004  20220122153004    12  12
Then you can rename the columns as you wish.
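If you would rather name the columns in the same step, pandas' named aggregation can do that. This is only a sketch: the DataFrame is rebuilt from the sample in the question, and the output names (Pre_Date, Pos_Date, Score_min, Score_max) are illustrative choices, not anything pandas requires.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Username": ["AA", "BB", "CC", "DD", "AA", "CC", "AA", "BB", "CC"],
    "log_date": [20211227020024, 20211227020024, 20211227020024,
                 20220122153004, 20220122153004, 20220122153004,
                 20220122153004, 20220124002736, 20220124002736],
    "Score": [8, 26, 78, 12, 13, 0, 12, 10, 17],
})

# Named aggregation: new_column=(source_column, aggregation).
# The column names on the left are illustrative.
summary = (
    df.groupby("Username")
      .agg(
          Pre_Date=("log_date", "min"),
          Pos_Date=("log_date", "max"),
          Score_min=("Score", "min"),
          Score_max=("Score", "max"),
      )
      .reset_index()  # turn Username back into a regular column
)

print(summary)
```

This gives flat column names directly, so there is no two-level header to rename afterwards.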
CodePudding user response:
Is this what you are expecting?
df = df.groupby('Username').agg({'log_date': ['min', 'max'], 'Score': ['min', 'max']})
df.columns = ["_".join(x) for x in df.columns]
df = df.reset_index()
Output:
Username log_date_min log_date_max Score_min Score_max
0 AA 20211227020024 20220122153004 8 13
1 BB 20211227020024 20220124002736 10 26
2 CC 20211227020024 20220124002736 0 78
3 DD 20220122153004 20220122153004 12 12
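Explanation:
Step 1: Group by the Username column and aggregate to find the min and max of log_date and Score.
Step 2: Flatten the two-level column names into single names such as log_date_min, then reset the index so that Username becomes a regular column again.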
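One caveat: the min/max of Score is not necessarily the score recorded on the earliest/latest date. The expected output in the question (Pos = 25 for AA, and Pos_Date = - with Pos = 0 for DD) looks like it wants the score at those dates instead. Here is a rough sketch under these assumptions, which are my reading of the expected output rather than anything stated outright: scores sharing the same user and log_date are summed, and a user with only one date gets "-" and 0 for the "pos" side.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Username": ["AA", "BB", "CC", "DD", "AA", "CC", "AA", "BB", "CC"],
    "log_date": [20211227020024, 20211227020024, 20211227020024,
                 20220122153004, 20220122153004, 20220122153004,
                 20220122153004, 20220124002736, 20220124002736],
    "Score": [8, 26, 78, 12, 13, 0, 12, 10, 17],
})

# Assumption: rows with the same user and date are summed (AA: 13 + 12 = 25).
per_date = (
    df.groupby(["Username", "log_date"], as_index=False)["Score"].sum()
      .sort_values(["Username", "log_date"])
)

# After sorting, the first/last row of each user is the earliest/latest date.
first = per_date.groupby("Username").first()
last = per_date.groupby("Username").last()

result = pd.DataFrame({
    "Pre_Date": first["log_date"],
    "Pos_Date": last["log_date"],
    "Pre": first["Score"],
    "Pos": last["Score"],
})

# Assumption: a user with a single date has no "pos" measurement.
single = result["Pre_Date"] == result["Pos_Date"]
result["Pos"] = result["Pos"].where(~single, 0)
result["Pos_Date"] = result["Pos_Date"].astype(object).where(~single, "-")

result = result.rename_axis("USERNAME").reset_index()
print(result)
```

On the sample data this should reproduce the table from the question. Note that Pos_Date becomes an object column because it mixes integers with "-"; if you need to keep working with the dates, a missing value is probably a better placeholder than a dash.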