Home > OS >  Apply a change to the last row of each group by ID in a dataframe
Apply a change to the last row of each group by ID in a dataframe

Time:03-19

For the next dataframe, I want to modify two columns for the last row of each cluster.

country ID counter direction
ENG 0 1 OUT
ENG 0 0 IN
ESP 0 6 OUT
ENG 1 5 IN
ENG 1 5 IN
FRA 2 4 OUT
ENG 3 4 OUT

I want to obtain the following result:

country ID counter direction exit_to
ENG 0 1 OUT NaN
ENG 0 0 IN NaN
ESP 0 7 OUT E
ENG 1 5 IN NaN
ENG 1 5 IN NaN
FRA 2 4 OUT W
ENG 3 4 OUT E

For the last row of each group, I increment the value of the counter column by 1 and create a new column exit_to, with value E if direction = OUT and country ESP. If the direction of the last row is OUT the column exit_to, takes the value of W if country = FRA and E if country = ENG and the counter in both cases stays the same.

For the behaviour of the exit_to column, I have implemented the following code.

def exit_to(country, direction):
    zone = np.NaN
    if direction == 'OUT' and country == 'ESP':
        # 1 counter
        zone = 'E'
    elif direction == 'OUT' and country == 'FRA':
        zone = 'W'
    elif direction == 'OUT':
        zone = 'E'
        
    return zone

I was using the following function:

last_rows = df.drop_duplicates('ID', keep='last')
last_rows['exit_to'] = last_rows.apply(lambda a: exit_to(a['country'],a['direction']), axis=1)

The problem is that I want the change to affect the whole dataframe and I don't want to get only a dataframe with the last rows of each group. I hope you can help me.

CodePudding user response:

There is no need to use apply, here we can use a vectorized solution with boolean indexing:

m = ~df.duplicated('ID', keep='last')
df.loc[m, 'counter']  = 1
df.loc[m, 'exit_to'] = np.where(df.loc[m, 'direction'] == 'OUT', 'E', 'W')

Result

print(df)

   ID  counter direction exit_to
0   0        1       OUT     NaN
1   0        0        IN     NaN
2   0        7       OUT       E
3   1        5        IN     NaN
4   1        6        IN       W
5   2        5       OUT       E
  • Related