Enumerate values in a column

Time:12-23

Minimal reproducible example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'event_name': ['fulham','fulham','fulham','fulham','fulham','fulham'],
                      'batfast_id': ['bfs1', 'bfs1', 'bfs1', 'bfs1', 'bfs1', 'bfs1'],
                      'session_no': [1,1,1,1,1,1],
                      'overs': [0,0,0,0,0,0],
                      'deliveries_faced': [0,1,2,3,4,5],
                      'delivery_type': ['Extra Slow Leg Spin','Extra Slow Leg Spin','Slow Straight','Extra Slow Off Spin','Extra Slow Leg Spin','Extra Slow Leg Spin'],
                      'length': ['Yorker','Yorker','Yorker','Yorker','Yorker','Yorker']}, columns=['event_name', 'batfast_id','session_no','overs', 'deliveries_faced','delivery_type','length'])
df = df.set_index(['event_name', 'batfast_id','session_no','overs', 'deliveries_faced'],drop=True)
print(df)

I then produce a length/type column that is a combination of length and delivery_type using this code:

conditions = [
    (df['delivery_type'] == 'Extra Slow Off Spin') & (df['length'] == 'Yorker'),
    (df['delivery_type'] == 'Extra Slow Leg Spin') & (df['length'] == 'Yorker'),
    (df['delivery_type'] == 'Slow Straight') & (df['length'] == 'Yorker'),
    ]

values = ['ES_OS_Y', 'ES_LS_Y','S_S_Y']

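# default='' keeps the fallback a string; recent NumPy versions reject the integer default 0 with string choices
df['length/type'] = np.select(conditions, values, default='')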
print(df)

I now want to append each delivery's position within its over (0 to 5) to the length/type label, so that the result looks like this:

                                                               delivery_type  length length/type
event_name batfast_id session_no overs deliveries_faced
fulham     bfs1       1          0     0                 Extra Slow Leg Spin  Yorker   ES_LS_Y_0
                                       1                 Extra Slow Leg Spin  Yorker   ES_LS_Y_1
                                       2                       Slow Straight  Yorker     S_S_Y_2
                                       3                 Extra Slow Off Spin  Yorker   ES_OS_Y_3
                                       4                 Extra Slow Leg Spin  Yorker   ES_LS_Y_4
                                       5                 Extra Slow Leg Spin  Yorker   ES_LS_Y_5

Answer:

Try:

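# Group by every index level except the last (deliveries_faced) and number the rows within each group
df['length/type'] = df['length/type'] + '_' + \
                      df.groupby(df.index.names[:-1]).cumcount().astype(str)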
print(df)

# Output:
                                                               delivery_type  length length/type
event_name batfast_id session_no overs deliveries_faced                                         
fulham     bfs1       1          0     0                 Extra Slow Leg Spin  Yorker   ES_LS_Y_0
                                       1                 Extra Slow Leg Spin  Yorker   ES_LS_Y_1
                                       2                       Slow Straight  Yorker     S_S_Y_2
                                       3                 Extra Slow Off Spin  Yorker   ES_OS_Y_3
                                       4                 Extra Slow Leg Spin  Yorker   ES_LS_Y_4
                                       5                 Extra Slow Leg Spin  Yorker   ES_LS_Y_5
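
To see why grouping on all index levels except the last one gives a per-over counter, here is a minimal sketch on made-up data with two overs of three deliveries each. The data and the names group_levels and position are assumptions for illustration and are not from the original post. The counter restarts at 0 whenever a new over begins.

```python
import pandas as pd

# Hypothetical data: two overs of three deliveries each (not from the original post).
df = pd.DataFrame({
    'event_name': ['fulham'] * 6,
    'batfast_id': ['bfs1'] * 6,
    'session_no': [1] * 6,
    'overs': [0, 0, 0, 1, 1, 1],
    'deliveries_faced': [0, 1, 2, 0, 1, 2],
    'length/type': ['ES_LS_Y', 'S_S_Y', 'ES_OS_Y', 'ES_LS_Y', 'ES_LS_Y', 'S_S_Y'],
}).set_index(['event_name', 'batfast_id', 'session_no', 'overs', 'deliveries_faced'])

# Every index level except the last one (deliveries_faced) identifies one over.
group_levels = list(df.index.names[:-1])

# cumcount numbers the rows 0, 1, 2, ... inside each group, in their existing order.
position = df.groupby(level=group_levels).cumcount()

# Append the per-over position to the label.
df['length/type'] = df['length/type'] + '_' + position.astype(str)
print(df)
```

Note that cumcount follows row order, not the values stored in deliveries_faced, so the two only agree when the rows of each over are already sorted by delivery.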