Home > Software design >  How to create a column for 'number of wickets'
How to create a column for 'number of wickets'

Time:12-10

Example of what it should look like:

This is the output for table above:

{'batfast_id': {0: 'bfs1',
  1: 'bfs1',
  2: 'bfs1',
  3: 'bfs1',
  4: 'bfs1',
  5: 'bfs1',
  6: 'bfs1',
  7: 'bfs2',
  8: 'bfs2',
  9: 'bfs2',
  10: 'bfs2',
  11: 'bfs2',
  12: 'bfs2'},
 'score': {0: 1,
  1: 2,
  2: 0,
  3: 0,
  4: 1,
  5: 0,
  6: 3,
  7: 3,
  8: 0,
  9: 0,
  10: 2,
  11: 2,
  12: 0},
 'day_month_year': {0: '01-01-2020',
  1: '01-01-2020',
  2: '01-01-2020',
  3: '01-01-2020',
  4: '02-01-2020',
  5: '02-01-2020',
  6: '02-01-2020',
  7: '02-01-2020',
  8: '02-01-2020',
  9: '02-01-2020',
  10: '03-01-2020',
  11: '03-01-2020',
  12: '03-01-2020'},
 'runs': {0: 1,
  1: 3,
  2: 3,
  3: 3,
  4: 1,
  5: 1,
  6: 4,
  7: 3,
  8: 3,
  9: 3,
  10: 2,
  11: 4,
  12: 4},
 'deliveries_faced': {0: 1,
  1: 2,
  2: 3,
  3: 4,
  4: 1,
  5: 2,
  6: 3,
  7: 1,
  8: 2,
  9: 3,
  10: 1,
  11: 2,
  12: 3},
 'wicket': {0: 'Not Out',
  1: 'Not Out',
  2: 'Bowled',
  3: 'Caught',
  4: 'Not Out',
  5: 'Caught',
  6: 'Not Out',
  7: 'Not Out',
  8: 'Bowled',
  9: 'Bowled',
  10: 'Not Out',
  11: 'Not Out',
  12: 'Caught'},
 'no_of_wickets': {0: 0,
  1: 0,
  2: 1,
  3: 2,
  4: 0,
  5: 1,
  6: 1,
  7: 0,
  8: 1,
  9: 2,
  10: 0,
  11: 0,
  12: 1}} 

I am trying to create a column called no_of_wickets which counts how many wickets the user has lost playing cricket. However it must reset back to 0 when the date changes or batfast_id (the user) changes.

The score is the number of runs the player received on that delivery and the runs is the cumulative number of the score.

runs was created using this code: df['runs']=df.groupby(['batfast_id','day_month_year'])['score'].cumsum()

deliveries_faced using this code: df['deliveries_faced']=df.groupby(['batfast_id','day_month_year']).cumcount() 1

CodePudding user response:

Create a boolean mask where set to True if the wicket is out then group by batfast_id and day_month_year and finally compute cumulative sum.

df['no_of_wickets'] = df.assign(is_out=df['wicket'].ne('Not Out')) \
                        .groupby(['batfast_id', 'day_month_year'])['is_out'] \
                        .cumsum()
print(df)

# Output:
   batfast_id  score day_month_year  runs  deliveries_faced   wicket  no_of_wickets
0        bfs1      1     01-01-2020     1                 1  Not Out              0
1        bfs1      2     01-01-2020     3                 2  Not Out              0
2        bfs1      0     01-01-2020     3                 3   Bowled              1
3        bfs1      0     01-01-2020     3                 4   Caught              2
4        bfs1      1     02-01-2020     1                 1  Not Out              0
5        bfs1      0     02-01-2020     1                 2   Caught              1
6        bfs1      3     02-01-2020     4                 3  Not Out              1
7        bfs2      3     02-01-2020     3                 1  Not Out              0
8        bfs2      0     02-01-2020     3                 2   Bowled              1
9        bfs2      0     02-01-2020     3                 3   Bowled              2
10       bfs2      2     03-01-2020     2                 1  Not Out              0
11       bfs2      2     03-01-2020     4                 2  Not Out              0
12       bfs2      0     03-01-2020     4                 3   Caught              1
  • Related