Minimal reproducible example:
import pandas as pd

df = pd.DataFrame({'event_name': ['fulham', 'fulham', 'fulham', 'fulham', 'fulham', 'fulham'],
                   'batfast_id': ['bfs1', 'bfs1', 'bfs1', 'bfs1', 'bfs1', 'bfs1'],
                   'session_no': [1, 1, 1, 1, 1, 1],
                   'overs': [0, 0, 0, 0, 0, 0],
                   'deliveries_faced': [0, 1, 2, 3, 4, 5],
                   'length/type': ['ES_LS_Y', 'ES_LS_Y', 'S_S_Y', 'ES_OS_Y', 'ES_LS_Y', 'ES_LS_Y']},
                  columns=['event_name', 'batfast_id', 'session_no', 'overs', 'deliveries_faced', 'length/type'])
# Move everything except length/type into the index
df = df.set_index(['event_name', 'batfast_id', 'session_no', 'overs', 'deliveries_faced'], drop=True)
print(df)
There are 6 deliveries_faced values (0 to 5) in each over. I then build a sequence column that holds the comma-joined length/type values of the 6 deliveries in each over, using this code:
df['sequence'] = (df.groupby(["event_name", "batfast_id", "session_no", "overs"])["length/type"]
.apply(lambda x: ",".join(x)).loc[lambda x: x.str.count(",") == 5]
)
However, I want to number each delivery in the sequence, e.g. 'ES_LS_Y1','ES_LS_Y2','S_S_Y3','ES_OS_Y4','ES_LS_Y5','ES_LS_Y6', or something along those lines that uniquely numbers each delivery.
CodePudding user response:
You can enumerate the values in x and put the index into an f-string. Note that enumerate starts counting at 0 by default, so the code below produces ES_LS_Y_0, ES_LS_Y_1, S_S_Y_2 and so on.
df['sequence'] = (df.groupby(["event_name", "batfast_id", "session_no", "overs"])["length/type"]
.apply(lambda x: ','.join(f'{val}_{i}' for i, val in enumerate(x)))
)
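A self-contained variant of the same idea, as a rough sketch. It assumes the sample data from the question kept as ordinary columns (no set_index), passes start=1 to enumerate so the numbering begins at 1, and drops the underscore to match the format asked for. The helper name number_deliveries and the variable keys are made up for illustration.

```python
import pandas as pd

# Sample data from the question, kept as plain columns (an assumption for simplicity).
df = pd.DataFrame({
    "event_name": ["fulham"] * 6,
    "batfast_id": ["bfs1"] * 6,
    "session_no": [1] * 6,
    "overs": [0] * 6,
    "deliveries_faced": [0, 1, 2, 3, 4, 5],
    "length/type": ["ES_LS_Y", "ES_LS_Y", "S_S_Y", "ES_OS_Y", "ES_LS_Y", "ES_LS_Y"],
})

keys = ["event_name", "batfast_id", "session_no", "overs"]


def number_deliveries(values):
    # enumerate(..., start=1) pairs each value with 1, 2, 3, ...
    return ",".join(f"{val}{i}" for i, val in enumerate(values, start=1))


# One joined string per over (one row per group of keys).
out = df.groupby(keys)["length/type"].agg(number_deliveries)
print(out.iloc[0])
```

Here out has one entry per over, so it still needs to be merged or aligned back onto df if you want it as a column on every row.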
CodePudding user response:
Use enumerate with a start value of 1, grouping on every index level except the last:
out = df.groupby(level=df.index.names[:-1])['length/type'] \
.apply(lambda x: ','.join(f"{v}{i}" for i, v in enumerate(x.tolist(), 1)))
print(out)
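# out holds one string per over; for this data it is
# 'ES_LS_Y1,ES_LS_Y2,S_S_Y3,ES_OS_Y4,ES_LS_Y5,ES_LS_Y6'.
# For reference, print(df) on the input frame from the question shows:
                                                        length/type
event_name batfast_id session_no overs deliveries_faced
fulham     bfs1       1          0     0                    ES_LS_Y
                                       1                    ES_LS_Y
                                       2                      S_S_Y
                                       3                    ES_OS_Y
                                       4                    ES_LS_Y
                                       5                    ES_LS_Y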
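
If you want a numbered label on every row rather than one joined string per over, a possible alternative (not taken from the answers above) is groupby(...).cumcount(), which numbers the rows inside each group starting from 0. The sketch below assumes the question's data as plain columns; the column name numbered and the variable names are hypothetical.

```python
import pandas as pd

# Sample data from the question, kept as plain columns (an assumption for simplicity).
df = pd.DataFrame({
    "event_name": ["fulham"] * 6,
    "batfast_id": ["bfs1"] * 6,
    "session_no": [1] * 6,
    "overs": [0] * 6,
    "deliveries_faced": [0, 1, 2, 3, 4, 5],
    "length/type": ["ES_LS_Y", "ES_LS_Y", "S_S_Y", "ES_OS_Y", "ES_LS_Y", "ES_LS_Y"],
})

keys = ["event_name", "batfast_id", "session_no", "overs"]

# cumcount() gives 0, 1, 2, ... within each over; add 1 for 1-based numbering.
position = df.groupby(keys).cumcount() + 1

# Append the position to each delivery's length/type label, row by row.
df["numbered"] = df["length/type"] + position.astype(str)

# Optionally collapse back to one comma-joined string per over.
sequence = df.groupby(keys)["numbered"].agg(",".join)
print(df["numbered"].tolist())
```

This avoids a Python-level loop for the numbering itself, and the per-row labels stay aligned with df without any extra merge.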