Home > Mobile >  How can I find sequences of rows based on two columns?
How can I find sequences of rows based on two columns?

Time:12-17

THis is my dataframe:

Screenshot of dataframe

I am trying to find the most common sequences of length and delivery_type combined per over. ie every six balls. Is there any way of finding how many times each sequence happens?

edit: I'd also like to label these patterns and create a column called ball_sequence

edit2: I have now combined delivery_type and length in the length/type column. eg ESSY is 'Extra Short Slow Yorker'

New DataFrame

I have also created a small table of expected outcome. The sequence must occur in the same over and cannot be any random sequence of 6:

Expected output:

CodePudding user response:

The following should work:

(df.groupby(["Event_name", "Batfast_id", "Session_no", "Overs"])["length/type"]
   .apply(lambda x: ",".join(x))  ## Creates sequences for each over
   .value_counts()                ## Returns counts of sequences
)
  • Related