How to slice a dataframe with list comprehension by using a condition that compares two consecutive

Time:04-14

I have a dataframe:

import pandas as pd

dfx = pd.DataFrame()
dfx['Exercise'] = ['squats', 'squats', 'rest', 'rest', 'squats', 'squats', 'rest', 'situps']
dfx['Score'] = [8, 7, 6, 5, 4, 3, 2, 1]

By using a list comprehension (or any technique other than an explicit loop), I want to create a list list_ex containing slices of dfx, where each slice is a run of consecutive rows that share the same exercise. As soon as the next row's exercise differs from the current row's, a new dataframe slice begins.

With respect to the example, that means that list_ex should contain 5 dataframes:

  • The first df contains two rows: ('squats', 8) and ('squats', 7)
  • The second df contains two rows: ('rest', 6) and ('rest', 5)
  • The third df contains two rows: ('squats', 4) and ('squats', 3)
  • The fourth df contains one row: ('rest', 2)
  • The fifth df contains one row: ('situps', 1)

Each dataframe should have the same header as dfx.

Sorry for explaining this in such a way. I was not able to produce code for the desired list of dataframes.

I tried using a list comprehension but could not work out how to compare the current row with the next one. How can I reach the desired result?

CodePudding user response:

Is this what you want?

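# Flag rows whose Exercise differs from the previous row, then cumsum to label each run
group = dfx['Exercise'].ne(dfx['Exercise'].shift()).cumsum()

# One dataframe per run of consecutive identical exercises
list_ex = [g for _, g in dfx.groupby(group)]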

output:

[  Exercise  Score
 0   squats      8
 1   squats      7,
   Exercise  Score
 2     rest      6
 3     rest      5,
   Exercise  Score
 4   squats      4
 5   squats      3,
   Exercise  Score
 6     rest      2,
   Exercise  Score
 7   situps      1]
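
To see why this works, it may help to split the one-liner into its three steps. The sketch below rebuilds the same dfx and uses two intermediate names, previous and starts_run, which are only illustrative and not part of the answer above.

```python
import pandas as pd

dfx = pd.DataFrame({
    'Exercise': ['squats', 'squats', 'rest', 'rest', 'squats', 'squats', 'rest', 'situps'],
    'Score': [8, 7, 6, 5, 4, 3, 2, 1],
})

# Exercise of the previous row; the first row has no predecessor, so it is NaN.
previous = dfx['Exercise'].shift()

# True wherever a new run starts (the first row always counts as a start).
starts_run = dfx['Exercise'].ne(previous)

# The cumulative sum of the start flags gives each run its own integer label.
group = starts_run.cumsum()

# Grouping by that label yields one dataframe per run, in original order.
list_ex = [g for _, g in dfx.groupby(group)]

print(group.tolist())              # [1, 1, 2, 2, 3, 3, 4, 5]
print([len(g) for g in list_ex])   # [2, 2, 2, 1, 1]
```

Each slice keeps its original index and the same columns as dfx. If a fresh 0-based index is wanted per slice, g.reset_index(drop=True) inside the comprehension should do it.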