How to split data into groups by two column conditions in pandas

Time:06-15

I have a DataFrame and want to split it into groups based on the flag_0 and flag_1 columns: each group should be a run of consecutive rows where flag_0 is 3 and flag_1 is 1.

Here is an example DataFrame:
import pandas as pd
df = pd.DataFrame({'flag_0':[1,2,3,1,2,3,1,2,3,3,3,3,1,2,3,1,2,3,4,4],'flag_1':[1,2,3,1,2,3,1,2,1,1,1,1,1,2,1,1,2,3,4,4],'dd':[1,1,1,7,7,7,8,8,8,1,1,1,7,7,7,8,8,8,5,7]})

Out[172]: 
    flag_0  flag_1  dd
0        1       1   1
1        2       2   1
2        3       3   1
3        1       1   7
4        2       2   7
5        3       3   7
6        1       1   8
7        2       2   8
8        3       1   8
9        3       1   1
10       3       1   1
11       3       1   1
12       1       1   7
13       2       2   7
14       3       1   7
15       1       1   8
16       2       2   8
17       3       3   8
18       4       4   5
19       4       4   7

Desired output:

group_1

    flag_0  flag_1  dd
9        3       1   1
10       3       1   1
11       3       1   1

group_2

    flag_0  flag_1  dd
14       3       1   7

CodePudding user response:

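You can build a boolean mask for the condition, then group the matching rows by the cumulative count of non-matching rows, so that each run of consecutive matches gets its own key:

cond = {'flag_0': 3, 'flag_1': 1}
# True where every column in cond equals its target value
mask = df[list(cond)].eq(cond).all(axis=1)

# (~mask).cumsum() stays constant within a run of consecutive matches
groups = [g for k, g in df[mask].groupby((~mask).cumsum())]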

Output (row 8 also meets the condition, so it is part of the first group):

[    flag_0  flag_1  dd
 8        3       1   8
 9        3       1   1
 10       3       1   1
 11       3       1   1,
     flag_0  flag_1  dd
 14       3       1   7]
The first group, groups[0]:

    flag_0  flag_1  dd
8        3       1   8
9        3       1   1
10       3       1   1
11       3       1   1
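
To see why the cumulative sum works as a grouping key, and to get the groups under labels like the group_1 and group_2 in the question, here is a minimal self-contained sketch. The names `key` and `named` and the `group_N` label format are illustrative choices, not part of the answer above, and `cond` is written as a `pd.Series` here only to make the column alignment in `eq` explicit.

```python
import pandas as pd

df = pd.DataFrame({
    'flag_0': [1, 2, 3, 1, 2, 3, 1, 2, 3, 3, 3, 3, 1, 2, 3, 1, 2, 3, 4, 4],
    'flag_1': [1, 2, 3, 1, 2, 3, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 2, 3, 4, 4],
    'dd':     [1, 1, 1, 7, 7, 7, 8, 8, 8, 1, 1, 1, 7, 7, 7, 8, 8, 8, 5, 7],
})

# Target value per column; the Series index is aligned with the column names.
cond = pd.Series({'flag_0': 3, 'flag_1': 1})

# True for rows where every column named in cond equals its target value.
mask = df[cond.index].eq(cond, axis='columns').all(axis=1)

# Every non-matching row increments the counter, so all rows in one
# uninterrupted run of matches share the same key value.
key = (~mask).cumsum()

# Label the runs group_1, group_2, ... in order of appearance
# (hypothetical labels, chosen to mirror the question).
named = {
    f'group_{i}': g
    for i, (_, g) in enumerate(df[mask].groupby(key[mask]), start=1)
}

print(key[mask].tolist())   # key value of each matching row
for label, g in named.items():
    print(label, g.index.tolist())
```

A dict is used instead of a list so that each run can be looked up by label, for example `named['group_2']`. If runs shorter than some length should be skipped, filtering on `len(g)` inside the comprehension would be one way to do that.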