Home > Software engineering >  fetching rows with certain conditions and data framing them
fetching rows with certain conditions and data framing them

Time:09-28

Current Dataframe :

key        status           score
A           initial            2
A           in-between         3
A           intermediate       6
A           pre-final          2
A           final              3
B           initial            1
B           intermediate       7
B           final              8

I am looking to get two seperate dataframe , dataframe 1 should only include rows before the status is intermediate and dataframe 2 should only include rows after the status is intermediate

Expected Dataframes :

Dataframe 1 :

   key        status           score
    A           initial            2
    A           in-between         3
    B           initial            1

Dataframe 2 :

key        status           score
A           pre-final          2
A           final              3
B           final              8

Tried :

cond = df.status.eq("intermediate")
mask = cond.groupby(df.key).cummax().shift(fill_value=False)
dataframe_1 = df[~mask]
dataframe_2 = df[mask]  

with this i am not getting the required dataframes

CodePudding user response:

Remove shift for first mask and for second filter out intermediate values:

cond = df.status.eq("intermediate")
mask = cond.groupby(df.key).cummax()

dataframe_1 = df[~mask]
print (dataframe_1)
  key      status  score
0   A     initial      2
1   A  in-between      3
5   B     initial      1

dataframe_2 = df[mask & ~cond]
print (dataframe_2)
  key     status  score
3   A  pre-final      2
4   A      final      3
7   B      final      8

CodePudding user response:

You can use a groupby.cummin to set all rows from "intermediate" to False, then use boolean indexing:

m1 = df['status'].ne('intermediate')
m2 = m1.groupby(df['key']).cummin()

dataframe_1 = df[m2]
#   key      status  score
# 0   A     initial      2
# 1   A  in-between      3
# 5   B     initial      1

dataframe_2 = df[m1 & ~m2]
#   key        status  score
# 3   A     pre-final      2
# 4   A         final      3
# 7   B         final      8
  • Related