Home > Back-end >  Conditions for a group of values
Conditions for a group of values

Time:09-28

I'm trying to concatenate some conditions but it's getting quite hard to solve it, here's the problem:

I have a df with 3 columns:

| Column1 | Column2 | Column3 |
|:-------:|:-------:|:-------:|
| group1  | A       | x       |
| group1  | B       | p       |
| group1  | C       | y       |
| group2  | A       | x       |
| group2  | B       | j       |
| group2  | C       | y       |
| group3  | A       | x       |
| group3  | B       | p       |
| group3  | C       | z       |
| group4  | A       | h       |
| group4  | B       | p       |
| group4  | C       | y       |

So, what I'm trying to do is to check if for each group of values all the conditions are met. If one of them doesn't, all the group should be "KO". Conditions: -For groupx

  • A = x or h
  • B = p
  • C = y

The output should be like this:

| Column1 | Column2 | Column3 | Column4 | Column5 |
|:-------:|:-------:|:-------:|:-------:|:-------:|
| group1  | A       | x       | OK      | OK      |
| group1  | B       | p       | OK      | OK      |
| group1  | C       | y       | OK      | OK      |
| group2  | A       | x       | OK      | KO      |
| group2  | B       | j       | KO      | KO      |
| group2  | C       | y       | OK      | KO      |
| group3  | A       | x       | OK      | KO      |
| group3  | B       | p       | OK      | KO      |
| group3  | C       | z       | KO      | KO      |
| group4  | A       | h       | OK      | OK      |
| group4  | B       | p       | OK      | OK      |
| group4  | C       | y       | OK      | OK      |

In column4 we have result for each row of the df and in column5 we check that or each group all the conditions are satisfied.

Thanks a lot for the help!

CodePudding user response:

Try this... A simple function called in apply-lambda then for loop to get the desired results... Not sure this is the optimal way though...

df = pd.DataFrame({"Column1":[1,1,1,2,2,2,3,3,3,4,4,4],
                   "Column2":["A","B","C","A","B","C","A","B","C","A","B","C"],
                   "Column3":["x","p","y","x","j","y","x","p","z","h","p","y"]})
# Column4
def fun(x,y):
    if (x=='A' and y=='x') or (x=='A' and y=='h') or (x=='B' and y=='p') or (x=='C' and y=='y'):
        return "OK"
    else:
        return "KO"
df["Column4"] = df.apply(lambda x: fun(x["Column2"], x["Column3"]), axis=1)

# Column5
lst = []
for i in df["Column1"].unique():
    df_temp = df[df["Column1"] == i]
    if "KO" in list(df_temp["Column4"]):
        lst = lst   ["KO"]*df_temp.shape[0]
    else:
        lst = lst   ["OK"]*df_temp.shape[0]
df["Column5"] = lst

Hope this Helps...

# Output of df

    Column1 Column2 Column3 Column4 Column5
0         1       A       x      OK      OK
1         1       B       p      OK      OK
2         1       C       y      OK      OK
3         2       A       x      OK      KO
4         2       B       j      KO      KO
5         2       C       y      OK      KO
6         3       A       x      OK      KO
7         3       B       p      OK      KO
8         3       C       z      KO      KO
9         4       A       h      OK      OK
10        4       B       p      OK      OK
11        4       C       y      OK      OK

CodePudding user response:

Actually my answer is pretty similar to that @Sachin Koli's, but we took a different approach for Column5

import pandas as pd
df = pd.DataFrame({
    'Column1':['group1', 'group1','group1','group2','group2','group2','group3','group3','group3','group4','group4','group4'],
    'Column2':['A', 'B','C','A','B','C','A','B','C','A','B','C'],
    'Column3':['x', 'p','y','x','j','y','x','p','z','h','p','y']
    }, columns=['Column1', 'Column2', 'Column3', 'Column4', 'Column5'])

def helper(i, j):
    if (i=='A' and (j=='x' or j=='h')) or (i=='B' and j=='p') or (i=='C' and j=='y'):
        return 'OK'
    return 'KO'

df['Column4'] = df.apply(lambda x: helper(x[1], x[2]), axis=1)
    
for idx in df.index:    
    grp = df['Column1'].iloc[idx]
    df['Column5'].iloc[idx] = 'OK' if all(df[df['Column1']==grp]['Column4'] == 'OK') else 'KO'

UPDATE

You could use a list comprehension for Column5

df['Column5'] = ['OK' if all(df[df['Column1']==grp]['Column4'] == 'OK') else 'KO' for grp in df['Column1']]
  • Related