False option returning in np select?

Time:11-24

I wrote this np.select, but the AND operators don't seem to work!

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [2107], 'B': [76380700]})
cond = [(df["A"]==2107)|(df["A"]==6316)&(df['B']>=10000000)&(df['B']<=19969999),
    (df["A"]==2107)|(df["A"]==6316)&(df['B']>=1000000)&(df['B']<=99999999)]
choices    =["Return 1", "Return 2"]
df["C"] = np.select(cond, choices, default = df["A"])

np.select returns "Return 1", but the correct option is "Return 2":

>>df["C"]
0    Return 1

Because this line returns False:

>>df["B"]<=19969999
False

How can I solve this problem?

CodePudding user response:

It's an operator precedence issue: in Python, & binds more tightly than |. Here's what you wrote:

cond = [
    (df["A"]==2107) |
    (df["A"]==6316) &
    (df['B']>=10000000) &
    (df['B']<=19969999),

    (df["A"]==2107) |
    (df["A"]==6316) &
    (df['B']>=1000000) &
    (df['B']<=99999999)
]

Here's how that is interpreted:

cond = [
    (df["A"]==2107) |
    (
        (df["A"]==6316) &
        (df['B']>=10000000) &
        (df['B']<=19969999)
    ),

    (df["A"]==2107) |
    (
        (df["A"]==6316) &
        (df['B']>=1000000) &
        (df['B']<=99999999)
    )
]
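
The same grouping can be checked on plain booleans. This is only a minimal sketch: the variable names are made up for illustration and stand in for the four comparisons on the one-row frame from the question (A = 2107, B = 76380700).

```python
# Hypothetical stand-ins for the comparisons on the row A=2107, B=76380700.
a_is_2107 = True     # df["A"] == 2107
a_is_6316 = False    # df["A"] == 6316
b_ge_low = True      # df["B"] >= 10000000
b_le_high = False    # df["B"] <= 19969999

# & binds tighter than |, so this is
# a_is_2107 | (a_is_6316 & b_ge_low & b_le_high)
unparenthesized = a_is_2107 | a_is_6316 & b_ge_low & b_le_high

# Grouping the "or" first makes the range checks apply to both values of A.
parenthesized = (a_is_2107 | a_is_6316) & b_ge_low & b_le_high

print(unparenthesized)  # True
print(parenthesized)    # False
```

Without the extra parentheses, the range checks on B only guard the A == 6316 branch, so A == 2107 alone is enough to make the first condition True.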

You need parens around the "or" clause:

cond = [
    ( (df["A"]==2107) | (df["A"]==6316) ) &
        (df['B']>=10000000) &
        (df['B']<=19969999),

    ( (df["A"]==2107) | (df["A"]==6316) ) &
        (df['B']>=1000000) &
        (df['B']<=99999999)
]

And, by the way, there is absolutely nothing wrong with writing the expressions like I did there. Isn't it much more clear what's going on when it's spaced out like that?

CodePudding user response:

I think you were missing parentheses around (df["A"]==2107)|(df["A"]==6316). In your script, the condition for "Return 1" was (df["A"]==2107)|(df["A"]==6316)&(df['B']>=10000000)&(df['B']<=19969999), which means A == 2107 OR (A == 6316 & B... & B...). That's why np.select returns "Return 1": the first condition is True.

df = pd.DataFrame({'A': [2107], 'B': [76380700]})
cond = [((df["A"]==2107)|(df["A"]==6316))&(df['B']>=10000000)&(df['B']<=19969999),
    ((df["A"]==2107)|(df["A"]==6316))&(df['B']>=1000000)&(df['B']<=99999999)]
choices    =["Return 1", "Return 2"]
df["C"] = np.select(cond, choices, default = df["A"])
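
One way to sidestep the precedence question altogether might be to avoid | entirely. The following is a sketch, not the only fix: it uses Series.isin for the "A is 2107 or 6316" test and Series.between (inclusive on both ends by default) for the range checks. The name a_match and the cast of the default to str are assumptions added for this example and are not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2107], "B": [76380700]})

# isin covers the "A is 2107 or 6316" test without any | operator.
a_match = df["A"].isin([2107, 6316])

# between is inclusive on both ends by default, matching >= and <=.
cond = [
    a_match & df["B"].between(10000000, 19969999),
    a_match & df["B"].between(1000000, 99999999),
]
choices = ["Return 1", "Return 2"]

# Assumption: the default is cast to str so it shares a type with the
# string choices; recent NumPy versions may reject mixing str and int here.
df["C"] = np.select(cond, choices, default=df["A"].astype(str))

print(df["C"].tolist())  # ['Return 2']
```

With only a single & between two terms in each condition, there is no mixed-operator expression left to misread.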