Pandas: set var using multiple conditions

Time:09-25

Given the following dataframe:

import pandas as pd

df = pd.DataFrame({"code": ["codeA",
                            "Codeb",
                            "codeB",
                            "codea",
                            "N/A",
                            "N/A"], 
                   "warehouse": [20, 
                                 30, 
                                 10,
                                 30,
                                 10,
                                 70]})

I need to set the value in a column according to three conditions:

  1. value is codeA: set it to "product A"
  2. value is codeB: set it to "product B"
  3. value is anything other than codeA or codeB: set it to ""

Pseudocode:

# account for case: make case insensitive
if value REGEX '(?i)codeA':
   value = "product A"
else if value REGEX '(?i)codeB':
   value = "product B"
else
   value = ""

Would I use a function with apply?

I can handle the first two like this:

# assign the result back instead of using inplace=True on a column selection
df['code'] = df['code'].replace(to_replace="(?i)CodeA", value="Product A", regex=True)
df['code'] = df['code'].replace(to_replace="(?i)CodeB", value="Product B", regex=True)

However, I'm stuck on how to say "if it doesn't match either, set it to an empty string". I'm also wondering whether there is a more efficient way to do this with an "else" clause.

NOTE: The ideal solution would account for human error in the input, e.g. by being case insensitive. I already strip the values beforehand to remove leading and trailing spaces.
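
On the apply question: the pseudocode above can be written as an ordinary function and passed to apply. This is only a minimal sketch, assuming each condition should match the whole (already stripped) value rather than a substring, and writing the result to a hypothetical product_name column so the original codes stay visible:

```python
import re

import pandas as pd

df = pd.DataFrame({
    "code": ["codeA", "Codeb", "codeB", "codea", "N/A", "N/A"],
    "warehouse": [20, 30, 10, 30, 10, 70],
})


def to_product(value):
    """Translate a single code into a product name, ignoring case."""
    # Assumption: fullmatch is used so that e.g. "codeAB" does not count as codeA.
    if re.fullmatch(r"codeA", value, flags=re.IGNORECASE):
        return "product A"
    elif re.fullmatch(r"codeB", value, flags=re.IGNORECASE):
        return "product B"
    # The "else" branch: anything that matched neither pattern.
    return ""


# product_name is a hypothetical column name chosen for this sketch.
df["product_name"] = df["code"].apply(to_product)
print(df)
```

apply calls the function once per value, so it is easy to read but usually slower than vectorized string methods on large frames.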

CodePudding user response:

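Use a dict mapping: normalize the codes first, then map them, and fill anything that has no entry in the dict with an empty string.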

d = {'codea': 'Product A', 'codeb': 'Product B'}
df['code'] = df['code'].str.replace(' ', '').str.casefold()
df['code'] = df['code'].map(d).fillna('')

Output:

        code  warehouse
0  Product A         20
1  Product B         30
2  Product B         10
3  Product A         30
4                    10
5                    70
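
If an explicit if / else-if / else structure is preferred, numpy.select may be a closer fit: it takes a list of conditions, a list of matching values, and a default for rows where no condition holds. The following is a sketch rather than part of the answer above; it assumes whole-value, case-insensitive matching via str.fullmatch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "code": ["codeA", "Codeb", "codeB", "codea", "N/A", "N/A"],
    "warehouse": [20, 30, 10, 30, 10, 70],
})

# Strip stray whitespace once, then build one boolean mask per condition.
codes = df["code"].str.strip()
conditions = [
    codes.str.fullmatch("codeA", case=False),  # the "if" branch
    codes.str.fullmatch("codeB", case=False),  # the "else if" branch
]
choices = ["Product A", "Product B"]

# default plays the role of the final "else".
df["code"] = np.select(conditions, choices, default="")
print(df)
```

The conditions are checked in order, so if two masks are both true for a row, the first one wins.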

CodePudding user response:

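A more general approach to setting a column value based on a mapping from another column is to use map. Note that map matches the dict keys exactly, so the example below assumes the codes have already been normalized to codeA and codeB; values without a key become NaN.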

mapping = {'codeA': 'product A', 'codeB': 'product B'}
df['mapped_product'] = df.code.map(mapping)
df

Out

    code  warehouse mapped_product
0  codeA         20      product A
1  codeB         30      product B
2  codeB         10      product B
3  codeA         30      product A
4    N/A         10            NaN
5    N/A         70            NaN

If all you're doing is a string replace, you could do it this way:

df['mapped_product'] = df.code.str.replace('code', 'product ')
df

Out

    code  warehouse mapped_product
0  codeA         20      product A
1  codeB         30      product B
2  codeB         10      product B
3  codeA         30      product A
4    N/A         10            N/A
5    N/A         70            N/A
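
The plain string replace above is case sensitive and leaves non-matching values such as N/A unchanged. A hedged variant, assuming the codes always have the form "code" plus a single letter A or B, could use a case-insensitive regex and then blank out everything that did not match:

```python
import pandas as pd

df = pd.DataFrame({
    "code": ["codeA", "Codeb", "codeB", "codea", "N/A", "N/A"],
    "warehouse": [20, 30, 10, 30, 10, 70],
})

codes = df["code"].str.strip()

# True only for values that are exactly codeA or codeB, in any letter case.
is_known = codes.str.fullmatch(r"code[ab]", case=False)

# Rewrite "code<letter>" as "product <LETTER>"; other values pass through unchanged.
renamed = codes.str.replace(
    r"(?i)^code([ab])$",
    lambda m: "product " + m.group(1).upper(),
    regex=True,
)

# Keep the renamed value where the code was recognized, otherwise use "".
df["mapped_product"] = renamed.where(is_known, "")
print(df)
```

Series.where keeps values where the mask is true and substitutes the second argument elsewhere, which is what supplies the "else" case here.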