Home > Mobile >  How to conditionally concat 2 columns in Python Pandas Dataframe
How to conditionally concat 2 columns in Python Pandas Dataframe

Time:09-14

I have the following dataframe:

d = {'col1': [1, "Avoid", 3, "Avoid"], 'col2': ["AA", "BB", "Avoid", "Avoid"]}
df = pd.DataFrame(data=d)
df

    col1    col2
0   1       AA
1   Avoid   BB
2   3       Avoid
3   Avoid   Avoid

I have to conditionally concat col1 and col2 into col3. Conditions:

  • Only concat 2 columns as long as none of them is Avoid.
  • If any of col1 and col2 is Avoid, col3 will be equal to Avoid as well.
  • When performing concatination, " & " needs to be added between column values in col3. For instance, first row of col3 will be "1 & AA".

The end result is supposed to look as the following:

    col1    col2    col3
0   1       AA      1 & AA
1   Avoid   BB      Avoid
2   3       Avoid   Avoid
3   Avoid   Avoid   Avoid

How can I do this without dealing with for loops?

CodePudding user response:

try this:

df["col3"]=df.apply(lambda x:"Avoid" if x["col1"]=="Avoid" or x["col2"]=="Avoid" else f"{x['col1']} & {x['col2']}",axis=1)

CodePudding user response:

df["col3"] = df["col1"]   " & "   df["col2"]
df["col3"] = df["col3"].apply(lambda x: "Avoid" if x.contains("Avoid") else x)

CodePudding user response:

Use boolean operations, this enables you to use an arbitrary number of columns:

# is any value in the row "Avoid"?
m = df.eq('Avoid').any(1)

# concatenate all columns unless there was a "Avoid"
df['col3'] = df.astype(str).agg(' & '.join, axis=1).mask(m, 'Avoid')

Alternative that should be faster if you have many rows and few with "Avoid":

m = df.ne('Avoid').all(1)

df.loc[m, 'col3'] = df[m].astype(str).agg(' & '.join, axis=1)

df['col3'] = df['col3'].fillna('Avoid')

output:

    col1   col2    col3
0      1     AA  1 & AA
1  Avoid     BB   Avoid
2      3  Avoid   Avoid
3  Avoid  Avoid   Avoid

CodePudding user response:

Is not an efficient way to work with pandas, but if you can't change the data structure these are solutions::

Solution 1:

def custom_merge(cols):
    if cols["col1"]=="Avoid" or cols["col2"]=="Avoid":
        return "Avoid"
    else:
        return f"{cols['col1']} & {cols['col2']}"
    
df['col3'] = df.apply(custom_merge, axis=1)

Solution 2:

df['col3'] = (df["col1"].astype(str)   " & "   df["col2"].astype(str)).apply(lambda x: "Avoid" if 'Avoid' in x else x)

Both solutions results in the following:

    col1    col2    col3
0   1       AA      1 & AA
1   Avoid   BB      Avoid
2   3       Avoid   Avoid
3   Avoid   Avoid   Avoid

Execution Time comparison

[in progress]

  • Related