Python DataFrame add a new columns based on multiple columns condition-CodePudding

I would like to add a new column called XXX based on the last letter in the "indicator" column if:

indicator ends with an S -> Use the value from 'Vendor 5 Ask Price XXX'

indicator ends with an B -> Use the value from 'Vendor 5 Bid Price XXX'

so the new column would be : XXX : [100,nan,107,103]

df = {'indicator': ['45346B','24536S','34636S','657363B'],
'Vendor 5 Bid Price XXX' : [100,None,102,103],
'Vendor 5 Ask Price XXX' : [105,None,107,108]}

pd.DataFrame(df)

  indicator  Vendor 5 Bid Price XXX  Vendor 5 Ask Price XXX
0    45346B                   100.0                   105.0
1    24536S                     NaN                     NaN
2    34636S                   102.0                   107.0
3   657363B                   103.0                   108.0

CodePudding user response：

What about

df[‘XXX’] = df.apply(
    lambda row: row[‘Vendor 5 Ask Price XXX’] if row[‘indicator’].ends with(‘S’) else row[‘Vendor 5 Bid Price XXX’],
    axis=1
)

.apply(…, axis=1) will apply the function to every row. The lambda function is just the implementation of the switch logic you mentioned and can be more complex if needed.

CodePudding user response：

Assuming the indicator column only ends in B or S, you can use numpy.where, using the Bid Price if the indicator ends with B, otherwise the Ask Price:

df['XXX'] = np.where(df['indicator'].str.endswith('B'), df['Vendor 5 Bid Price XXX'], df['Vendor 5 Ask Price XXX'])

Output:

  indicator  Vendor 5 Bid Price XXX  Vendor 5 Ask Price XXX    XXX
0    45346B                   100.0                   105.0  100.0
1    24536S                     NaN                     NaN    NaN
2    34636S                   102.0                   107.0  107.0
3   657363B                   103.0                   108.0  103.0