Python DataFrame: add a new column based on a condition across multiple columns

Time:05-27

I would like to add a new column called XXX whose value depends on the last letter of the "indicator" column:

indicator ends with an S -> use the value from 'Vendor 5 Ask Price XXX'

indicator ends with a B -> use the value from 'Vendor 5 Bid Price XXX'

So the new column would be: XXX : [100, nan, 107, 103]

import pandas as pd

df = pd.DataFrame({'indicator': ['45346B','24536S','34636S','657363B'],
                   'Vendor 5 Bid Price XXX': [100,None,102,103],
                   'Vendor 5 Ask Price XXX': [105,None,107,108]})

print(df)

  indicator  Vendor 5 Bid Price XXX  Vendor 5 Ask Price XXX
0    45346B                   100.0                   105.0
1    24536S                     NaN                     NaN
2    34636S                   102.0                   107.0
3   657363B                   103.0                   108.0

CodePudding user response:

What about

# Pick the Ask price when the indicator ends with 'S', otherwise the Bid price
df['XXX'] = df.apply(
    lambda row: row['Vendor 5 Ask Price XXX'] if row['indicator'].endswith('S') else row['Vendor 5 Bid Price XXX'],
    axis=1
)

.apply(…, axis=1) calls the function once for every row. The lambda implements the selection logic you described and can be replaced with a more complex function if needed. Note that any indicator that does not end with S falls through to the Bid price.
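If the logic grows, the lambda can be swapped for a named function. The following is only a minimal sketch: pick_price is a hypothetical helper name, and returning NaN for indicators that end in neither S nor B is an assumption, not something the question asks for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'indicator': ['45346B', '24536S', '34636S', '657363B'],
    'Vendor 5 Bid Price XXX': [100, None, 102, 103],
    'Vendor 5 Ask Price XXX': [105, None, 107, 108],
})

def pick_price(row):
    """Hypothetical helper: choose a price based on the indicator's last letter."""
    if row['indicator'].endswith('S'):
        return row['Vendor 5 Ask Price XXX']
    if row['indicator'].endswith('B'):
        return row['Vendor 5 Bid Price XXX']
    # Assumption: anything else gets NaN rather than silently using the Bid price
    return np.nan

# axis=1 passes each row to pick_price as a Series
df['XXX'] = df.apply(pick_price, axis=1)
print(df)
```

On larger frames a row-wise apply tends to be slower than a vectorized approach, so this form is mainly useful when the per-row logic is hard to express otherwise.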

CodePudding user response:

Assuming the indicator column only ends in B or S, you can use numpy.where, using the Bid Price if the indicator ends with B, otherwise the Ask Price:

import numpy as np
df['XXX'] = np.where(df['indicator'].str.endswith('B'), df['Vendor 5 Bid Price XXX'], df['Vendor 5 Ask Price XXX'])

Output:

  indicator  Vendor 5 Bid Price XXX  Vendor 5 Ask Price XXX    XXX
0    45346B                   100.0                   105.0  100.0
1    24536S                     NaN                     NaN    NaN
2    34636S                   102.0                   107.0  107.0
3   657363B                   103.0                   108.0  103.0
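
If the assumption that the indicator only ends in B or S might not hold, numpy.select lets you spell out each condition and supply a default. This is a sketch rather than part of the answer above: the extra '99999X' row and its prices are made up for illustration, and using NaN as the default is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    # The last row ('99999X') is hypothetical, added to show the default branch
    'indicator': ['45346B', '24536S', '34636S', '657363B', '99999X'],
    'Vendor 5 Bid Price XXX': [100, None, 102, 103, 110],
    'Vendor 5 Ask Price XXX': [105, None, 107, 108, 115],
})

# One boolean mask per rule, checked in order
conditions = [
    df['indicator'].str.endswith('B'),
    df['indicator'].str.endswith('S'),
]
# The column to take values from when the matching mask is True
choices = [
    df['Vendor 5 Bid Price XXX'],
    df['Vendor 5 Ask Price XXX'],
]

# Rows matching neither condition get NaN (an assumed default)
df['XXX'] = np.select(conditions, choices, default=np.nan)
print(df)
```

With the original four rows this gives the same XXX values as the numpy.where version; the difference only shows up for indicators with another suffix.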