Home > OS >  Updating column values based on multiple conditions
Updating column values based on multiple conditions

Time:10-31

In the DataFrame below

df = pd.DataFrame([('Ve_Paper', 'Buy', '-','Canada',np.NaN),
                  ('Ve_Gasoline', 'Sell', 'Done','Britain',np.NaN),
                  ('Ve_Water', 'Sell','-','Canada,np.NaN),
                  ('Ve_Plant', 'Buy', 'Good','China',np.NaN),
                  ('Ve_Soda', 'Sell', 'Process','Germany',np.NaN)], columns=['Name', 'Action','Status','Country','Value'])

I am trying to update the Value column based on the condition if Action is Sell check if the Status is not - if that is true then the first two characters of the Country needs to be updated as the Value column else if Status column is - the Value column needs to be updated with the Name column without the characters Ve_, if Action is not Sell leave the Value column as np.NaN

What I have tried so far is

import numpy as np
df['Value'] = np.where(df['Action']== 'Sell',df['Country'].str[:2] if df['Status'].str != '-' else df['Name'].str[3:],df['Value'])

but I am getting the output as <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0> wherever Iam trying to extract substrings so the output looks like this

       Name     Action    Status      Country       Value
   Ve_Paper        Buy         -       Canada      np.NaN
Ve_Gasoline       Sell      Done      Britain  <pandas.core.strings.StringMethods object at 662B0>
   Ve_Water       Sell         -       Canada  <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
   Ve_Plant        Buy      Good        China      np.NaN  
    Ve_Soda       Sell    Process     Germany  <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>

But the output I am expecting is

       Name     Action    Status      Country       Value
   Ve_Paper        Buy         -       Canada      np.NaN # Because Action is not Sell
Ve_Gasoline       Sell      Done      Britain          Br # The first two characters of Country Since Action is sell and Status is not "-"
   Ve_Water       Sell         -       Canada       Water # The Name value without 'Ve_' since Action is Sell and the Status is '-'
   Ve_Plant        Buy      Good        China      np.NaN  
    Ve_Soda       Sell    Process     Germany          Ge

As mentioned in the answer below the method I tried works for someone but doesn't work for me. Is there any alternative approaches or better approaches than what I have tried, since that is not working for me

CodePudding user response:

enter image description here

I do not see any problem in here. Could you please elaborate it more if possible?

I am sharing my piece of code below:

import pandas as pd
import numpy as np
df = pd.DataFrame([('Ve_Paper', 'Buy', '-','Canada',np.NaN),
                  ('Ve_Gasoline', 'Sell', 'Done','Britain',np.NaN),
                  ('Ve_Water', 'Sell','-','Canada',np.NaN),
                  ('Ve_Plant', 'Buy', 'Good','China',np.NaN),
                  ('Ve_Soda', 'Sell', 'Process','Germany',np.NaN)], columns=['Name', 'Action','Status','Country','Value'])

df['Value'] = np.where(df['Action']== 'Sell',df['Country'].str[:2] if df['Status'].str != '-' else df['Name'].str[3:],df['Value'])
print(df)

You can also try with the below code:

df['Value'] = np.where(df['Action']== 'Sell', df['Country'].astype(str).str[0:2], df['Value'])

CodePudding user response:

Using np.select was a great choice

conda = df.Action.eq('Sell')
condb = df.Status.eq('-')
df['value'] = np.select([conda&condb, conda&~condb],
                        [df.Name.str.split('_').str[1],df.Country.str[:2]],
                        default = np.nan)
df
Out[343]: 
          Name Action   Status  Country  Value  value
0     Ve_Paper    Buy        -   Canada    NaN    NaN
1  Ve_Gasoline   Sell     Done  Britain    NaN     Br
2     Ve_Water   Sell        -   Canada    NaN  Water
3     Ve_Plant    Buy     Good    China    NaN    NaN
4      Ve_Soda   Sell  Process  Germany    NaN     Ge
  • Related