Home > OS >  Filling column based on conditions
Filling column based on conditions

Time:10-31

In the DataFrame below

df = pd.DataFrame([('Ve_Paper', 'Buy', '-','Canada',np.NaN),
                  ('Ve_Gasoline', 'Sell', 'Done','Britain',np.NaN),
                  ('Ve_Water', 'Sell','-','Canada,np.NaN),
                  ('Ve_Plant', 'Buy', 'Good','China',np.NaN),
                  ('Ve_Soda', 'Sell', 'Process','Germany',np.NaN)], columns=['Name', 'Action','Status','Country','Value'])

I am trying to update the Value column based on the following conditions, if Action is Sell check if the Status is not - if both the conditions are true then the first two characters of the Country needs to be updated as the Value column else if Status column is - and the Action column is Sell the Value column needs to be updated with the Name column without the characters Ve_, if Action is not Sell leave the Value column as np.NaN

But the output I am expecting is

       Name     Action    Status      Country       Value
   Ve_Paper        Buy         -       Canada      np.NaN # Because Action is not Sell
Ve_Gasoline       Sell      Done      Britain          Br # The first two characters of Country Since Action is sell and Status is not "-"
   Ve_Water       Sell         -       Canada       Water # The Name value without 'Ve_' since Action is Sell and the Status is '-'
   Ve_Plant        Buy      Good        China      np.NaN  
    Ve_Soda       Sell    Process     Germany          Ge

I have tried with np.where and df.loc both didn't work. Please do help me because I am out of options now

What I have tried so far is

import numpy as np
df['Value'] = np.where(df['Action']== 'Sell',df['Country'].str[:2] if df['Status'].str != '-' else df['Name'].str[3:],df['Value'])

but I am getting the output as <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0> wherever Iam trying to extract substrings so the output looks like this

       Name     Action    Status      Country       Value
   Ve_Paper        Buy         -       Canada      np.NaN
Ve_Gasoline       Sell      Done      Britain  <pandas.core.strings.StringMethods object at 662B0>
   Ve_Water       Sell         -       Canada  <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
   Ve_Plant        Buy      Good        China      np.NaN  
    Ve_Soda       Sell    Process     Germany  <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>

CodePudding user response:

You have two conditions ,we can do it with np.select

conda = df.Action.eq('Sell')
condb = df.Status.eq('-')
df['value'] = np.select([conda&condb, conda&~condb],
                        [df.Name.str.split('_').str[1],df.Country.str[:2]],
                        default = np.nan)
df
Out[343]: 
          Name Action   Status  Country  Value  value
0     Ve_Paper    Buy        -   Canada    NaN    NaN
1  Ve_Gasoline   Sell     Done  Britain    NaN     Br
2     Ve_Water   Sell        -   Canada    NaN  Water
3     Ve_Plant    Buy     Good    China    NaN    NaN
4      Ve_Soda   Sell  Process  Germany    NaN     Ge
  • Related