In the DataFrame
below
df = pd.DataFrame([('Ve_Paper', 'Buy', '-','Canada',np.NaN),
('Ve_Gasoline', 'Sell', 'Done','Britain',np.NaN),
('Ve_Water', 'Sell','-','Canada,np.NaN),
('Ve_Plant', 'Buy', 'Good','China',np.NaN),
('Ve_Soda', 'Sell', 'Process','Germany',np.NaN)], columns=['Name', 'Action','Status','Country','Value'])
I am trying to update the Value
column based on the following conditions, if Action
is Sell
check if the Status
is not -
if both the conditions are true then the first two characters of the Country
needs to be updated as the Value
column else if Status
column is -
and the Action column is Sell
the Value
column needs to be updated with the Name
column without the characters Ve_
, if Action
is not Sell
leave the Value
column as np.NaN
But the output
I am expecting is
Name Action Status Country Value
Ve_Paper Buy - Canada np.NaN # Because Action is not Sell
Ve_Gasoline Sell Done Britain Br # The first two characters of Country Since Action is sell and Status is not "-"
Ve_Water Sell - Canada Water # The Name value without 'Ve_' since Action is Sell and the Status is '-'
Ve_Plant Buy Good China np.NaN
Ve_Soda Sell Process Germany Ge
I have tried with np.where and df.loc both didn't work. Please do help me because I am out of options now
What I have tried so far is
import numpy as np
df['Value'] = np.where(df['Action']== 'Sell',df['Country'].str[:2] if df['Status'].str != '-' else df['Name'].str[3:],df['Value'])
but I am getting the output as <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
wherever Iam trying to extract substrings
so the output looks like this
Name Action Status Country Value
Ve_Paper Buy - Canada np.NaN
Ve_Gasoline Sell Done Britain <pandas.core.strings.StringMethods object at 662B0>
Ve_Water Sell - Canada <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
Ve_Plant Buy Good China np.NaN
Ve_Soda Sell Process Germany <pandas.core.strings.StringMethods object at 0x000001EDB8F662B0>
CodePudding user response:
You have two conditions ,we can do it with np.select
conda = df.Action.eq('Sell')
condb = df.Status.eq('-')
df['value'] = np.select([conda&condb, conda&~condb],
[df.Name.str.split('_').str[1],df.Country.str[:2]],
default = np.nan)
df
Out[343]:
Name Action Status Country Value value
0 Ve_Paper Buy - Canada NaN NaN
1 Ve_Gasoline Sell Done Britain NaN Br
2 Ve_Water Sell - Canada NaN Water
3 Ve_Plant Buy Good China NaN NaN
4 Ve_Soda Sell Process Germany NaN Ge