My df looks something like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'entity':['BSS'],
'activity':[np.nan],
'division':['BSS'],
'site':['Tanger']})
I want to take the value of column 'division', which is 'BSS', and put it into column 'activity', but only where 'division' is 'BSS'. I managed that with df.loc[df['division'] == 'BSS', 'activity'] = 'BSS'
While doing that, I also want to put the value 'europe' into column 'division', but only where column 'site' is 'Tanger'.
I tried if and else, but it didn't work:
# if division == "BSS" and site == "Tanger":
# df["Activity"].fillna("BSS") and df["Division"].replace("Europe")
Any suggestions?
CodePudding user response:
You have to make it a two-step process, and be sure what result you want, because the order is important. Suppose your df has one row:
import pandas as pd
import numpy as np
df = pd.DataFrame({'entity':['BSS'],
'activity':[np.nan],
'division':['BSS'],
'site':['Tanger']})
Now, if you apply two .loc assignments, the order of the steps changes the output:
df.loc[df['division'] == 'BSS', 'activity'] = 'BSS'
df.loc[df['site'] == 'Tanger', 'division'] = 'Europe'
output1:
entity BSS
activity BSS
division Europe
site Tanger
On the other hand, with the steps reversed:
df.loc[df['site'] == 'Tanger', 'division'] = 'Europe'
df.loc[df['division'] == 'BSS', 'activity'] = 'BSS'
output2 (activity stays NaN because division is no longer 'BSS' when the second line runs):
entity BSS
activity NaN
division Europe
site Tanger
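If both updates should be decided from the original values, one option is to compute the two conditions before assigning anything, so the order of the assignments no longer matters. This is only a sketch under a few assumptions: the names is_bss and is_tanger are made up for illustration, and 'activity' is created with object dtype (unlike the df above) so that assigning a string into it does not trigger a dtype warning on newer pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'entity': ['BSS'],
                   # object dtype is an assumption, so a string can be assigned later
                   'activity': pd.Series([np.nan], dtype=object),
                   'division': ['BSS'],
                   'site': ['Tanger']})

# Evaluate both conditions against the original data first
is_bss = df['division'] == 'BSS'
is_tanger = df['site'] == 'Tanger'

# The assignments can now run in either order
df.loc[is_tanger, 'division'] = 'Europe'
df.loc[is_bss, 'activity'] = 'BSS'

print(df.iloc[0])
```

With the masks computed up front, the row ends up with activity 'BSS' and division 'Europe', the same as output1, even though 'division' is overwritten first.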
CodePudding user response:
If I understand correctly, this should work:
import pandas as pd

my_df = pd.DataFrame({
    'entity': ['BSS - Tanger'],
    'activity': [''],
    'division': ['BSS'],
    'site': ['Tanger'],
})

# First part: activity becomes 'BSS' where division is 'BSS', '' otherwise
def my_function(x):
    if x == 'BSS':
        return 'BSS'
    return ''

my_df.loc[:, 'activity'] = list(map(
    lambda x: my_function(x),
    my_df['division'].tolist()
))

# Second part: division becomes 'europe' where site is 'Tanger',
# otherwise it keeps its previous value
def my_function_2(x, prev_value):
    if x == 'Tanger':
        return 'europe'
    return prev_value

my_df.loc[:, 'division'] = list(map(
    lambda x, prev_value: my_function_2(x, prev_value),
    my_df['site'].tolist(),
    my_df['division'].tolist(),
))
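The same two steps can probably be written without the helper functions, using Series.where and Series.mask. This is just a sketch, not a drop-in replacement: the second row ('Other', 'Sales', 'Paris') is made-up data, added only to show what happens to rows that do not match.

```python
import pandas as pd

my_df = pd.DataFrame({
    'entity': ['BSS - Tanger', 'Other'],   # second row is hypothetical
    'activity': ['', ''],
    'division': ['BSS', 'Sales'],
    'site': ['Tanger', 'Paris'],
})

# First part: keep division where it equals 'BSS', use '' everywhere else
my_df['activity'] = my_df['division'].where(my_df['division'] == 'BSS', '')

# Second part: replace division with 'europe' where site is 'Tanger',
# keep the previous value otherwise
my_df['division'] = my_df['division'].mask(my_df['site'] == 'Tanger', 'europe')

print(my_df)
```

Here where keeps the values for which the condition is True and mask replaces them, so each step matches one of the two functions above. As in the function version, activity is computed before division is changed.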