python/pandas: changing column values conditionally based on several columns

Time:09-29

My df looks something like this:

import pandas as pd
import numpy as np

df = pd.DataFrame({'entity':['BSS'],
                   'activity':[np.nan],
                   'division':['BSS'],
                   'site':['Tanger']})

I want to copy the value of column 'division' into column 'activity', but only where division is 'BSS'. I managed that with df.loc[df['division'] == 'BSS', 'activity'] = 'BSS', but in the same pass I also want to put the value 'europe' into column 'division', but only where column 'site' is 'Tanger'.

I tried if/else, but it didn't work:

# if division == "BSS" and site == "Tanger":
#    df["Activity"].fillna("BSS") and df["Division"].replace("Europe")

Any suggestions?

CodePudding user response:

You have to do it in two steps, and be clear about the result you want, because the order matters. Suppose one row of your df is:

import pandas as pd
import numpy as np


df = pd.DataFrame({'entity':['BSS'],
                   'activity':[np.nan],
                   'division':['BSS'],
                   'site':['Tanger']})

Now, if you apply the two .loc assignments, the order of the steps changes the output:

df.loc[df['division'] == 'BSS', 'activity'] = 'BSS'
df.loc[df['site'] == 'Tanger', 'division'] = 'Europe'

output1:

entity         BSS
activity       BSS
division    Europe
site        Tanger

On the other hand, if division is overwritten first, the 'BSS' condition no longer matches and activity stays NaN:

df.loc[df['site'] == 'Tanger', 'division'] = 'Europe'
df.loc[df['division'] == 'BSS', 'activity'] = 'BSS'

output2:

entity         BSS
activity       NaN
division    Europe
site        Tanger
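
If both conditions are meant to be judged against the original values, one way to make the order irrelevant is to compute both boolean masks before assigning anything. This is only a minimal sketch reusing the one-row df from above; the names is_bss and is_tanger are illustrative, and casting 'activity' to object is an assumption made here so that writing a string into an all-NaN column does not trigger a dtype warning in newer pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'entity': ['BSS'],
                   'activity': [np.nan],
                   'division': ['BSS'],
                   'site': ['Tanger']})

# Assumption: store 'activity' as object so it can hold strings cleanly.
df['activity'] = df['activity'].astype('object')

# Evaluate both conditions on the untouched data first (illustrative names).
is_bss = df['division'] == 'BSS'
is_tanger = df['site'] == 'Tanger'

# The masks are already fixed, so these assignments can run in either order.
df.loc[is_tanger, 'division'] = 'Europe'
df.loc[is_bss, 'activity'] = 'BSS'

print(df.iloc[0])
```

With the masks captured up front, activity ends up as 'BSS' and division as 'Europe', as in output1, even though division is overwritten first.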

CodePudding user response:

If I understand correctly, this should work:

import pandas as pd
my_df = pd.DataFrame({
    'entity': ['BSS - Tanger'],
    'activity': [''],
    'division': ['BSS'],
    'site': ['Tanger'],
})

# First part
def my_function(x):
    if x == 'BSS':
        return 'BSS'
    return ''

# Apply my_function to every value in 'division'
my_df.loc[:, 'activity'] = list(map(my_function, my_df['division'].tolist()))

# second part 
def my_function_2(x, prev_value):
    if x == 'Tanger':
        return 'europe'
    return prev_value

# Pair each 'site' value with the current 'division' value
my_df.loc[:, 'division'] = list(map(
    my_function_2,
    my_df['site'].tolist(),
    my_df['division'].tolist(),
))
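
The same two steps can probably be written without helper functions by using vectorized calls. The sketch below uses numpy.where for the first part and Series.mask for the second; it is an alternative phrasing, not the code above. The second row ('Other' / 'Paris') is hypothetical data added only to show a non-matching case.

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame({
    'entity': ['BSS - Tanger', 'Other - Paris'],  # second row is hypothetical
    'activity': ['', ''],
    'division': ['BSS', 'Other'],
    'site': ['Tanger', 'Paris'],
})

# First part: 'BSS' where division is 'BSS', empty string otherwise.
my_df['activity'] = np.where(my_df['division'] == 'BSS', 'BSS', '')

# Second part: replace division with 'europe' where site is 'Tanger',
# keeping the previous value everywhere else.
my_df['division'] = my_df['division'].mask(my_df['site'] == 'Tanger', 'europe')

print(my_df)
```

As in the function-based version, activity is filled before division is overwritten, so the 'BSS' check still sees the original division values.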
