Home > database >  change multiple dataframe columns based on list of users
change multiple dataframe columns based on list of users

Time:12-06

Let's say I have this dataframe in pandas:

data = {'user':['Apple', 'One', 'Tom'], 
        'col_1':[1, 2, 3],
        'col_2':[4, 5, 6],
        'col_3':[7, 8, 9]
}

df = pd.DataFrame(data)

Which outputs:

user col_1 col_2 col_3
Apple 1 2 3
One 4 5 6
Tom 7 8 9

Now, here is the deal: I want to modify the value of columns col_1, col_2 and col_3 if the user is within a specific list of users. Otherwise, I have to keep the current value of the columns.

My proposal:

First, I created a dictionary, where the keys show the users that need modifications. Also, for each user, I created a list of the new values (col_1, col_2 and col_3, respectively) This way:

my_dict = {
    'one'  : ['col_1_value_one', 'col_2_value_one', 'col_3_value_one'],
    'two'  : ['col_1_value_two', 'col_2_value_two', 'col_3_value_two'],
    'apple': ['col_1_value_apple', 'col_2_value_apple', 'col_3_value_apple']
}

Then, I created a list based on the dict keys:

list_users = list(my_dict.keys())

Finally, I was able to verify whether the user was in fact within the list of keys or not by using np.where, containsand some regex (that is why I joined the list with "|" and used lower()). It works properly when keeping the "old" value if the user is not present in the list.

df['col_1'] = np.where(df['user'].str.lower().str.contains('|'.join(list_users), na=False, regex=True), 
                                        "<I need to replace this!>", 
                                        df['col_1'])
df['col_2'] = np.where(df['user'].str.lower().str.contains('|'.join(list_users), na=False, regex=True), 
                                        "<I need to replace this!>", 
                                        df['col_2'])
df['col_3'] = np.where(df['user'].str.lower().str.contains('|'.join(list_users), na=False, regex=True), 
                                        "<I need to replace this!>", 
                                        df['col_3'])    

However, as you can see I couldn't find a way to replace the values for the ones present in the dictionary values if the user is in fact within the list. For example, if the user is "Apple", then the new values for col_1, col_2 and col_3 are:

  • col_1_value_apple
  • col_2_value_apple
  • col_3_value_apple

Expected output based on example:

user col_1 col_2 col_3
Apple col_1_value_apple col_2_value_apple col_3_value_apple
One col_1_value_one col_2_value_one col_3_value_one
Tom 7 8 9

Here, both "Apple" and "One" users are modified, whilst "Tom" is not in the list, therefore keeping their original values.

I am sure this is not the optimal way to solve this, but any help would be highly appreciated!! Thanks!

CodePudding user response:

This works:

for name, cols in my_dict.items():
    df.loc[df['user'].str.lower() == name, ['col_1', 'col_2', 'col_3']] = cols
  • Related