How to change values within a column, based on a condition, in a DataFrame with multi-index

Time:01-14

My current DF looks like:

                            column 1   column 2   column 3
      user_id      date
       5678     2022-01-01    0.0        1.5          0.0
       6253     2022-01-14    0.0        NaN          2.0

My DF has a lot of rows, and I need to change the value of column 2 based on whether the user_id is in a particular set called 'users'.

I am using the following code but it doesn't seem to be working.

My code:

for idx, row in df.iterrows():
  if idx[0] in users:
    row['column 2'] = 0  

When I checked against a particular user_id that exists within the 'users' set, it shows up as 'NaN'. Does this mean the code hasn't worked? I need all values of column 2 to be zero if the user_id exists in the users set.

Thank you in advance.

CodePudding user response:

df.loc[df.index.get_level_values("user_id").isin(users), "column 2"] = 0

You don't need the loop! You can

  • get the user_id level values from the index
  • check which of them are in the predefined "users" set
  • use that boolean mask as the row indexer and the column of interest, "column 2", as the column indexer
    • then .loc will do the setting
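
The steps above can be tried on a small frame. This is only a sketch: the data mirrors the two rows shown in the question, and the contents of users are an assumption made for illustration.

```python
import pandas as pd

# Sample frame mirroring the question's two rows.
index = pd.MultiIndex.from_tuples(
    [(5678, "2022-01-01"), (6253, "2022-01-14")],
    names=["user_id", "date"],
)
df = pd.DataFrame(
    {
        "column 1": [0.0, 0.0],
        "column 2": [1.5, float("nan")],
        "column 3": [0.0, 2.0],
    },
    index=index,
)
users = {6253}  # assumed contents of the 'users' set

# 1. user_id level values, one entry per row
user_ids = df.index.get_level_values("user_id")
# 2. boolean mask: True where the row's user_id is in users
mask = user_ids.isin(users)
# 3. rows picked by the mask, column picked by label, set in one step
df.loc[mask, "column 2"] = 0

print(df)
```

Here only the row for user 6253 has column 2 set to 0; the other row and the other columns are left alone. As for the original iterrows loop, the pandas documentation notes that the rows it yields can be copies, so writing to row may have no effect on df, which would explain the NaN that was still there.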

CodePudding user response:

Here's how I solved this:

for user in users:
    if user in df.index.get_level_values(level='user_id'):
        # one .loc call on df; chained df['column 2'].loc[...] may not write back to df
        df.loc[(user, slice(None)), 'column 2'] = 0

The loop checks every user in users. If the user appears in the user_id level of the DataFrame's index, it sets column 2 to 0 for all of that user's rows. (loc works here)

This might also work, selecting the index level by position instead of by name:

for user in users:
    if user in df.index.get_level_values(0):
        # one .loc call on df; chained df['column 2'].loc[...] may not write back to df
        df.loc[(user, slice(None)), 'column 2'] = 0
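
For anyone who wants to run the loop, here is a self-contained sketch. The sample data and the contents of users are assumptions (9999 is a made-up id that is not in the frame). It assigns through a single .loc call on df rather than df['column 2'].loc[...], because chained assignment can end up writing to a temporary object, especially with pandas Copy-on-Write enabled.

```python
import pandas as pd

# Sample frame with the question's index layout (values assumed).
index = pd.MultiIndex.from_tuples(
    [(5678, "2022-01-01"), (6253, "2022-01-14")],
    names=["user_id", "date"],
)
df = pd.DataFrame({"column 2": [1.5, float("nan")]}, index=index)
users = {6253, 9999}  # 9999 is hypothetical and absent from df

user_ids = df.index.get_level_values("user_id")
for user in users:
    # skip ids that are not in the index, otherwise .loc raises KeyError
    if user in user_ids:
        # (user, slice(None)) selects every date for this user_id
        df.loc[(user, slice(None)), "column 2"] = 0

print(df)
```

With many users this does one assignment per user, so on a large DataFrame a single boolean mask over the whole index is likely to be faster.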