My current DF looks like:
                    column 1  column 2  column 3
user_id date
5678    2022-01-01       0.0       1.5       0.0
6253    2022-01-14       0.0       NaN       2.0
My DF has a lot of rows, and I need to change the value of column 2 based on whether the user_id is in a particular set called 'users'.
I am using the following code but it doesn't seem to be working.
My code:
for idx, row in df.iterrows():
    if idx[0] in users:
        row['column 2'] = 0
When I checked against a particular user_id that exists within the 'users' set, it shows up as 'NaN'. Does this mean the code hasn't worked? I need all values of column 2 to be zero if the user_id exists in the users set.
Thank you in advance.
CodePudding user response:
df.loc[df.index.get_level_values("user_id").isin(users), "column 2"] = 0
You don't need the loop! You can
- get hold of the user_id level values in the index
- check which of them are in the predefined "users" set
- use that boolean mask as the row indexer and the column of interest, "column 2", as the column indexer
- then .loc will do the setting
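To make those steps concrete, here is a minimal, self-contained sketch. The sample rows mirror the question, but the third row (user 7001) and the contents of `users` are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Sample frame shaped like the one in the question.
# The third row (user 7001) is an illustrative addition, not from the question.
index = pd.MultiIndex.from_tuples(
    [(5678, "2022-01-01"), (6253, "2022-01-14"), (7001, "2022-02-03")],
    names=["user_id", "date"],
)
df = pd.DataFrame(
    {
        "column 1": [0.0, 0.0, 1.0],
        "column 2": [1.5, np.nan, 3.0],
        "column 3": [0.0, 2.0, 0.5],
    },
    index=index,
)
users = {5678, 6253}  # hypothetical set of user ids

# Boolean mask: True for every row whose user_id level is in `users`.
mask = df.index.get_level_values("user_id").isin(users)

# One .loc call: rows selected by the mask, column selected by its label.
df.loc[mask, "column 2"] = 0

print(df)
```

As for why the original loop appeared to do nothing: iterrows() builds a new Series for each row, and that Series is typically a copy, so assigning to row['column 2'] changes the copy rather than df. Assigning through df.loc writes to the DataFrame itself.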
CodePudding user response:
Here's how I solved this:
for user in users:
    if user in df.index.get_level_values(level='user_id'):
        df['column 2'].loc[user,:] = 0
The loop checks every user. If the user appears in that level of the DataFrame's index, it sets column 2 to 0 for all of that user's rows. (loc works here)
This might also work:
for user in users:
    if user in df.index.get_level_values(0):
        df['column 2'].loc[user,:] = 0
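One caveat with the loops above: df['column 2'].loc[...] = 0 is chained assignment, which the pandas docs advise against, and with Copy-on-Write enabled it does not update df at all. A possible variant that keeps the loop but assigns through a single .loc call on df is sketched below. The sample data and the ids in `users` are assumptions for illustration (9999 is deliberately absent from the index).

```python
import pandas as pd

# Illustrative data; the ids and values are assumptions, not from the question.
index = pd.MultiIndex.from_tuples(
    [(5678, "2022-01-01"), (6253, "2022-01-14"), (7001, "2022-02-03")],
    names=["user_id", "date"],
)
df = pd.DataFrame(
    {"column 1": [0.0, 0.0, 1.0], "column 2": [1.5, 2.5, 3.0]},
    index=index,
)
users = {6253, 9999}  # 9999 does not appear in the index

# Look up the user_id level once instead of on every iteration.
user_ids = df.index.get_level_values("user_id")

for user in users:
    if user in user_ids:
        # (user, slice(None)) selects every date for this user;
        # row and column are both given to one .loc call on df itself.
        df.loc[(user, slice(None)), "column 2"] = 0

print(df)
```

For a large DataFrame the boolean-mask approach from the other answer is likely to be faster, since it avoids a Python-level loop.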