Hi, I wrote this code:
import pandas as pd
d1 = {"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]}
df1 = pd.DataFrame(d1)
df1["value 2"] = "nothing"
d2 = {"KEY": ["KEY2"], "value_alternative": ["D"]}
df2 = pd.DataFrame(d2)
for k in range(3):
    key = df1.iloc[k]["KEY"]
    print(key)
    if key in list(df2["KEY"]):
        df1.iloc[k]["value 2"] = df2.loc[df2["KEY"] == key, "value_alternative"].item()
    else:
        df1.iloc[k]["value 2"] = df1.iloc[k]["value"]
but unfortunately the values in df1["value 2"] haven't changed :( I rewrote it as follows:
import pandas as pd
d1 = {"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]}
df1 = pd.DataFrame(d1)
df1["value 2"] = "nothing"
d2 = {"KEY": ["KEY2"], "value_alternative": ["D"]}
df2 = pd.DataFrame(d2)
for k in range(3):
    key = df1.iloc[k]["KEY"]
    print(key)
    if key in list(df2["KEY"]):
        df1.loc[k, "value 2"] = df2.loc[df2["KEY"] == key, "value_alternative"].item()
    else:
        df1.loc[k, "value 2"] = df1.iloc[k]["value"]
and then everything works fine, but I don't understand why the previous method doesn't work. What is the easiest way to change a value in a DataFrame in a loop?
CodePudding user response:
First of all, don't use a for loop with DataFrames unless you really, really have to.
Just use a boolean array to filter your DataFrame with loc and assign your values that way.
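A minimal sketch of the boolean-array idea, using the data from the question. The names mask and lookup are just illustrative, and the lookup assumes each KEY appears at most once in df2:

```python
import pandas as pd

df1 = pd.DataFrame({"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]})
df2 = pd.DataFrame({"KEY": ["KEY2"], "value_alternative": ["D"]})

# Default: "value 2" starts as a copy of "value".
df1["value 2"] = df1["value"]

# Boolean array: True for rows of df1 whose KEY also appears in df2.
mask = df1["KEY"].isin(df2["KEY"])

# Map each matching KEY to its alternative (assumes KEY is unique in df2).
lookup = df2.set_index("KEY")["value_alternative"]
df1.loc[mask, "value 2"] = df1.loc[mask, "KEY"].map(lookup)

print(df1)
```

Only the rows selected by mask are overwritten; every other row keeps the default copied from "value".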
You can do what you want with a simple merge.
df1 = df1.merge(df2, on='KEY', how='left').rename(columns={'value_alternative': 'value 2'})
df1.loc[df1['value 2'].isna(), 'value 2'] = df1['value']
The reason the iloc version doesn't work is chained indexing, not iloc itself. df1.iloc[k] first creates a separate row object (here a copy of the row), and ["value 2"] = ... then writes to that temporary object, so df1 never sees the change. df1.loc[k, "value 2"] = ... selects the row and the column in a single call, so pandas writes straight into df1.
Also don't forget that loc and iloc do different things: loc looks at the labels of the index, while iloc looks at integer positions.
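Here is a small sketch of the difference between writing through two indexing steps and writing in one step. The frame and the extra integer column n are made up for illustration; the mixed dtypes are there so that selecting a row has to build a new object. Whether an intermediate selection is a copy or a view can depend on the pandas version and the column dtypes, which is exactly why chained assignment is unreliable:

```python
import pandas as pd

# Hypothetical frame with mixed dtypes (strings and integers).
df = pd.DataFrame({"KEY": ["KEY1", "KEY2"], "n": [1, 2]})
df["value 2"] = "nothing"

# Two steps: df.iloc[0] builds a temporary row, and the write lands on it.
# pandas may emit a warning about chained assignment on this line.
df.iloc[0]["value 2"] = "changed"
after_chained = df.loc[0, "value 2"]
print(after_chained)  # df itself was not modified

# One step by label: row label and column label in the same call.
df.loc[0, "value 2"] = "by label"

# One step by position: iloc needs the column's integer position as well.
col = df.columns.get_loc("value 2")
df.iloc[1, col] = "by position"

print(df)
```

So iloc can be used for assignment too, as long as the row and the column are given in the same pair of brackets.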
For the merge approach to work, you also have to delete the
df1["value 2"] = "nothing"
line from your program; otherwise the rename leaves df1 with two columns named "value 2".
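Put together, the merge version could look roughly like this. It is a sketch rather than the only way to do it: the result is stored in a new name, out, and fillna is swapped in for the loc/isna assignment, which gives the same result for this data:

```python
import pandas as pd

df1 = pd.DataFrame({"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]})
df2 = pd.DataFrame({"KEY": ["KEY2"], "value_alternative": ["D"]})

# A left merge keeps every row of df1; keys missing from df2 get a missing value.
out = df1.merge(df2, on="KEY", how="left").rename(
    columns={"value_alternative": "value 2"}
)

# Fall back to "value" wherever no alternative was found.
out["value 2"] = out["value 2"].fillna(out["value"])

print(out)
```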