Change values in DataFrame - .iloc vs .loc

Time:12-02

Hey, I wrote this code:

import pandas as pd
d1 = {"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]}
df1 = pd.DataFrame(d1)
df1["value 2"] = "nothing"

d2 = {"KEY": ["KEY2"], "value_alternative": ["D"]}
df2 = pd.DataFrame(d2)

for k in range(3):
    key = df1.iloc[k]["KEY"]
    print(key)
    if key in list(df2["KEY"]):
        df1.iloc[k]["value 2"] = df2.loc[df2["KEY"] == key, "value_alternative"].item()
    else: 
        df1.iloc[k]["value 2"] = df1.iloc[k]["value"]

but unfortunately the values in df1["value 2"] haven't changed :( I rewrote it as follows:

import pandas as pd
d1 = {"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]}
df1 = pd.DataFrame(d1)
df1["value 2"] = "nothing"

d2 = {"KEY": ["KEY2"], "value_alternative": ["D"]}
df2 = pd.DataFrame(d2)

for k in range(3):
    key = df1.iloc[k]["KEY"]
    print(key)
    if key in list(df2["KEY"]):
        df1.loc[k, "value 2"] = df2.loc[df2["KEY"] == key, "value_alternative"].item()
    else: 
        df1.loc[k, "value 2"] = df1.iloc[k]["value"]

and then everything works fine, but I don't understand why the previous method doesn't work. What is the easiest way to change values in a DataFrame in a loop?

CodePudding user response:

First of all, don't use a for loop with DataFrames unless you really have to. Instead, use a boolean mask with loc to select the rows and assign the values in one step. Here, you can do what you want with a simple merge:

df1 = df1.merge(df2, on='KEY', how='left').rename(columns={'value_alternative': 'value 2'})
df1.loc[df1['value 2'].isna(), 'value 2'] = df1['value']

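The first version fails because df1.iloc[k]["value 2"] = ... is chained indexing: df1.iloc[k] first returns a new Series for that row, and the assignment then writes into that temporary object rather than into df1. pandas does not guarantee that such an intermediate result is a view on the original data, so the change can be silently lost. To write into the DataFrame itself, select the row and the column in a single indexing call, as in df1.loc[k, "value 2"] = .... Don't forget that loc and iloc do different things: loc selects by index label, while iloc selects by integer position.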
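A minimal sketch of single-step assignment with both indexers, using the frame from the question. Looking up the column position with df1.columns.get_loc is my own addition for illustration; it is not in the original code.

```python
import pandas as pd

df1 = pd.DataFrame({"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]})
df1["value 2"] = "nothing"

# Label-based: row label 1 and column label "value 2" in ONE indexing call,
# so the assignment goes straight into df1.
df1.loc[1, "value 2"] = "D"

# Position-based: iloc needs integer positions on both axes,
# so the column names are translated to positions first (assumed helper usage).
value_pos = df1.columns.get_loc("value")
value2_pos = df1.columns.get_loc("value 2")
df1.iloc[0, value2_pos] = df1.iloc[0, value_pos]

print(df1)
```

So iloc itself is not the problem; what matters is that the row and column are selected in the same call.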

For the merge approach to work, you also have to delete the

df1["value 2"] = "nothing"

line from your program. Otherwise the rename produces a second column named "value 2".
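Put together, the loop-free version might look like the sketch below. It reuses the data from the question; the result variable name is hypothetical, and fillna is swapped in here as an alternative to the isna/loc assignment shown above.

```python
import pandas as pd

df1 = pd.DataFrame({"KEY": ["KEY1", "KEY2", "KEY3"], "value": ["A", "B", "C"]})
df2 = pd.DataFrame({"KEY": ["KEY2"], "value_alternative": ["D"]})

# A left merge keeps every row of df1; keys missing from df2 get NaN.
result = df1.merge(df2, on="KEY", how="left").rename(
    columns={"value_alternative": "value 2"}
)

# Fall back to the original value wherever no alternative was found.
result["value 2"] = result["value 2"].fillna(result["value"])

print(result)
```

Note that df1 is built without the "value 2" placeholder column, so the rename does not create a duplicate column name.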
