Understanding the loc command

Time:08-31

I have the following function, which I do not completely understand:

def func1(df):
    df.loc[df['First Name'].isin(['n.a.', 'null']),
            df.columns.drop(["Last Name", "Middle Name", "First Name"])] = "n.a."

    return df

I created the following dataframe to test what it returns:

    import pandas as pd

    df1 = pd.DataFrame({"First Name": ['Alex', 'n.a.', 'null'],
                        "Last Name": ['Peterson', 'Doe', 8],
                        "Middle Name": ['John', 'Jack', 3],
                        "Pet": [2, 9, 3]})

If I understood correctly, it checks whether the value in the 'First Name' column is 'n.a.' or 'null', and then it drops all the columns other than "Last Name", "Middle Name", and "First Name"? But what does the assignment = "n.a." at the end do? When I run the function on the dataframe above, it basically returns the same dataframe without changing anything. For this reason, I tried splitting the function up to check each part separately:

def func2(df):
    df.loc[df['First Name'].isin(['n.a.', 'null'])] = "n.a."
    return df

I tried it with the same dataframe and noticed that, for the rows that have n.a. or null in First Name, it turns all the other elements into n.a. Why does my dataframe not change the same way with func1?

CodePudding user response:

First, the provided DataFrame is invalid: all lists must have the same length.

Let us use:

df1 = pd.DataFrame({"First Name": ['Alex', 'n.a.', 'null'],
                    "Last Name": ['Peterson', 'Doe', 8],
                    "Middle Name": ['John', 'Jack', 3],
                    "Pet": [2, 9, 3]}) # removed 2 values

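Now, the function selects all rows in which "First Name" is either 'n.a.' or 'null', and all columns other than "Last Name", "Middle Name", and "First Name". Note that df.columns.drop(...) only builds a list of column labels without the given names; it does not remove any columns from the DataFrame. This leaves us with rows 1 and 2 and the column "Pet". The assignment sets those cells to "n.a.", and indeed the output is: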

  First Name Last Name Middle Name   Pet
0       Alex  Peterson        John     2
1       n.a.       Doe        Jack  n.a. # this value was updated
2       null         8           3  n.a. # this value was updated
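
To see why func1 and func2 behave differently, it may help to look at the two selectors passed to .loc separately. The following is only a minimal sketch: the names row_mask, target_cols, out1 and out2 are made up for illustration and are not part of the original code. The astype(object) call is also an addition here, used to get an independent copy and to avoid the dtype warning that recent pandas versions may emit when a string is written into an integer column.

```python
import pandas as pd

df1 = pd.DataFrame({"First Name": ['Alex', 'n.a.', 'null'],
                    "Last Name": ['Peterson', 'Doe', 8],
                    "Middle Name": ['John', 'Jack', 3],
                    "Pet": [2, 9, 3]})

# Row selector: a boolean mask, True where "First Name" is 'n.a.' or 'null'.
row_mask = df1['First Name'].isin(['n.a.', 'null'])
print(row_mask.tolist())   # [False, True, True]

# Column selector: Index.drop returns the labels minus the listed ones.
# Nothing is removed from df1 itself.
target_cols = df1.columns.drop(["Last Name", "Middle Name", "First Name"])
print(list(target_cols))   # ['Pet']

# Same selection as func1, applied to an object-dtype copy (assumption:
# working on a copy so that df1 stays untouched for the comparison).
out1 = df1.astype(object)
out1.loc[row_mask, target_cols] = "n.a."   # only "Pet" in rows 1 and 2

# Same selection as func2: no column selector, so every column
# of the matching rows is overwritten.
out2 = df1.astype(object)
out2.loc[row_mask] = "n.a."

print(out1)
print(out2)
```

In this sketch only the "Pet" cells of rows 1 and 2 change in out1, while in out2 the whole of rows 1 and 2 becomes "n.a.". Also note that the original func1 and func2 assign through df.loc on the DataFrame they receive, so they modify it in place; calling one after the other on the same df1 means the second call sees the already modified data.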