I'm having a hard time understanding what pandas.DataFrame.loc does in this line of code

Time:03-26

I'm tracing some basic machine learning Python code to learn the basics, and I stumbled upon these two lines:

data.loc[:,'symboling'] = data['symboling'].astype('object')
data.rename(columns={'symboling':'riskScore'},inplace=True)

First of all, "data" is the CSV file after being read with pandas.read_csv, and "symboling" is one of the column labels (the first one).

I understood the second line from the pandas docs, but I don't get what the first line does at all.

CodePudding user response:

The first line converts the symboling column to the object dtype and assigns the result back to that column in the dataframe.

On the left-hand side, data.loc[:,'symboling'] selects all rows (the : part is a slice) and the symboling column.
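To make the selection concrete, here is a minimal sketch. The small dataframe is a hypothetical stand-in for the CSV from the question (the values and the price column are made up), so treat it as an illustration rather than the asker's actual data:

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; the values are made up.
data = pd.DataFrame({"symboling": [3, 1, 2], "price": [13495, 16500, 13950]})

# ":" selects every row, "symboling" selects that one column.
selected = data.loc[:, "symboling"]

# The same column selected with plain square brackets.
same = data["symboling"]

# Both selections hold the same values and index.
print(selected.equals(same))
```

For reading a whole column, the two spellings give the same Series; the difference only matters once you assign to the selection.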

loc is probably being used here to avoid a SettingWithCopyWarning, which might occur (for example, when data is itself a slice of another dataframe) if the author had written:

data['symboling'] = data['symboling'].astype('object')
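As a rough sketch of the overall effect, the snippet below runs the conversion and the rename on a made-up dataframe (hypothetical values, not the asker's CSV) and then inspects the result. It uses the plain-bracket form of the assignment, and checking the dtypes afterwards is a cheap way to confirm the conversion actually took effect in your pandas version:

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; the values are made up.
data = pd.DataFrame({"symboling": [3, 1, 2], "price": [13495, 16500, 13950]})

# Replace the integer column with an object-dtype copy of itself.
data["symboling"] = data["symboling"].astype("object")

# Rename the column, as in the second line of the question.
data = data.rename(columns={"symboling": "riskScore"})

# Inspect the resulting column types.
print(data.dtypes)
```

Converting to object makes pandas treat the risk score as a generic Python object column rather than as a numeric one.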

See also: What is the difference between using loc and using just square brackets to filter for columns in Pandas/Python?
