Why does loc[[]] not work for a single column?

Time:01-20

In the official pandas documentation for the loc indexer, it is written that using double brackets, loc[[]], returns a DataFrame.

Single tuple. Note using [[]] returns a DataFrame.

In the last case, it seems that using double brackets on a single column gives a syntax error. I do not understand this, as using a single bracket worked for a single column, and using double brackets worked for a single row. Can someone explain to me why this happens?

Thanks in advance

CodePudding user response:

Use : to select all rows and a one-element list to get the column as a one-column DataFrame:

print (df.loc[:, ['max_speed']])
            max_speed
cobra               1
viper               4
sidewinder          7

An alternative is:

print (df[['max_speed']])
            max_speed
cobra               1
viper               4
sidewinder          7
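
As a quick check that the two spellings above select the same thing, here is a minimal sketch. The frame is an assumed reconstruction: the thread only shows the max_speed column, so the shield column and its values are borrowed from the example in the pandas documentation for loc.

```python
import pandas as pd

# Assumed frame: only max_speed appears in the thread; shield is illustrative.
df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)

via_loc = df.loc[:, ["max_speed"]]   # all rows, list of one column label
via_brackets = df[["max_speed"]]     # plain indexing with a list of labels

# Both are one-column DataFrames with identical contents.
same = via_loc.equals(via_brackets)
print(same)  # True
```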

CodePudding user response:

In pandas, the .loc[] accessor selects rows and columns of a DataFrame by label rather than by integer position. When .loc[] receives a single argument, that argument is interpreted as row labels, so a list passed on its own, as in df.loc[['max_speed']], is looked up in the index and not in the columns. Because 'max_speed' is a column name and not a row label, the lookup fails with a KeyError. To select a column with .loc[], pass the column label as the second argument, after the row selector.

For example, the following will select a single column 'A' from a DataFrame df:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
# All rows, single column label: returns a Series
print(df.loc[:, 'A'])

This will return:

0    1
1    2
2    3
Name: A, dtype: int64

To select a single column as a Series, you can also use the column name directly as an attribute, without any brackets:

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
print(df.A)

If you want to select a single column but return it as a DataFrame, you can use the [] operator and pass in a list containing the column name:

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
print(df[['A']])

With .loc[], the column label goes in the second position: a plain string there returns a Series, and a one-element list returns a one-column DataFrame, which is the same result that df[['A']] gives.

CodePudding user response:

It appears that you misparsed what .loc[['x']] means. The double brackets are not a special "dataframe operator" ([[ 'x' ]]). Rather, you are passing a list of one or more row labels into single brackets ([ ['x'] ]). If you pass only one argument, it is interpreted as row labels, and the column labels are implicitly set to : (= all of them). If you want to set the column labels to something else, you need to add another argument to .loc[], not to .loc[[]], i.e., .loc[<row_labels>, <col_labels>], where each argument can be either a list or a string.
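
The parsing described above can be sketched as follows. The frame is an assumption modelled on the pandas documentation example (the thread only shows max_speed), and the variable names are purely illustrative.

```python
import pandas as pd

# Assumed frame: shield column and its values are illustrative.
df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)

# One argument, a list: read as row labels, columns default to all.
rows = df.loc[["cobra", "viper"]]

# 'max_speed' is not a row label, so the same form raises KeyError.
try:
    df.loc[["max_speed"]]
    raised = False
except KeyError:
    raised = True

# Second argument: the list is now read as column labels.
col = df.loc[:, ["max_speed"]]

print(rows.shape, raised, col.shape)
```

The failure in the second case happens at lookup time, because the label is searched for in the index.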

As a rule of thumb, the number of lists (including the shortcut :) you pass to loc gives you the number of dimensions of the output: pass two lists and you get a dataframe (2D). (Note that if you only pass a single list and nothing else, it is implicitly [your_list, :] == two lists.) Pass one list and one string and you get a series (1D). Pass two strings and you get a 0D object, or a scalar, i.e., the value in the cell.
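
The rule of thumb can be checked with a short sketch. The frame is again an assumed reconstruction based on the pandas documentation example, and the variable names are illustrative only.

```python
import pandas as pd

# Assumed frame: shield column and its values are illustrative.
df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)

two_lists = df.loc[["cobra"], ["max_speed"]]   # list + list   -> DataFrame (2D)
slice_and_list = df.loc[:, ["max_speed"]]      # ':' + list    -> DataFrame (2D)
list_and_str = df.loc[["cobra"], "max_speed"]  # list + string -> Series (1D)
two_strs = df.loc["cobra", "max_speed"]        # string + string -> scalar (0D)

for result in (two_lists, slice_and_list, list_and_str, two_strs):
    print(type(result).__name__)
```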
