how to choose columns when calculating mean

Time:04-25

Hi, I'm a student learning Python.

What's the difference between

df.1.mean()
df[1].mean()

?

The full code is:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))
df[1].mean()

I'm confused because I have used the first form before to select a column in a different DataFrame.

CodePudding user response:

For columns whose names are numbers, writing df.1 raises a SyntaxError.

Attribute access does work, however, for column names that are strings and valid Python identifiers.

# create a new column with a string name (the scalar is broadcast to every row)
df['new_col'] = np.random.randn()
# get the mean
df.new_col.mean()
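
To see that the two forms select the same column when the name is a string, here is a minimal sketch. The column name `score` and its values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))   # columns are the integers 0..3
df["score"] = np.arange(10.0)               # hypothetical string-named column

by_attr = df.score.mean()     # attribute access
by_key = df["score"].mean()   # bracket access, same column
int_mean = df[1].mean()       # integer label: brackets are the only option
```

Bracket access works for every column label, so it is the safer default when column names are not known in advance.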

CodePudding user response:

If the column name is a string such as "one", then df.one works, because pandas exposes the column as the attribute "one" of df. Attribute syntax does not work for pure integer names, however: those columns can only be selected with square brackets, as in df[1], where they are handled correctly.

df = pd.DataFrame({1: [2, 3], 'one': [3, 5]})
df.one    # works
# df.1    # SyntaxError
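
The error for df.1 comes from the Python parser, not from pandas. The sketch below checks this by passing the expression to `compile()` as a string, which is just one way to observe the error without stopping the script.

```python
import pandas as pd

df = pd.DataFrame({1: [2, 3], "one": [3, 5]})

# "df.1" is rejected when the source is parsed, before pandas is involved.
try:
    compile("df.1", "<example>", "eval")
    parsed = True
except SyntaxError:
    parsed = False

int_col_mean = df[1].mean()    # integer label via brackets -> 2.5
str_col_mean = df.one.mean()   # string label via attribute -> 4.0
```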