I am trying to remove specific rows from a dataset and then find the average of a specific column, without changing the original dataset.
import pandas as PD
import numpy as np
df = PD.read_csv(r"C:\Users\User\Downloads\nba.CSV")
NBA = PD.read_csv(r"C:\Users\User\Downloads\nba.CSV")
NBA.drop([25,72,63],axis=0)
I need to find the average of a specific column, such as "Age".
However, this isn't working: Nba.drop([25,72,63],axis=0),['Age'].mean()
Neither the query command nor the .loc command works either.
CodePudding user response:
Your code to drop the rows is correct.
NBA_clean = NBA.drop([25,72,63],axis=0)
will give you a new dataframe with some rows removed.
To find the average of a specific column, you can use index notation, which returns a Series containing that specific column:
NBA_Age = NBA_clean["Age"]
Finally, to get the mean, call the mean() method:
NBA_mean_age = NBA_Age.mean()
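As a minimal sketch of the steps above, the following uses a small made-up DataFrame in place of the NBA file. The column values and the dropped labels 1 and 3 are assumptions for illustration only. It shows that drop returns a new DataFrame and leaves the original untouched:

```python
import pandas as pd

# Hypothetical stand-in for the NBA data; the real file is not reproduced here.
NBA = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Age": [20, 25, 30, 35, 40],
})

# drop returns a new DataFrame; NBA itself is not modified.
NBA_clean = NBA.drop([1, 3], axis=0)

# Select the column (a Series), then take its mean.
NBA_mean_age = NBA_clean["Age"].mean()

print(len(NBA))       # the original still has all 5 rows
print(NBA_mean_age)   # mean of the remaining ages 20, 30 and 40
```

Because the result of drop is stored under a new name, NBA can still be used afterwards with all of its rows.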
It is not clear what the specific mistake in your code is, but here are two possibilities:
- You are not saving the result of
NBA.drop([25,72,63],axis=0)
into a variable. This operation is not done in place; if you want to do it in place, you must pass the inplace=True argument:
NBA.drop([25,72,63], axis=0, inplace=True)
- There is an unnecessary comma in
Nba.drop([25,72,63],axis=0),['Age'].mean()
Remove it to get the correct syntax:
Nba.drop([25,72,63],axis=0)['Age'].mean()
I suspect the error message obtained when running this code would have hinted at the unnecessary comma.
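To illustrate the second possibility, here is a small sketch with a made-up DataFrame (the data and the dropped labels are assumptions). With the comma, Python builds a tuple whose second element is ['Age'].mean(), so the call fails on the plain list rather than on anything in pandas:

```python
import pandas as pd

# Hypothetical data standing in for the NBA file.
NBA = pd.DataFrame({"Age": [20, 25, 30, 35, 40]})

# With the stray comma: the expression is a tuple (DataFrame, ['Age'].mean()),
# and a plain Python list has no mean() method.
try:
    result = NBA.drop([1, 3], axis=0), ['Age'].mean()
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__
    print(exc)

# Without the comma: ['Age'] indexes the DataFrame returned by drop.
mean_age = NBA.drop([1, 3], axis=0)['Age'].mean()
print(mean_age)
```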
CodePudding user response:
Can you try this? I think there was a typo in your code:
Nba.drop([25,72,63],axis=0)['Age'].mean()