I want to remove certain columns from a DataFrame, and typing out all the column names is tedious. I would like to refer to the columns by index number instead of by name when removing them. The following code prints all the header names, but how can I get the index numbers as well?
import pandas as pd
df=pd.read_csv("https://raw.githubusercontent.com/codebasics/py/master/ML/14_naive_bayes/titanic.csv")
print(df.columns)
Expected Output:
'PassengerId', 0
'Name', 1
'Pclass', 2
'Sex', 3
'Age', 4
'SibSp', 5
'Parch', 6
'Ticket', 7
'Fare', 8
'Cabin', 9
'Embarked', 10
'Survived', 11
CodePudding user response:
Use pd.Series with the column names:
s = pd.Series(df.columns)
print (s)
0 PassengerId
1 Name
2 Pclass
3 Sex
4 Age
5 SibSp
6 Parch
7 Ticket
8 Fare
9 Cabin
10 Embarked
11 Survived
dtype: object
As @timgeb mentioned (thank you), it is possible to select by position instead of by column name:
df['Name']
df.iloc[:, 1]
If you need to remove the first 2 columns, use:
df = df.iloc[:, 2:]
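If the columns to remove are not a leading block, one option is to turn the positions into names with df.columns and pass those to df.drop. This is only a minimal sketch: the small frame below stands in for the Titanic data, and its values and the positions in to_drop are made up for illustration.

```python
import pandas as pd

# Stand-in for the Titanic frame; the values are made up for illustration.
df = pd.DataFrame({
    "PassengerId": [1, 2],
    "Name": ["A", "B"],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
    "Age": [22.0, 38.0],
})

# Positions of the columns to remove (hypothetical choice).
to_drop = [0, 3]

# df.columns[to_drop] maps positions to names, which drop() accepts.
trimmed = df.drop(columns=df.columns[to_drop])
print(list(trimmed.columns))
```

Here drop returns a new frame, so the original df is left unchanged unless you assign the result back to it.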
CodePudding user response:
To print both the index number and the column name, you can use enumerate:
>>> list(enumerate(df.columns))
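To get output shaped like the one in the question, the pairs from enumerate can be formatted one per line. A small sketch, assuming an empty stand-in frame that carries only the first few column names from the question:

```python
import pandas as pd

# Empty stand-in frame with some of the column names from the question.
df = pd.DataFrame(columns=["PassengerId", "Name", "Pclass", "Sex"])

# enumerate yields (position, name) pairs, starting at 0.
pairs = list(enumerate(df.columns))

# Format each pair as 'name', position to match the requested layout.
lines = [f"'{name}', {position}" for position, name in pairs]
print("\n".join(lines))
```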