I have the following dataframe:
0 1 2 ... 630 631 632
0 index MATRICULE ID_UEV ...
1 9936-25-3989-4-000-0000 9936-25-3989-4-000-0000 01045406 ...
2 9739-83-9737-8-001-0302 9739-83-9737-8-001-0302 01038232 ...
3 9754-37-9664-9-000-0000 9754-37-9664-9-000-0000 02004842 ...
4 8134-96-8810-1-000-0000 8134-96-8810-1-000-0000 04007065 ...
How can I remove the first row/index so that index, MATRICULE, ID_UEV, ... become the header, like this?
0 index MATRICULE ID_UEV ...
1 9936-25-3989-4-000-0000 9936-25-3989-4-000-0000 01045406 ...
2 9739-83-9737-8-001-0302 9739-83-9737-8-001-0302 01038232 ...
3 9754-37-9664-9-000-0000 9754-37-9664-9-000-0000 02004842 ...
4 8134-96-8810-1-000-0000 8134-96-8810-1-000-0000 04007065 ...
CodePudding user response:
You can use
df.columns = df.loc[0]
df = df.drop(0)
This sets the columns to the items in the first row, then drops the first row.
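A minimal sketch of those two steps, assuming a tiny made-up frame shaped like the one in the question (three columns instead of 633; the sample values are copied from the question, but the frame itself is only an illustration):

```python
import pandas as pd

# Hypothetical miniature of the frame in the question: the real header
# sits in row 0 and the columns are labelled 0, 1, 2.
df = pd.DataFrame(
    [
        ["index", "MATRICULE", "ID_UEV"],
        ["9936-25-3989-4-000-0000", "9936-25-3989-4-000-0000", "01045406"],
        ["9739-83-9737-8-001-0302", "9739-83-9737-8-001-0302", "01038232"],
    ]
)

df.columns = df.loc[0]  # promote row 0 to the column labels
df = df.drop(0)         # remove the row that is now redundant

print(list(df.columns))  # ['index', 'MATRICULE', 'ID_UEV']
print(list(df.index))    # [1, 2]
```

Note that the remaining rows keep their original labels, so the index now starts at 1.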
CodePudding user response:
If you are reading this data with pd.read_csv or pd.read_excel, both accept a skiprows argument that you can use to skip lines by their line numbers.
import pandas as pd
df = pd.read_csv(
r"your_path",
skiprows=lambda x: x in [0], # Skip the first line
)
print(df)
CodePudding user response:
You can handle this in either of two ways when reading the file:
Take the header from the line at index 1 (the second line of the file):
data = pd.read_csv("file.csv", header = 1)
Skip the first line, so that the next line becomes the header:
data = pd.read_csv("file.csv", skiprows=1)
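To see that both calls give the same result, here is a small sketch that reads from an in-memory string instead of "file.csv". The file contents are an assumption (a junk first line 0,1,2 followed by the real header), and dtype=str is my addition so that the leading zeros in ID_UEV are not lost:

```python
import io

import pandas as pd

# Assumed file layout: a junk first line, then the real header, then data.
raw = (
    "0,1,2\n"
    "index,MATRICULE,ID_UEV\n"
    "9936-25-3989-4-000-0000,9936-25-3989-4-000-0000,01045406\n"
    "9739-83-9737-8-001-0302,9739-83-9737-8-001-0302,01038232\n"
)

# header=1: use the line at index 1 as the header; lines before it are dropped.
by_header = pd.read_csv(io.StringIO(raw), header=1, dtype=str)

# skiprows=1: skip the first line; the next line becomes the header by default.
by_skiprows = pd.read_csv(io.StringIO(raw), skiprows=1, dtype=str)

print(list(by_header.columns))        # ['index', 'MATRICULE', 'ID_UEV']
print(by_header.equals(by_skiprows))  # True
```

Be careful with the difference between skiprows=1 (skip the first one line) and skiprows=[1] (skip only the line at index 1).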
CodePudding user response:
An alternative that returns a new DataFrame instead of modifying df in place:
df.rename(columns=df.iloc[0]).drop(df.index[0])
or, if you also want the index to start again from 0 with no gap:
df.rename(columns=df.iloc[0]).drop(df.index[0]).reset_index(drop=True)
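A short sketch of the chained version, again on a made-up three-column frame (the name result is only for illustration). Since nothing here works in place, you have to assign the result to keep it:

```python
import pandas as pd

# Hypothetical miniature of the frame in the question.
df = pd.DataFrame(
    [
        ["index", "MATRICULE", "ID_UEV"],
        ["9936-25-3989-4-000-0000", "9936-25-3989-4-000-0000", "01045406"],
        ["9739-83-9737-8-001-0302", "9739-83-9737-8-001-0302", "01038232"],
    ]
)

result = (
    df.rename(columns=df.iloc[0])  # map old labels 0, 1, 2 to the values in row 0
    .drop(df.index[0])             # drop that first row
    .reset_index(drop=True)        # renumber the rows from 0
)

print(list(result.columns))  # ['index', 'MATRICULE', 'ID_UEV']
print(list(result.index))    # [0, 1]
print(list(df.columns))      # [0, 1, 2] (the original frame is unchanged)
```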