I am trying to add headers to a DataFrame without losing the first row.
This is the DataFrame:
01/02/2022 Lorem 369,02
0 01/02/2022 Lorem 374,12
1 01/02/2022 Lorem 1149,49
When I assign to df.columns, the first row is replaced by the new headers and I get this:
Date Description Value
0 01/02/2022 Lorem 374,12
1 01/02/2022 Lorem 1149,49
I also tried using a MultiIndex, but with a MultiIndex I run into trouble when I try to convert a column to a list. It raises the error "AttributeError: 'DataFrame' object has no attribute 'to_list'".
I also tried df.iloc, but I can't get it to work with df.query:
df.query('df.iloc[2] in @list_difference')
This returns an error as well.
To explain what I am trying to do: I want to compare the values of one column across two different DataFrames and create an .xlsx file with the rows whose values are present in the first DataFrame but not in the second. I am doing this by putting the specific columns into lists, iterating through them with a for loop, and then using df.query to filter the rows by the values stored in "list_difference".
CodePudding user response:
As @Ynjxsjmh commented, the best option would be to set header=None when you read in the table with pd.read_csv.
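A minimal sketch of that approach, assuming the file looks like the rows in the question. The semicolon separator and the in-memory `raw` string are assumptions standing in for the real file, and `decimal=","` is only needed if you want "369,02" parsed as a number.

```python
import io
import pandas as pd

# Hypothetical file contents modelled on the rows in the question;
# the ";" separator is an assumption, adjust it to match the real file.
raw = (
    "01/02/2022;Lorem;369,02\n"
    "01/02/2022;Lorem;374,12\n"
    "01/02/2022;Lorem;1149,49\n"
)

# header=None tells pandas the first line is data, not column names,
# so no row is consumed as the header.
df = pd.read_csv(io.StringIO(raw), sep=";", header=None, decimal=",")

# The columns are numbered 0, 1, 2 at this point, so naming them is safe.
df.columns = ["Date", "Description", "Value"]
print(df)
```

With the header row no longer taken from the data, all three rows remain in the DataFrame.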
If you have already read in the table, here is a hacky way to do what you need. It transposes the table twice to make use of reset_index.
import pandas as pd
# create an example df whose header row is really data
df = pd.DataFrame({1:[2,3,4],'A':['B','C','D']})
print(df)
#    1  A
# 0  2  B
# 1  3  C
# 2  4  D
# transpose, reset_index, transpose back, reset_index again
df = df.T.reset_index().T.reset_index(drop=True)
df.columns = ['nums', 'letters']  # rename the columns
print(df)
#   nums letters
# 0    1       A
# 1    2       B
# 2    3       C
# 3    4       D
CodePudding user response:
You can set custom column headers while reading the CSV data. Passing names without a header argument makes pandas treat the first line as data.
df = pd.read_csv(csv_file, names = ['Date', 'Description', 'Value'])
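Once both DataFrames have proper headers, the comparison described in the question may not need lists, a for loop, or df.query at all. The following is only a sketch under assumptions: the names `df_first`, `df_second`, and `only_in_first` and the sample rows are made up for illustration, and "Value" is assumed to be the column being compared.

```python
import pandas as pd

cols = ["Date", "Description", "Value"]

# Hypothetical stand-ins for the two DataFrames from the question
df_first = pd.DataFrame(
    [
        ["01/02/2022", "Lorem", "369,02"],
        ["01/02/2022", "Lorem", "374,12"],
        ["01/02/2022", "Lorem", "1149,49"],
    ],
    columns=cols,
)
df_second = pd.DataFrame([["01/02/2022", "Lorem", "374,12"]], columns=cols)

# Keep the rows of df_first whose Value does not appear in df_second
mask = ~df_first["Value"].isin(df_second["Value"])
only_in_first = df_first[mask]
print(only_in_first)

# Writing the result needs an Excel engine such as openpyxl installed:
# only_in_first.to_excel("difference.xlsx", index=False)
```

Series.isin builds the boolean mask in one step, which replaces both the loop that fills list_difference and the later query.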