I'm working with data from an Excel file that looks like this:
A B
2001-05-01 12:30 10
2001-05-01 12:30 20
2001-05-05 11:50 30
2001-05-05 11:50 40
2002-03-22 14:12 10
I'm using this line of code to eliminate the duplicates, keeping the maximum:
df_clean=df_raw.sort_values('A', ascending=False).drop_duplicates('B').sort_index()
but I get this error:
Index(['B'], dtype='object')
I don't know what the problem could be, since I run this line right after loading the file.
CodePudding user response:
Assuming your index is just a default RangeIndex,
I think what you are looking for is:
df_clean=df_raw.sort_values('A', ascending=False).drop_duplicates('B', ignore_index=True)
rather than a trailing sort_index() call.
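
To make the difference between the two calls concrete, here is a small sketch on data shaped like the sample in the question. The frame is built by hand (an assumption, since the real data comes from the Excel file), and column A is kept as plain strings for simplicity. It suggests that sort_index() restores the original row order and keeps the original labels, while ignore_index=True leaves the rows in sorted order and simply renumbers them from 0.

```python
import pandas as pd

# Hand-built stand-in for the Excel data in the question (assumption:
# the real frame has the same two columns, "A" and "B").
df_raw = pd.DataFrame({
    "A": ["2001-05-01 12:30", "2001-05-01 12:30", "2001-05-05 11:50",
          "2001-05-05 11:50", "2002-03-22 14:12"],
    "B": [10, 20, 30, 40, 10],
})

# Original approach: dedupe on B, then restore the original row order.
# The surviving rows keep their original index labels.
by_label = (
    df_raw.sort_values("A", ascending=False)
          .drop_duplicates("B")
          .sort_index()
)

# Approach from this answer: rows stay in sorted (descending A) order
# and are relabelled 0..n-1.
renumbered = (
    df_raw.sort_values("A", ascending=False)
          .drop_duplicates("B", ignore_index=True)
)

print(by_label)
print(renumbered)
```

On this sample, deduplicating on B drops the first row (its B value, 10, also appears in the latest row) rather than one row from each pair of equal timestamps. If the goal is one row per timestamp with the largest B, sorting by B and deduplicating on A would be the analogous call, though that is a guess at the intent.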
CodePudding user response:
It looks like your second column name contains leading whitespace before the "B", something like:
" B"
Try:
df_raw.columns = ["A","B"]
before running your statement.
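
A quick way to check this guess is to print the column labels as a list, which makes stray whitespace visible. The sketch below uses a hypothetical hand-built frame with a " B" header (an assumption standing in for the Excel data) and strips whitespace from every header instead of hard-coding the names, which may be more convenient for files with many columns.

```python
import pandas as pd

# Hypothetical frame whose second header has a leading space, standing in
# for what pd.read_excel might return if the header cell was typed as " B".
df_raw = pd.DataFrame({
    "A": ["2001-05-01 12:30", "2002-03-22 14:12"],
    " B": [10, 20],
})

# Listing the labels shows the quotes around each name, so a leading
# or trailing space is easy to spot.
print(df_raw.columns.tolist())

# With the stray space in place, asking for column "B" fails with KeyError.
try:
    df_raw.drop_duplicates("B")
    raised = False
except KeyError:
    raised = True

# Strip surrounding whitespace from every column label.
df_fixed = df_raw.rename(columns=str.strip)

# The original statement from the question now finds column "B".
df_clean = (
    df_fixed.sort_values("A", ascending=False)
            .drop_duplicates("B")
            .sort_index()
)
```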