USECOLS in pandas.read_excel is sorting my list in ascending order of column value.
I have a list.
This list is not constant and will vary across the code.
indexList = [3,6,5,18,19]
Once I pass this list to pandas.read_excel, USECOLS is sorting my output dataframe in ascending order : [3,5,6,18,19]
I need this in same order as I passed it, ie, [3,6,5,18,19]
Indexlist = [3,6,5,18,19]
#inside a function to which Indexlist is passed as a parameter
excel_data_df = pandas.read_excel('MasterFiltered.xlsx',
sheet_name='Sheet1',
usecols=indexList, #indexList is a variable List
dtype={'school: str,
'class': str,
'name': str,
'subject': str,
'school': str}) # datatype -- line X
df_filtered = excel_data_df.loc[excel_data_df["cityname"]==i_city_name]# Row Filtering
print(df_filtered) # Output is sorted excel ascending order of columns - [3,5,6,18,19]
Note: On looking up in Stack the answers mentions to use column names along with column index but that is not possible in my case, as columns will keep varying.
A very similar question/answer I found is : Keeping columns in the specified order when using UseCols in Pandas Read_CSV
But when I give [indexList] at end of the line X similar to as mentioned in above answer I get errors : raise KeyError(f"None of [{key}] are in the [{axis_name}]")
Kindly help
CodePudding user response:
Once you have read the file, use np.argsort
to reorder your columns:
import pandas as pd
import numpy as np
Indexlist = [3, 6, 5, 18, 19]
df = pd.read_excel('data.xlsx', usecols=Indexlist)
print(df)
out = df.iloc[:, np.argsort(Indexlist)]
print(out)
# Output of df:
D F G S T
0 3 5 6 18 19
# Output of out:
D G F S T
0 3 6 5 18 19
CodePudding user response:
Try this:
Indexlist = [3,2]
col_names=pd.read_excel('data.xlsx', nrows=1).columns
df = pd.read_excel('data.xlsx', usecols=Indexlist)
df = df[col_names[Indexlist]]