I am trying to include the index column in a dataframe pandas reads from a .csv file, but when filtering with "usecols=" I am getting the following error message if I try to include the index column with the data:
'usecols' must either be list-like of all strings, all unicode, all integers or a callable.
Here is my .csv file data, and though the column has a name "Date" it is unable to see it since it's in the 3rd row:
and this is my code:
df = pd.read_csv('mydata.csv', index_col=0, nrows=0)
use_col = [col_name for col_name in list(df.columns) if col_name.startswith('M6A=F')] #filters columns
dti = [0]
df = pd.read_csv('mydata.csv', index_col=[0], usecols=dti use_col)
CAVEAT: I dont want to read the entire csv file into a dataframe, as this in a function being parsed to json via FastAPI and gets really slow that way, so needs to be filtered before reading the csv file, hence "usecols=" has to be used.
What I have tried so far: I looked at this stackoverflow solution but I think that works because its using integers in both cases, while mine is mixing integer and string, unless I have misunderstood the error.
I also tried pandas concat method reading the csv into two dataframes, once to get the index column and then the data, but couldnt seem to get them to join in a way that the index lined up with the data, and it felt like I was creating additional room for error in the output.
There is a suggestion here which I will try if I cannot get a better solution, but it feels like a dubious method and the OP was posted 8 years ago.
CodePudding user response:
For me working callable by function:
f = lambda x: 'M6A=F' in x or 'Unnamed: 0' in x
df = pd.read_csv('hist1day.csv', usecols=f, index_col=[0], skiprows=[2])
print(df)
M6A=F M6A=F.1 M6A=F.2 \
NaN Open High Low
2009-03-20 0.6830999851226807 0.6830999851226807 0.6830999851226807
2009-03-23 0.6995000243186951 0.6995000243186951 0.6995000243186951
2009-03-24 0.6998000144958496 0.7056999802589417 0.6912999749183655
2009-03-25 0.6941999793052673 0.7014999985694885 0.6894999742507935
... ... ...
2022-08-04 0.6949999928474426 0.699400007724762 0.6938999891281128
2022-08-05 0.6972000002861023 0.6978999972343445 0.6873999834060669
2022-08-08 0.6899999976158142 0.7013000249862671 0.6899999976158142
2022-08-09 0.6987000107765198 0.6998000144958496 0.6956999897956848
2022-08-10 0.7081999778747559 0.7091000080108643 0.7080000042915344
M6A=F.3 M6A=F.4 M6A=F.5
NaN Close Adj Close Volume
2009-03-20 0.6830999851226807 0.6830999851226807 0.0
2009-03-23 0.6995000243186951 0.6995000243186951 392.0
2009-03-24 0.6963000297546387 0.6963000297546387 588.0
2009-03-25 0.6901999711990356 0.6901999711990356 616.0
... ... ...
2022-08-04 0.6985999941825867 0.6985999941825867 6604.0
2022-08-05 0.691100001335144 0.691100001335144 9935.0
2022-08-08 0.6984000205993652 0.6984000205993652 8102.0
2022-08-09 0.695900022983551 0.695900022983551 8102.0
2022-08-10 0.7085000276565552 0.7085000276565552 305.0
[3373 rows x 6 columns]