Pandas can't select index range as string date

Time:04-16

csv.index.values contains the data below, and selecting a date range with .loc raises a KeyError:

[263 rows x 13 columns]>
['31/01/1997' '28/02/1997' '31/03/1997' '30/04/1997' '31/05/1997'...
csv.loc['01/01/2009':'31/12/2015']
    raise KeyError(key) from err
KeyError: '01/01/2009'

Why can't I select a range?

CodePudding user response:

.loc selects rows by label, so the labels in your slice have to match the type of the values stored in the index.

In your case csv.index.values returns an array of strings, so .loc treats '01/01/2009' as plain text rather than as a date, and that exact string is not in the index. From the pandas.DataFrame.loc documentation: "Access a group of rows and columns by label(s) or a boolean array."

I would recommend that you first load your CSV file into a DataFrame, convert the date column to datetime, set it as the index (watch out for inplace=True, or assign the result back to a variable) and finally apply .loc.

Here is an example:

import numpy as np
import pandas as pd

# creating some data
new_list = np.arange(20)  # day offsets 0..19
BlackHoles = np.random.randint(0, 2, 20)

# converting data to dataframe
df = pd.DataFrame({'date': new_list, 'BlackHoles': BlackHoles})
# converting the day offsets to Timestamps, counted from 2022-04-15
df['date'] = pd.to_datetime(df['date'], unit='D', origin='2022-04-15')

# setting the desired column as index (this point is missing in your code)
df_test = df.set_index('date')

df_test.index.values  # returns array([...], dtype='datetime64[ns]')
df_test.index  # shows the labels that can be used with .loc
df_test.loc['2022-04-16':'2022-04-25']
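The same steps applied to a CSV file might look like the sketch below. The file contents and the column names date and value are made up for illustration, and the file is simulated with io.StringIO so the snippet runs on its own. It assumes the dates are stored day-first, as in the question.

```python
import io

import pandas as pd

# Hypothetical CSV contents; the column names are assumptions.
raw = io.StringIO(
    "date,value\n"
    "31/12/2008,1\n"
    "31/01/2009,2\n"
    "28/02/2009,3\n"
    "31/01/2016,4\n"
)

frame = pd.read_csv(raw)

# Parse the day-first strings explicitly so 31/01/2009 is read as 31 January.
frame["date"] = pd.to_datetime(frame["date"], format="%d/%m/%Y")

# set_index returns a new DataFrame, so the result is assigned back.
frame = frame.set_index("date").sort_index()

# Label slicing on a DatetimeIndex includes both endpoints.
selected = frame.loc["2009-01-01":"2015-12-31"]
print(selected)
```

Passing an explicit format avoids any ambiguity between day-first and month-first dates.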

Be advised: since the index now holds Timestamps, check it for duplicate dates and keep it sorted, or label slicing can fail with errors about non-unique labels. If you do have duplicates, I suggest aggregating them with groupby on your date column before setting it as the index.
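A minimal sketch of the groupby idea, assuming that summing the values for a repeated date is acceptable for your data. The dates and counts here are invented for illustration.

```python
import pandas as pd

# Made-up data in which 2022-04-16 appears twice.
dates = pd.to_datetime(["2022-04-16", "2022-04-16", "2022-04-17", "2022-04-18"])
dup_df = pd.DataFrame({"date": dates, "BlackHoles": [1, 0, 1, 1]})

# Count rows whose date already appeared earlier in the column.
n_duplicates = int(dup_df["date"].duplicated().sum())

# Collapse repeated dates into one row each; sum() is an assumed choice,
# mean() or first() may suit other data better.
daily = dup_df.groupby("date")["BlackHoles"].sum().to_frame()

# The grouped result has a unique, sorted DatetimeIndex, so slicing is safe.
window = daily.loc["2022-04-16":"2022-04-17"]
print(n_duplicates)
print(window)
```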

I hope I could assist on this one.

CodePudding user response:

The index dates were strings instead of datetime objects, so they needed to be converted:

csv.index = pd.to_datetime(csv.index, format='%d/%m/%Y')
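To see the difference, here is a small self-contained sketch with an invented four-row frame standing in for csv. It assumes the index strings are day-first (dd/mm/yyyy), as in the question.

```python
import pandas as pd

# Stand-in for the question's DataFrame: the index holds date strings.
csv = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=["31/12/2008", "31/01/2009", "28/02/2009", "31/01/2016"],
)

# On an unsorted string index, a slice bound that is not an existing label
# is expected to raise KeyError, as in the question.
try:
    csv.loc["01/01/2009":"31/12/2015"]
    failed = False
except KeyError:
    failed = True
print("string slice failed:", failed)

# Convert the index to datetimes, then sort so range slicing is well defined.
csv.index = pd.to_datetime(csv.index, format="%d/%m/%Y")
csv = csv.sort_index()

# With a DatetimeIndex the bounds do not have to exist as labels.
subset = csv.loc["2009-01-01":"2015-12-31"]
print(subset)
```

The slice bounds are written in ISO format (year-month-day), which pandas parses unambiguously.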