I have a CSV file of the following form:
Date | Data 1 | Data 2 | ... | Data n |
---|---|---|---|---|
2010-01-02 | 123 | 222 | ... | 223 |
2010-01-03 | 124 | 232 | ... | 233 |
... | ... | ... | ... | ... |
2021-11-06 | 424 | 332 | ... | 133 |
I want to read into a pandas DataFrame only the rows of this table whose Date is earlier than a given date, say 2010-01-05.
I tried the following code:
import pandas as pd

df = pd.read_csv('test.csv')
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
df.drop(df["Date"] >= "2010-01-05", axis=0, inplace=True)
df.set_index("Date", inplace=True)
The drop call raises a KeyError:
KeyError: '[ True True False ... False False False] not found in axis'
What is the right way to solve this problem?
Answer:
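The drop method expects index or column labels, not a boolean mask over the rows. When you pass the mask, pandas looks for rows labelled True and False, finds none, and raises the KeyError.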
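To make the difference concrete, here is a minimal sketch on a small made-up frame. The three rows and the names mask, labels and kept are assumptions for illustration, not the asker's data. It shows that the mask itself is rejected, while the index of the masked rows is accepted:

```python
import pandas as pd

# Hypothetical three-row frame standing in for test.csv
df = pd.DataFrame({
    "Date": pd.to_datetime(["2010-01-02", "2010-01-03", "2010-01-06"]),
    "Data 1": [123, 124, 125],
})

# Boolean Series with one True/False entry per row
mask = df["Date"] >= "2010-01-05"

try:
    # The True/False values are interpreted as row labels, which do not exist
    df.drop(mask, axis=0)
    raised = False
except KeyError:
    raised = True

# The actual row labels of the rows to remove
labels = df[mask].index
kept = df.drop(labels)
```

After running this, raised should be True on a recent pandas version, and kept holds only the two rows dated before 2010-01-05.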
You can instead keep only the rows that match the condition:
df = pd.read_csv('test.csv', parse_dates=['Date'])
df = df[df['Date'] < "2010-01-05"]
Output:
>>> df
        Date  Data 1  Data 2  Data n
0 2010-01-02     123     222     223
1 2010-01-03     124     232     233
Or, if you prefer to use drop, pass it the index of the rows you want to remove:
df = pd.read_csv('test.csv', parse_dates=['Date'])
df.drop(df[df["Date"] >= "2010-01-05"].index, inplace=True)
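Both variants above load the whole file before filtering. If the file is too large for that, one possible variation (not part of the original answer) is to read it in chunks and filter each chunk as it arrives. The sketch below uses an in-memory CSV with made-up rows and a deliberately tiny chunksize so it can run on its own. The names csv_text, cutoff and df_chunked are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the layout in the question
csv_text = """Date,Data 1,Data 2
2010-01-02,123,222
2010-01-03,124,232
2010-01-05,125,242
2021-11-06,424,332
"""

cutoff = pd.Timestamp("2010-01-05")

# Read two rows at a time; a real file would use a much larger chunksize
reader = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], chunksize=2)

# Keep only the rows before the cutoff from each chunk, then stitch them together
filtered = [chunk[chunk["Date"] < cutoff] for chunk in reader]
df_chunked = pd.concat(filtered)

# Same final step as in the question: use the date as the index
result = df_chunked.set_index("Date")
```

With a real file, replace io.StringIO(csv_text) with the path, for example 'test.csv'. Every row is still parsed, but only the filtered rows are kept in memory.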