How to drop column where you don't know the name of the column?

Time:11-24

I'm a beginner and I'm wondering about this.

For example I have this code:

df = example.get_data

And I only know that the header will be of numpy.datetime64 type. How can I keep only the last 2 years of data without knowing anything more about it?

I tried something like this:

df.drop(df.columns.year >= date.today().year-2, axis=1, inplace = True)

But it's not working. Any suggestions?

CodePudding user response:

If your column names are, e.g., day-first date strings such as '12/02/2021', '14/01/2021', '19/08/2019', you can select all columns from the last two years like this:

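import pandas as pd
from pandas.tseries.offsets import DateOffset

# Compute the cutoff once; dayfirst=True only matters for day-first strings like '12/02/2021'
cutoff = pd.Timestamp.today() - DateOffset(years=2)
last_2_years = [c for c in df.columns if pd.to_datetime(c, dayfirst=True) > cutoff]
df = df[last_2_years]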
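
If the column labels really are numpy.datetime64 values, as the question says, the same selection can be done without a Python loop by comparing the whole column index against the cutoff. This is only a sketch: the helper name keep_recent_columns, its now parameter (added so the result does not depend on today's date), and the toy data are made up for illustration.

```python
import pandas as pd

def keep_recent_columns(df, years=2, now=None):
    """Return df restricted to columns dated after (now - years)."""
    now = pd.Timestamp.today() if now is None else pd.Timestamp(now)
    cutoff = now - pd.DateOffset(years=years)
    # Convert the labels to a DatetimeIndex (no-op if they already are one).
    col_dates = pd.to_datetime(df.columns)
    # Boolean mask over the columns: True means "keep this column".
    return df.loc[:, col_dates > cutoff]

# Toy frame with datetime64 column labels (made-up data).
demo = pd.DataFrame(
    [[1, 2, 3]],
    columns=pd.to_datetime(["2019-08-19", "2021-01-14", "2021-02-12"]),
)
recent = keep_recent_columns(demo, now="2021-11-24")
```

With now fixed at 2021-11-24, the cutoff is 2019-11-24, so only the two 2021 columns should remain in recent.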

It's usually easier to select the columns you want to keep than to drop the columns you don't need, but you can of course also do this:

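# Assumes pandas is imported as pd and DateOffset is imported as above
cutoff = pd.Timestamp.today() - DateOffset(years=2)
# <= so that every column is either kept (> cutoff) or dropped
cols_to_drop = [c for c in df.columns if pd.to_datetime(c, dayfirst=True) <= cutoff]
df = df.drop(columns=cols_to_drop)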
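
As for why the attempt in the question fails: besides the missing closing parenthesis, df.drop expects column labels rather than a True/False mask, and the condition as written marks the recent columns (the ones to keep) instead of the old ones. A minimal sketch of a drop-based version for datetime64 column labels follows; the helper name drop_old_columns, its now parameter, and the toy data are hypothetical.

```python
import pandas as pd

def drop_old_columns(df, years=2, now=None):
    """Return df without the columns dated at or before (now - years)."""
    now = pd.Timestamp.today() if now is None else pd.Timestamp(now)
    cutoff = now - pd.DateOffset(years=years)
    col_dates = pd.to_datetime(df.columns)
    # Turn the boolean mask into actual labels, which is what drop() wants.
    old_labels = df.columns[col_dates <= cutoff]
    return df.drop(columns=old_labels)

# Toy frame with datetime64 column labels (made-up data).
sample = pd.DataFrame(
    [[10, 20, 30]],
    columns=pd.to_datetime(["2018-03-01", "2019-08-19", "2021-02-12"]),
)
trimmed = drop_old_columns(sample, now="2021-11-24")
```

Here the two columns dated before 2019-11-24 should be dropped, leaving only the 2021-02-12 column in trimmed.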