I am trying to recover the original dates (or their integer positions) after grouping a time series with a datetime index by year. Is there a faster way to obtain first_day_indices, without a loop and without an extra column?
import pandas as pd
import numpy as np
import datetime as dt
# Data
T = 1000
base = dt.date.today()
date_list = [base - dt.timedelta(weeks=x) for x in range(T)]
date_list.reverse()
test_data = pd.DataFrame(np.random.randn(T)/100, columns=['Col1'])
test_data.index = pd.to_datetime(date_list)
test_data['date'] = test_data.index
first_days = test_data['date'].groupby(test_data.index.year).first()
first_day_indices = []
for i in first_days:
    first_day_indices.append(np.where(test_data.index == i)[0][0])
print(first_day_indices)
User response:
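You can use pandas.Index.isin to check which values of the index are contained in first_days. Applying that boolean mask to the frame after reset_index() leaves only the matching rows, and their new integer index gives the positions: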
test_data.reset_index()[test_data.index.isin(first_days)].index.tolist()
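If only the positions are needed, one possible alternative that avoids both the loop and the extra date column is np.unique with return_index=True, which reports the position of the first occurrence of each distinct year. This is a minimal sketch, not a benchmarked solution. The data is a stand-in built with pd.date_range and a fixed end date (my assumption, chosen so the example is reproducible), not the exact frame from the question.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's data (assumption): 1000 weekly dates
# ending on a fixed day instead of date.today().
T = 1000
index = pd.date_range(end="2024-01-01", periods=T, freq="7D")
test_data = pd.DataFrame({"Col1": np.random.randn(T) / 100}, index=index)

# Year of every row, as a plain NumPy array.
years = test_data.index.year.to_numpy()

# return_index=True yields the position of the first occurrence of each
# distinct year, i.e. the first row of every year group.
_, first_positions = np.unique(years, return_index=True)
first_day_indices = first_positions.tolist()

# The dates themselves can be recovered from the positions.
first_days = test_data.index[first_day_indices]

print(first_day_indices)
```

Because np.unique works on the year values directly, it does not depend on the dates being unique in the way the isin approach does.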