I can't work out why the dataframe "newTimeDF" I am adding to is empty at the end of the for loop:
timeZonesDF = pd.DataFrame({"timeZoneDate": ["2018-03-11", "2018-11-04"]})
newTimeDF = pd.DataFrame(columns = ["startDate", "endDate"])
for yearRow, yearData in timeZonesDF.groupby(pd.Grouper(freq="A")):
    DST_start = pd.to_datetime(yearData.iloc[0]["timeZoneDate"])
    DST_end = pd.to_datetime(yearData.iloc[-1]["timeZoneDate"])
    newTimeDF["startDate"] = DST_start
    newTimeDF["endDate"] = DST_end
    continue
Can someone please point out what I am missing? Is there something about groupby for-loops that is different?
CodePudding user response:
The code you have here:
newTimeDF["startDate"] = DST_start
newTimeDF["endDate"] = DST_end
sets the startDate column equal to DST_start for all rows and the endDate column equal to DST_end for all rows. Because your dataframe has no rows at the time this code runs, there is nothing to assign to, so the dataframe is still empty at the end of the loop.
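To see this in isolation, here is a minimal sketch (the name emptyDF is just for illustration, it is not from your code). Assigning a scalar to a column of a dataframe with no rows leaves it with no rows:

```python
import pandas as pd

# A dataframe with two columns and zero rows, like newTimeDF in the question
emptyDF = pd.DataFrame(columns=["startDate", "endDate"])

# The scalar is broadcast over the existing rows, and there are none
emptyDF["startDate"] = pd.Timestamp("2018-03-11")

print(len(emptyDF))  # number of rows after the assignment
```

The columns are still there afterwards, but no row has been created.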
What you could do is create a dictionary from your two values like so:
tempdic = {"startDate" : DST_start, "endDate" : DST_end}
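Then append that dictionary to your dataframe to add a row. Note that append returns a new dataframe rather than modifying the original in place, so assign the result back:
newTimeDF = newTimeDF.append(tempdic, ignore_index=True)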
Making your code look something like this:
for yearRow, yearData in timeZonesDF.groupby(pd.Grouper(freq="A")):
    DST_start = pd.to_datetime(yearData.iloc[0]["timeZoneDate"])
    DST_end = pd.to_datetime(yearData.iloc[-1]["timeZoneDate"])
    tempdic = {"startDate" : DST_start, "endDate" : DST_end}
    newTimeDF = newTimeDF.append(tempdic, ignore_index=True)
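One caveat: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A sketch that should work on newer versions is to collect one dictionary per group in a plain list and build the dataframe once at the end. This is only one possible approach. It assumes the dates live in the timeZoneDate column (not in the index), and it groups on the year of that column instead of using pd.Grouper; the list name rows is just for illustration.

```python
import pandas as pd

# Sample data from the question, with the dates converted to datetimes
timeZonesDF = pd.DataFrame({"timeZoneDate": ["2018-03-11", "2018-11-04"]})
timeZonesDF["timeZoneDate"] = pd.to_datetime(timeZonesDF["timeZoneDate"])

rows = []  # one dictionary per year
# Group on the year of the date column (assumption: dates are a column, not the index)
for year, yearData in timeZonesDF.groupby(timeZonesDF["timeZoneDate"].dt.year):
    rows.append({
        "startDate": yearData["timeZoneDate"].iloc[0],   # first date in that year
        "endDate": yearData["timeZoneDate"].iloc[-1],    # last date in that year
    })

# Build the result in one go instead of growing a dataframe row by row
newTimeDF = pd.DataFrame(rows, columns=["startDate", "endDate"])
print(newTimeDF)
```

Building the dataframe once from a list also avoids copying the whole dataframe on every iteration, which is what repeated appending does.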