Python - Writing to a dataframe within a groupby for loop

Time:10-14

I can't work out why the dataframe "newTimeDF" I am adding to is empty at the end of the for loop:

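import pandas as pd

timeZonesDF = pd.DataFrame({"timeZoneDate": pd.to_datetime(["2018-03-11", "2018-11-04"])})
newTimeDF = pd.DataFrame(columns=["startDate", "endDate"])

# Group the rows by calendar year of timeZoneDate ("A" is the year-end alias; recent pandas versions prefer "YE")
for yearRow, yearData in timeZonesDF.groupby(pd.Grouper(key="timeZoneDate", freq="A")):
    DST_start = pd.to_datetime(yearData.iloc[0]["timeZoneDate"])
    DST_end = pd.to_datetime(yearData.iloc[-1]["timeZoneDate"])
    # These two assignments are the ones that leave newTimeDF empty
    newTimeDF["startDate"] = DST_start
    newTimeDF["endDate"] = DST_end
    continue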

Can someone please point out what I am missing? Is there something about groupby for loops that is different?

Answer:

The code you have here:

newTimeDF["startDate"] = DST_start
newTimeDF["endDate"] = DST_end

sets the startDate column to DST_start for all rows and the endDate column to DST_end for all rows. Because your dataframe has no rows when this code runs, there is nothing to fill, so the final dataframe is still empty.
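
To see this in isolation, here is a small sketch of the behaviour described above. The variable names (empty, filled, rows_after_scalar, rows_after_loc) are made up for the illustration, and the .loc line is just one possible way to create a row, not something from the original question.

```python
import pandas as pd

# An empty frame with the two columns from the question.
empty = pd.DataFrame(columns=["startDate", "endDate"])

# A scalar is broadcast over the existing rows. With zero rows there is
# nothing to fill, so the frame stays empty.
empty["startDate"] = pd.Timestamp("2018-03-11")
rows_after_scalar = len(empty)

# Assigning to a row label that does not exist yet creates that row.
filled = pd.DataFrame(columns=["startDate", "endDate"])
filled.loc[0] = [pd.Timestamp("2018-03-11"), pd.Timestamp("2018-11-04")]
rows_after_loc = len(filled)

print(rows_after_scalar, rows_after_loc)
```

The first count stays at zero and the second becomes one, which is the difference between filling a column and adding a row.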

What you could do is create a dictionary from your two values like so:

tempdic = {"startDate" : DST_start, "endDate" : DST_end} 

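Then add that dictionary to your dataframe as a new row. On pandas older than 2.0 this was done with newTimeDF = newTimeDF.append(tempdic, ignore_index=True); note that append returned a new dataframe instead of modifying the original, so the result had to be assigned back. DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions use pd.concat:

newTimeDF = pd.concat([newTimeDF, pd.DataFrame([tempdic])], ignore_index=True)

Making your code look something like this:

for yearRow, yearData in timeZonesDF.groupby(pd.Grouper(key="timeZoneDate", freq="A")):
    DST_start = pd.to_datetime(yearData.iloc[0]["timeZoneDate"])
    DST_end = pd.to_datetime(yearData.iloc[-1]["timeZoneDate"])
    tempdic = {"startDate": DST_start, "endDate": DST_end}
    # Add one row per year; the result must be assigned back to newTimeDF
    newTimeDF = pd.concat([newTimeDF, pd.DataFrame([tempdic])], ignore_index=True)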
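
Adding rows to a dataframe one at a time copies the data on every iteration, so a common alternative is to collect the rows in a plain list and build the dataframe once after the loop. The sketch below assumes timeZoneDate is already a datetime column, and it swaps pd.Grouper for grouping on the year via the .dt accessor, which is a substitution of mine rather than part of the answer above; the rows list is also a name introduced here.

```python
import pandas as pd

timeZonesDF = pd.DataFrame(
    {"timeZoneDate": pd.to_datetime(["2018-03-11", "2018-11-04"])}
)

rows = []  # one dict per year, collected before building the frame

# Group on the calendar year of each date (assumes a datetime column).
for year, yearData in timeZonesDF.groupby(timeZonesDF["timeZoneDate"].dt.year):
    rows.append(
        {
            "startDate": yearData["timeZoneDate"].iloc[0],
            "endDate": yearData["timeZoneDate"].iloc[-1],
        }
    )

# Build the result in one step; columns= keeps the layout even if rows is empty.
newTimeDF = pd.DataFrame(rows, columns=["startDate", "endDate"])
print(newTimeDF)
```

With the two sample dates from the question this gives a single row for 2018, with the first date as startDate and the last as endDate.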