I have created a timestamp of 10 days of hourly data and then create 2 random columns with random values.
import pandas as pd
import numpy as np
timestamp = pd.date_range('01/01/2020 00:00','01/10/2020 23:00', freq = '1H')
arr1 = np.random.randint(100,200,(len(timestamp)))
arr2 = np.random.randint(100,200,(len(timestamp)))
the created the dataframe
of these two arrays
with col1
and col2
as columns and with timestamp
as an index.
dic1 = {'col1':arr1,'col2':arr2}
df = pd.DataFrame(dic1, index=timestamp)
After that, I used the pandas groupby
function to group them by days
and I wanted to create the excel output of each group ( 10 excel outputs for 10 days).
df2 = df.groupby(pd.Grouper(freq='D'))
for idx, i in enumerate(df2):
df = pd.Dataframe(i)
df.to_csv(str(idx) '.csv')
but this gives out very strange output, 10 excel files are there but no values in them. Can someone please help? Desired output for day 1:
similarly, for every day this type of excel should be generated.
CodePudding user response:
When you do for idx, i in enumerate(df2)
, each i
is a pair group_key, data
. Try:
for g, data in df.groupby(...):
data.to_csv(str(g) '.csv')
If you really want enumerate:
for idx, (g, data) in enumerate(df.groupby(...)):
data.to_csv(str(idx) '.csv')
CodePudding user response:
If you iterate a .groupby
, you get a 2-tuple where the first element is the group key and the second one the group data, so you need something like:
df2 = df.groupby(pd.Grouper(freq='D'))
for idx, (_, sdf) in enumerate(df2):
sdf.to_csv(str(idx) '.csv')