Can someone please help with the following question? When I use a groupby, I get a time_counts result, but the new column holding the counts has no name. How do I give it a name?
import pandas as pd

def read_data():
    df = pd.read_csv("test.csv", usecols=['time', 'unix_time', 'name'])
    df['time'] = pd.to_datetime(df['time'])
    df['unix_time'] = (df['unix_time']).astype(int)
    time_counts = df.groupby(df['time'].dt.floor('S'))['time'].count()
    print(time_counts)

if __name__ == "__main__":
    read_data()
Output is:
time
2022-12-15 08:00:18 1
2022-12-15 08:07:17 1
2022-12-15 08:12:09 1
2022-12-15 08:12:19 1
2022-12-15 08:13:04 1
Desired output is:
time count
2022-12-15 08:00:18 1
2022-12-15 08:07:17 1
2022-12-15 08:12:09 1
2022-12-15 08:12:19 1
2022-12-15 08:13:04 1
Data in the CSV is:
time unix_time name
2022-12-15 08:00:18.034 1671091218034 apple
2022-12-15 08:07:17.376 1671091637376 apple
2022-12-15 08:12:09.648 1671091929648 apple
2022-12-15 08:12:19.320 1671091939320 apple
2022-12-15 08:13:04.623 1671091984623 apple
CodePudding user response:
You could do it like this. I changed nothing except this line:
time_counts = df.groupby(df['time'].dt.floor('S'))['time'].count().reset_index(name='count')
time count
0 2022-12-15 08:00:18 1
1 2022-12-15 08:07:17 1
2 2022-12-15 08:12:09 1
3 2022-12-15 08:12:19 1
4 2022-12-15 08:13:04 1
Note that the output you created was a pd.Series; with my change, time_counts is a pd.DataFrame.
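To make the Series versus DataFrame point concrete, here is a minimal sketch. It assumes a small inline sample in place of test.csv (the rows are copied from the question), and it uses the lowercase "s" alias because recent pandas versions deprecate the uppercase "S":

```python
import pandas as pd

# Inline sample standing in for test.csv (assumption: same 'time' values as the question)
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2022-12-15 08:00:18.034",
        "2022-12-15 08:07:17.376",
        "2022-12-15 08:12:09.648",
    ])
})

# Without reset_index the result is a Series whose values have no column header
grouped = df.groupby(df["time"].dt.floor("s"))["time"].count()

# reset_index(name=...) moves the index into a column and names the values column
as_frame = grouped.reset_index(name="count")

print(type(grouped).__name__)   # Series
print(type(as_frame).__name__)  # DataFrame
print(list(as_frame.columns))   # ['time', 'count']
```

The name= argument is what labels the counts; a plain reset_index() would try to reuse the Series name, which here is also 'time'.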
CodePudding user response:
Using .agg() with named aggregation, you can aggregate and set the column name in one step:
time_counts = df.groupby(df["time"].dt.floor("S")).agg(count=("time", "count")).reset_index()
print(time_counts)
Output:
time count
0 2022-12-15 08:00:18 1
1 2022-12-15 08:07:17 1
2 2022-12-15 08:12:09 1
3 2022-12-15 08:12:19 1
4 2022-12-15 08:13:04 1
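As a quick check that the .agg() form and the reset_index(name=...) form give the same table, here is a small sketch. The sample rows are hypothetical (two of them fall in the same second so that one count is greater than 1), and the lowercase "s" alias is used because newer pandas versions deprecate "S":

```python
import pandas as pd

# Hypothetical sample: the first two timestamps share the same second
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2022-12-15 08:00:18.034",
        "2022-12-15 08:00:18.900",
        "2022-12-15 08:07:17.376",
    ])
})

key = df["time"].dt.floor("s")

# Approach 1: count, then name the values column while resetting the index
via_reset = df.groupby(key)["time"].count().reset_index(name="count")

# Approach 2: named aggregation, output column name given as the keyword
via_agg = df.groupby(key).agg(count=("time", "count")).reset_index()

print(via_agg)
print(via_reset["count"].tolist() == via_agg["count"].tolist())  # True
```

Named aggregation may be the more convenient choice if you later want several aggregates at once, since each keyword becomes one output column.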