I was wondering how I could group data by month so I can look at the data on a per month basis. How would i do that?
for example, assign january for all data recorded in january in their own dataframe for analysis, etc.
here is my current dataframe:
WC_Humidity[%] WC_Htgsetp[C] WC_Clgsetp[C] Date Time
0 55.553640 18 26 2005-01-01 00:10
1 54.204342 18 26 2005-01-01 00:20
2 51.896272 18 26 2005-01-01 00:30
3 49.007770 18 26 2005-01-01 00:40
4 45.825810 18 26 2005-01-01 00:50
help is much appreciated.
CodePudding user response:
Try this:
df1['Date'].to_numpy().astype('datetime64[M]')
<iframe name="sif1" sandbox="allow-forms allow-modals allow-scripts" frameborder="0"></iframe>
CodePudding user response:
You can convert your column with following code.
df1['Date'] = pd.to_datetime(df["Date"].dt.strftime('%d-%m-%Y'))
And you can refer official document for why dayfirst is not working. https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
CodePudding user response:
If the column is in this format 2021-01-29, 30-12-2024 then before the above line should take care of it and parse it accordingly.
pd.to_datetime(df1['Date'])
Now you can use this code to convert the date column to the way you wanted.
df1['Date'] = df['Date1'].dt.strftime('%d/%m/%Y')
This should get you what you want.
CodePudding user response:
If you have strings like 2005-01-01
then you can get
df['year-month'] = df['Date'].str[:7]
and later you can use
df.groupby('year-month')
Minimal working code.
I changed dates to have different months in data.
I use io
only to simulate file in memory.
text = '''WC_Humidity[%],WC_Htgsetp[C],WC_Clgsetp[C],Date,Time
55.553640,18,26,2005-01-01,00:10
54.204342,18,26,2005-01-01,00:20
51.896272,18,26,2005-02-01,00:30
49.007770,18,26,2005-02-01,00:40
45.825810,18,26,2005-03-01,00:50
'''
import pandas as pd
import io
df = pd.read_csv(io.StringIO(text))
df['year-month'] = df['Date'].str[:7]
print(df)
for value, group in df.groupby('year-month'):
print()
print('---', value, '---')
print(group)
print()
print('average WC_Humidity[%]:', group['WC_Humidity[%]'].mean())
Result:
WC_Humidity[%] WC_Htgsetp[C] WC_Clgsetp[C] Date Time year-month
0 55.553640 18 26 2005-01-01 00:10 2005-01
1 54.204342 18 26 2005-01-01 00:20 2005-01
2 51.896272 18 26 2005-02-01 00:30 2005-02
3 49.007770 18 26 2005-02-01 00:40 2005-02
4 45.825810 18 26 2005-03-01 00:50 2005-03
--- 2005-01 ---
WC_Humidity[%] WC_Htgsetp[C] WC_Clgsetp[C] Date Time year-month
0 55.553640 18 26 2005-01-01 00:10 2005-01
1 54.204342 18 26 2005-01-01 00:20 2005-01
average WC_Humidity[%]: 54.878991
--- 2005-02 ---
WC_Humidity[%] WC_Htgsetp[C] WC_Clgsetp[C] Date Time year-month
2 51.896272 18 26 2005-02-01 00:30 2005-02
3 49.007770 18 26 2005-02-01 00:40 2005-02
average WC_Humidity[%]: 50.452021
--- 2005-03 ---
WC_Humidity[%] WC_Htgsetp[C] WC_Clgsetp[C] Date Time year-month
4 45.82581 18 26 2005-03-01 00:50 2005-03
average WC_Humidity[%]: 45.82581
If you have objects datetime
then you can do
df['year-month'] = df['Date'].dt.strftime('%Y-%m')
and rest is the same
text = '''WC_Humidity[%],WC_Htgsetp[C],WC_Clgsetp[C],Date,Time
55.553640,18,26,2005-01-01,00:10
54.204342,18,26,2005-01-01,00:20
51.896272,18,26,2005-02-01,00:30
49.007770,18,26,2005-02-01,00:40
45.825810,18,26,2005-03-01,00:50
'''
import pandas as pd
import io
df = pd.read_csv(io.StringIO(text))
# create datetime objects
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df['year-month'] = df['Date'].dt.strftime('%Y-%m')
print(df)
for value, group in df.groupby('year-month'):
print()
print('---', value, '---')
print(group)
print()
print('average WC_Humidity[%]:', group['WC_Humidity[%]'].mean())