I have a column that contains a date in every row. I want my code to read the first row of that column and use it to name my file. For example, suppose the first row of the column called "Date" contains:
2022-03-25 8:00:00
I want to name my file "Quarter 1 2022" using that date as a reference.
So what I have is:
0 Date
1 2022-03-25 08:00:00
2 2022-03-25 08:00:00
3 2022-02-27 08:00:00
4 2022-01-01 08:00:00
5 2022-01-01 08:00:00
6 2022-01-01 08:00:00
I have come across np.where, but I don't know how to turn its output into the filename. My idea is to use this code:
df.loc[:, "Date"].iloc[1] = np.where.df.loc[:, "Date"].iloc[1] \
<"2022-04-01", FilepathName,
FilepathName = "Quarter 1 2022"
Filepath = r'C:/Users/downloads/ ' FilepathName.csv
The first line isolates the first row of the Date column; if the date is before 2022-04-01, it creates a string variable called FilepathName. I then insert this string into a filepath variable called Filepath and finally save the dataframe as a CSV at that path with the new name.
I am struggling to finish the code:
df.loc[:, "Date"].iloc[1] = np.where.df.loc[:, "Date"].iloc[1] \
<"2022-04-01", FilepathName,
so that I am able to save FilepathName as a string variable.
How can I complete this code?
Answer:
You can use pandas's built-in dt.quarter accessor:
import pandas as pd
# Recreate the OP's list of dates as a dataframe
Date = [ "2022-03-25 08:00:00", "2022-03-25 08:00:00", "2022-02-27 08:00:00", "2022-01-01 08:00:00", "2022-01-01 08:00:00", "2022-01-01 08:00:00"]
df = pd.DataFrame({"Date":Date})
# Convert the strings to datetime objects; the format must match the data
df.Date = pd.to_datetime(df.Date, format='%Y-%m-%d %H:%M:%S')
# Sort by most recent date
df = df.sort_values(by="Date", ascending=False).reset_index(drop=True)
# Get the quarter and year and combine it
"Quarter {} {}".format(df.Date.dt.quarter[0], df.Date[0].year)
Output:
'Quarter 1 2022'
And then you can save it all like so:
# Get the quarter and year and combine it
FilepathName = "Quarter {} {}".format(df.Date.dt.quarter[0], df.Date[0].year)
Filepath = r'C:/Users/downloads/' + FilepathName + ".csv"
df.to_csv(Filepath, index=False)
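The same steps can be wrapped in a small helper, as a minimal sketch. The function name quarter_filename and the temporary output folder are assumptions made for illustration and are not part of the answer above. The helper reads the first row by position with .iloc[0], so it does not depend on the index labels or on sorting.

```python
import tempfile
from pathlib import Path

import pandas as pd


def quarter_filename(dates: pd.Series) -> str:
    """Build a name like 'Quarter 1 2022' from the first value of a date column.

    Hypothetical helper for illustration; not part of pandas.
    """
    first = pd.to_datetime(dates.iloc[0])  # first row by position
    return f"Quarter {first.quarter} {first.year}"


df = pd.DataFrame(
    {"Date": ["2022-03-25 08:00:00", "2022-02-27 08:00:00", "2022-01-01 08:00:00"]}
)

name = quarter_filename(df["Date"])

# A temporary folder stands in for the real output directory (assumption).
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / f"{name}.csv"
df.to_csv(out_path, index=False)
```

To write to a real location, replace out_dir with your own folder, for example the downloads path from the question.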
Answer:
You can also use pd.Timestamp(), which accepts the date string directly. Try this:
FilepathName = 'Quarter {} {}'.format(pd.Timestamp(df['Date'][0]).quarter, pd.Timestamp(df['Date'][0]).year)
Output:
'Quarter 1 2022'
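As a rough sketch of why no hard-coded cutoff such as "2022-04-01" is needed: pd.Timestamp parses the string and exposes the quarter and year as attributes, so the same expression covers every quarter. The sample strings and variable names below are assumptions chosen for illustration.

```python
import pandas as pd

# Raw strings, as they might appear in the first row of the "Date" column.
samples = ["2022-03-25 08:00:00", "2022-04-01 00:00:00", "2022-12-31 23:59:59"]

names = []
for raw in samples:
    ts = pd.Timestamp(raw)  # parses the string; no prior to_datetime call needed
    names.append(f"Quarter {ts.quarter} {ts.year}")

print(names)
```

A date on or after 1 April falls into quarter 2 automatically, so the comparison from the question does not have to be written by hand.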