I need to create a function called convert_to_qtr() that converts the monthly values in the Month column of a data frame into quarters. The month data frame is given below:
In the convert_to_qtr() function, we should use the following if conditions:
• If the month input is Jan-Mar, then the function returns “Q1”
• If the month input is Apr-Jun, then the function returns “Q2”
• If the month input is Jul-Sep, then the function returns “Q3”
• If the month input is Oct-Dec, then the function returns “Q4”
This function should then be applied to the month data frame provided above, creating a new column called Quarter that contains the quarter each month observation (January, February, etc.) belongs to.
quarter = 0
excl_merged['quarter'] = excl_merged[quarter]
excl_merged
def convert_to_quarterly(excl_merged):
    if excl_merged['Month'] == 'January' & excl_merged['Month'] == 'February' & excl_merged['Month'] == 'March':
        print(excl_merged[quarter] == 'Q1')
    elif excl_merged['Month'] == 'April' & excl_merged['Month'] == 'May' & excl_merged['Month'] == 'June':
        print(excl_merged[quarter] == 'Q2')
    elif excl_merged['Month'] == 'July' & excl_merged['Month'] == 'August' & excl_merged['Month'] == 'September':
        print(excl_merged[quarter] == 'Q3')
    else:
        print(excl_merged[quarter] == 'Q4')

convert_to_quarterly(excl_merged)
I was not able to run the function properly and kept getting errors.
CodePudding user response:
Try the following:
def convert_to_quarterly(month):
    # Takes a single month name and returns its quarter label.
    if month in ['January', 'February', 'March']:
        return 'Q1'
    elif month in ['April', 'May', 'June']:
        return 'Q2'
    elif month in ['July', 'August', 'September']:
        return 'Q3'
    elif month in ['October', 'November', 'December']:
        return 'Q4'
    else:
        print("Unknown month name!")

# Apply the function to every value in the Month column.
excl_merged['quarter'] = excl_merged['Month'].apply(convert_to_quarterly)
The main problem is that you are combining the conditions with "and" (&). A month can't be "January" and "February" at the same time, so the condition can never be true; you need "or" (|) instead.
I would also recommend putting parentheses around each individual comparison when using the & or | operators, because they bind more tightly than ==.
Lastly, I would recommend using the in operator to test against all three values at once. It should be faster, and the code is much easier to read.
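To make the points about parentheses and testing several values at once concrete, here is a minimal sketch for whole columns. The small excl_merged frame is a hypothetical stand-in for the asker's data, and Series.isin() is used as the column-wise counterpart of the in operator; treat it as one possible approach rather than the only one.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; only the 'Month' column matters here.
excl_merged = pd.DataFrame({"Month": ["January", "May", "September", "December"]})

# Element-wise OR: each comparison is parenthesised because | binds more tightly than ==.
is_q1 = (
    (excl_merged["Month"] == "January")
    | (excl_merged["Month"] == "February")
    | (excl_merged["Month"] == "March")
)

# isin() performs the same test against all three values at once.
is_q1_isin = excl_merged["Month"].isin(["January", "February", "March"])

# Fill the whole column by looping over the four quarters.
quarters = {
    "Q1": ["January", "February", "March"],
    "Q2": ["April", "May", "June"],
    "Q3": ["July", "August", "September"],
    "Q4": ["October", "November", "December"],
}
excl_merged["quarter"] = None
for label, months in quarters.items():
    excl_merged.loc[excl_merged["Month"].isin(months), "quarter"] = label

print(excl_merged)
```

Month names that appear in none of the lists keep None in the quarter column, which makes misspelled names easy to spot.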
CodePudding user response:
Wouldn't it be easier to do something like:
df.Transaction_Timestamp.apply(lambda x: "Q" + str(x.quarter))
Example
import pandas as pd
import numpy as np
rng = np.random.default_rng()
df = pd.DataFrame({
    "Transaction_Timestamp": pd.date_range("2022-01-01", periods=365),
    "Value": rng.integers(0, 100, size=365)
})
# Timestamp.quarter is an integer from 1 to 4; prefix it with "Q".
df["Qrt"] = df.Transaction_Timestamp.apply(lambda x: "Q" + str(x.quarter))
df.head()
  Transaction_Timestamp  Value Qrt
0            2022-01-01     84  Q1
1            2022-01-02     43  Q1
2            2022-01-03     91  Q1
3            2022-01-04     29  Q1
4            2022-01-05     88  Q1
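The example above starts from real timestamps, whereas the question has a column of month names. A possible bridge, sketched here under the assumption that the names are full English month names ("January", "February", ...) and that the default English locale is in use, is to parse them with the "%B" format first. The months frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame with full English month names, as described in the question.
months = pd.DataFrame({"Month": ["January", "April", "August", "November"]})

# "%B" parses a full month name; year and day fall back to defaults we do not use.
month_as_date = pd.to_datetime(months["Month"], format="%B")

# .dt.quarter gives the quarter number for the whole column without a Python-level loop.
months["Quarter"] = "Q" + month_as_date.dt.quarter.astype(str)

print(months)
```

Abbreviated names such as "Feb" would need the "%b" format instead, and mixed spellings would have to be normalised before parsing.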