I have a table like in the picture below. Based on the values of columns date and date2 I want to add new column date max. The new value should be equal to whichever is bigger (value can be both empty or string and date). Also, I need to add an agent column which, should be equal to the value whose length is greater than 2.
purchase={'date':['11/03/2019','12/05/2021','14/03/2021','11/03/2021'],
'date2':['11/04/2019','12/03/2021','14/06/2021','X'],
'price':[300, 400,200, 200],
'currency':['eur', 'usd','usd','usd'],
'qty':[200, 300, 400, 500],
'agent':['AC', 'BC', "CC", 'DC'],
'agent2':['AC', 'BC', "DDCC", 'DCC']}
df=pd.DataFrame(purchase)
CodePudding user response:
If you just need bigger values in both date, do this:
df["max_date"] = df[["date", "date2"]].apply(max, axis=1)
Assuming your columns contain proper dates and not just strings. for converting to dates first, see Omar's answer.
Bigger length agents names can be gotten same way:
df[["agent", "agent2"]].apply(lambda x: x[0] if len(x[0])>len(x[1]) else x[1], axis=1
CodePudding user response:
First you have to convert the two dates to a datetime type, then compare them, there are many ways to do that, the easiest is using numpy function .where():
df["date"] = pd.to_datetime(df["date"], infer_datetime_format=True)
df["date2"] = pd.to_datetime(df["date2"], infer_datetime_format=True, errors='coerce')
df["max"] = np.where(df["date"]>df["date2"], df["date"], df["date2"])
df["agent_x"] = np.where(df["agent"].str.len()> df["agent2"].str.len(), df["agent"], df["agent2"])
Result:
date date2 price currency qty agent agent2 max agent_x
0 2019-11-03 2019-11-04 300 eur 200 AC AC 2019-11-04 AC
1 2021-12-05 2021-12-03 400 usd 300 BC BC 2021-12-05 BC
2 2021-03-14 NaT 200 usd 400 CC DDCC NaT DDCC
3 2021-11-03 NaT 200 usd 500 DC DCC NaT DCC
In case you wanted to fill the missing dates with the available date, add this:
for i, r in df.iterrows():
if pd.isnull(r["max"]):
if pd.isnull(r["date"]):
df.at[i, "max"] = r["date2"]
else:
df.at[i, "max"] = r["date"]
Result:
date date2 price currency qty agent agent2 max agent_x
0 2019-11-03 2019-11-04 300 eur 200 AC AC 2019-11-04 AC
1 2021-12-05 2021-12-03 400 usd 300 BC BC 2021-12-05 BC
2 2021-03-14 NaT 200 usd 400 CC DDCC 2021-03-14 DDCC
3 2021-11-03 NaT 200 usd 500 DC DCC 2021-11-03 DCC