I am working on a DataFrame that looks like this. I'd like to choose the faster date of the two and put it into another column.
col1 | col2 | col3 |
---|---|---|
2015-01-03 | 2015-01-04 | |
2022-02-22 | 2017-01-02 |
and my desired out put is
col1 | col2 | col3 |
---|---|---|
2015-01-03 | 2015-01-04 | 2015-01-03 |
2022-02-22 | 2017-01-02 | 2017-01-02 |
So far I've tried
for i in range(len(df)):
if df.loc[i, "col1"] < df.loc[i, "col2"]:
df.loc[i, "col3"] = df.loc[i, "col1"]
else:
df.loc[i, "col3"] = df.loc[i, "col2"]
CodePudding user response:
You could use min
on axis:
df['col3'] = df[['col1','col2']].min(axis=1)
Output:
col1 col2 col3
0 2015-01-03 2015-01-04 2015-01-03
1 2022-02-22 2017-01-02 2017-01-02
CodePudding user response:
You can use apply
functions giving axis
attribute as 1
in order to run the function through the rows:
df["col1"] = pd.to_datetime(df["col1"])
df["col2"] = pd.to_datetime(df["col2"])
df["col3"] = df.apply(min, axis=1)
df
Note that, the first two lines are to make sure that the values are datetime objects.
Output
col1 col2 col3
0 2015-01-03 2015-01-04 2015-01-03
1 2022-02-22 2017-01-02 2017-01-02
CodePudding user response:
You can first convert your columns to datetime format and then use the where method:
import pandas as pd
data = pd.DataFrame.from_records([["2015-01-03", "2015-01-04"],
["2022-02-22", "2017-01-02"]],
columns=["col1", "col2"])
for col in data:
data[col] = pd.to_datetime(data[col])
data['col3'] = data['col1'].where(data['col1']<data['col2'], data['col2'])
Result looks like:
col1 col2 col3
0 2015-01-03 2015-01-04 2015-01-03
1 2022-02-22 2017-01-02 2017-01-02