Home > other >  How to compare two columns and input the smaller one in a new column in pandas?
How to compare two columns and input the smaller one in a new column in pandas?

Time:03-21

I am working on a DataFrame that looks like this. I'd like to choose the faster date of the two and put it into another column.

col1 col2 col3
2015-01-03 2015-01-04
2022-02-22 2017-01-02

and my desired out put is

col1 col2 col3
2015-01-03 2015-01-04 2015-01-03
2022-02-22 2017-01-02 2017-01-02

So far I've tried

for i in range(len(df)):
    if df.loc[i, "col1"] < df.loc[i, "col2"]:
        df.loc[i, "col3"] = df.loc[i, "col1"]
    else:
        df.loc[i, "col3"] = df.loc[i, "col2"]

CodePudding user response:

You could use min on axis:

df['col3'] = df[['col1','col2']].min(axis=1)

Output:

         col1        col2        col3
0  2015-01-03  2015-01-04  2015-01-03
1  2022-02-22  2017-01-02  2017-01-02

CodePudding user response:

You can use apply functions giving axis attribute as 1 in order to run the function through the rows:

df["col1"] = pd.to_datetime(df["col1"])
df["col2"] = pd.to_datetime(df["col2"])

df["col3"] = df.apply(min, axis=1)
df

Note that, the first two lines are to make sure that the values are datetime objects.

Output

    col1        col2        col3
0   2015-01-03  2015-01-04  2015-01-03
1   2022-02-22  2017-01-02  2017-01-02

CodePudding user response:

You can first convert your columns to datetime format and then use the where method:

import pandas as pd
data = pd.DataFrame.from_records([["2015-01-03",  "2015-01-04"], 
                                  ["2022-02-22",  "2017-01-02"]], 
                                 columns=["col1", "col2"])
for col in data:
    data[col] = pd.to_datetime(data[col])
data['col3'] = data['col1'].where(data['col1']<data['col2'], data['col2'])

Result looks like:

        col1       col2       col3
0 2015-01-03 2015-01-04 2015-01-03
1 2022-02-22 2017-01-02 2017-01-02
  • Related