Home > Software engineering >  How to compare date values from rows in python polars?
How to compare date values from rows in python polars?

Time:09-01

I have a dataframe with date of births as

pl.DataFrame({'idx':[1,2,3,4,5,6],
              'date_of_birth':['03/06/1990','3/06/1990','11/12/2000','01/02/2021','1/02/2021','3/06/1990']})

enter image description here

Here I would like to compare date of birth(Format: Month/Day/Year) of each row and tag yes if the months are equal such as 03 - 3, 01 -1.

There are dates as 03/06/1900, 3/06/1990, they are generally same. but here they are treated as different. How to figure out these kind scenarios ?

The expected output as:

enter image description here

CodePudding user response:

Let's convert date_of_birth to datetime then check if the month is satisfied

import numpy as np
import pandas as pd
import polars as pl

s = np.where(pd.to_datetime(df['date_of_birth']).month.isin([3, 1]), 'Yes', 'No')

out = df.with_columns([
    pl.lit(s).alias('flag')
])
print(out)

┌─────┬───────────────┬──────┐
│ idx ┆ date_of_birth ┆ flag │
│ --- ┆ ---           ┆ ---  │
│ i64 ┆ str           ┆ str  │
╞═════╪═══════════════╪══════╡
│ 1   ┆ 03/06/1990    ┆ Yes  │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2   ┆ 3/06/1990     ┆ Yes  │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3   ┆ 11/12/2000    ┆ No   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 4   ┆ 01/02/2021    ┆ Yes  │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 5   ┆ 1/02/2021     ┆ Yes  │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 6   ┆ 3/06/1990     ┆ Yes  │
└─────┴───────────────┴──────┘

CodePudding user response:

You can try out with transform function to filter in specific column of df to get suitable output

df['match']=df['date_of_birth'].transform(lambda x : 'yes' if x.split("/")[0] in ['03','3','1','01'] else 'no')

Output:

idx date_of_birth   match
0   1   03/06/1990  yes
1   2   3/06/1990   yes
2   3   11/12/2000  no
3   4   01/02/2021  yes
4   5   1/02/2021   yes
5   6   3/06/1990   yes
  • Related