How would I add 1 year to a column?
I've tried using map
and apply but I failed miserably.
I also wonder why pl.date()
accepts integers while it advertises that it only accepts str
or pli.Expr
.
A small hack workaround is:
col = pl.col('date').dt
df = df.with_column(pl.when(pl.col(column).is_not_null())
.then(pl.date(col.year() 1, col.month(), col.day()))
.otherwise(pl.date(col.year() 1,col.month(), col.day()))
.alias("date"))
but this won't work for months or days. I can't just add a number or I'll get a:
> thread 'thread '<unnamed>' panicked at 'invalid or out-of-range date<unnamed>',
' panicked at '/github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rsinvalid or out-of-range date:', 173:/github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.19/src/naive/date.rs51
:note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Most likely because day and month cycle while year goes to infinity.
I could also do this:
df = df.with_column(
pl.when(col.month() == 1)
.then(pl.date(col.year(), 2, col.day()))
.when(col.month() == 2)
.then(pl.date(col.year(), 3, col.day()))
.when(col.month() == 3)
.then(pl.date(col.year(), 4, col.day()))
.when(col.month() == 4)
.then(pl.date(col.year(), 5, col.day()))
.when(col.month() == 5)
.then(pl.date(col.year(), 6, col.day()))
.when(col.month() == 6)
.then(pl.date(col.year(), 7, col.day()))
.when(col.month() == 7)
.then(pl.date(col.year(), 8, col.day()))
.when(col.month() == 8)
.then(pl.date(col.year(), 9, col.day()))
.when(col.month() == 9)
.then(pl.date(col.year(), 10, col.day()))
.when(col.month() == 10)
.then(pl.date(col.year(), 11, col.day()))
.when(col.month() == 11)
.then(pl.date(col.year(), 12, col.day()))
.otherwise(pl.date(col.year() 1, 1, 1))
.alias("valid_from")
)
CodePudding user response:
Polars allows to do addition and subtraction with python's timedelta
objects.
We do still have to correct for leap years. We can use when, then, otherwise
for that.
import polars as pl
from datetime import timedelta
def is_leap_year(name: str) -> bool:
year = pl.col(name).dt.year()
return (year % 4 == 0) & (year % 100 != 0) | (year % 400 == 0)
(pl.DataFrame({
"dates": pl.date_range(datetime(2000, 1, 1), datetime(2026, 1, 1), "1y")
}).with_columns([
is_leap_year("dates").alias("leap_year"),
pl.when(is_leap_year("dates").alias("leap_year"))
.then(pl.col("dates") timedelta(days=366))
.otherwise(pl.col("dates") timedelta(days=365)).alias("dates_and_1_year")
]))
shape: (27, 3)
┌─────────────────────┬───────────┬─────────────────────┐
│ dates ┆ leap_year ┆ dates_and_1_year │
│ --- ┆ --- ┆ --- │
│ datetime[ns] ┆ bool ┆ datetime[ns] │
╞═════════════════════╪═══════════╪═════════════════════╡
│ 2000-01-01 00:00:00 ┆ true ┆ 2001-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2001-01-01 00:00:00 ┆ false ┆ 2002-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2002-01-01 00:00:00 ┆ false ┆ 2003-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2003-01-01 00:00:00 ┆ false ┆ 2004-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ... ┆ ... ┆ ... │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2023-01-01 00:00:00 ┆ false ┆ 2024-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2024-01-01 00:00:00 ┆ true ┆ 2025-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2025-01-01 00:00:00 ┆ false ┆ 2026-01-01 00:00:00 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2026-01-01 00:00:00 ┆ false ┆ 2027-01-01 00:00:00 │
└─────────────────────┴───────────┴─────────────────────┘
CodePudding user response:
You can use polars.apply
and dateutil.relativedelta
which works for years, months, days and much more, but can be slow for lots of data.
from datetime import date
from dateutil.relativedelta import relativedelta
df = pl.DataFrame(pl.date_range(date(2019, 1, 1), date(2020, 10, 1), '3mo', name='date'))
df.with_column(pl.col('date').apply(lambda x: x relativedelta(years=1)))