I have the following dataframe:
df = (
pl.DataFrame(
{
"int": [1, 2, 3],
"date": ["2010-01-31T23:00:00 00:00","2010-02-01T00:00:00 00:00","2010-02-01T01:00:00 00:00"]
}
)
.with_columns(
pl.col("date").str.strptime(pl.Datetime, fmt="%Y-%m-%dT%H:%M:%S%z")
.dt.with_time_zone("Europe/Amsterdam")
)
)
which gives:
┌─────┬────────────────────────────────┐
│ int ┆ date │
│ --- ┆ --- │
│ i64 ┆ datetime[μs, Europe/Amsterdam] │
╞═════╪════════════════════════════════╡
│ 1 ┆ 2010-02-01 00:00:00 CET │
│ 2 ┆ 2010-02-01 01:00:00 CET │
│ 3 ┆ 2010-02-01 02:00:00 CET │
└─────┴────────────────────────────────┘
I would like to convert this datetime type to a string with a time zone designator, e.g. 2010-02-01 00:00:00 01:00
I tried the following:
df.with_columns(pl.col("date").dt.strftime(fmt="%Y-%m-%d %H:%M:%S%z"))
which gives the following error:
pyo3_runtime.PanicException: a formatting trait implementation returned an error: Error
My desired output is stated below, which is what you get when you convert a datetime column to a string type in pandas with the "%Y-%m-%d %H:%M:%S%z"
as the format:
┌─────┬──────────────────────────┐
│ int ┆ date │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪══════════════════════════╡
│ 1 ┆ 2010-02-01 00:00:00 0100 │
│ 2 ┆ 2010-02-01 01:00:00 0100 │
│ 3 ┆ 2010-02-01 02:00:00 0100 │
└─────┴──────────────────────────┘
Is there any way to realize this result? Leaving out the %z
at the end when specifying the format works but the UTC time offset is something I need.
CodePudding user response:
Unfortunately, this looks like a bug.
As a temporary workaround, you can resort to using apply
(which is really slow):
(
df
.with_columns(
pl.col('date').apply(lambda d: d.strftime("%F %T%z")).alias('strftime')
)
)
shape: (3, 3)
┌─────┬────────────────────────────────┬──────────────────────────┐
│ int ┆ date ┆ strftime │
│ --- ┆ --- ┆ --- │
│ i64 ┆ datetime[μs, Europe/Amsterdam] ┆ str │
╞═════╪════════════════════════════════╪══════════════════════════╡
│ 1 ┆ 2010-02-01 00:00:00 CET ┆ 2010-02-01 00:00:00 0100 │
│ 2 ┆ 2010-02-01 01:00:00 CET ┆ 2010-02-01 01:00:00 0100 │
│ 3 ┆ 2010-02-01 02:00:00 CET ┆ 2010-02-01 02:00:00 0100 │
└─────┴────────────────────────────────┴──────────────────────────┘
CodePudding user response:
(On mobile so bear with me)
I don't think this is a bug. I think what's happening is that when you strptime
with the offset, that doesn't uniquely define a time zone even though it's showing a timezone (perhaps that's your local time zone). I think all you need to do assuming you have at least version .16 is add utc=True
to your strptime
.