Issue while converting string to DateTime in Polars

Time:11-11

I am trying to convert a string column into a datetime column.

Here is my input:

col
00000001011970
00000001011970
00000001011970
...
00000001011970

Here is my snippet:

df[col].with_column= df.with_column(pl.col(col).str.strptime(pl.Datetime, fmt='%Y-%m-%d %H:%M:%S',strict=False).alias('parsed EventTime') )
            

The snippet above does not convert the column to DateTime. The output is the same as the input; nothing changes.

Please help me convert the string column to DateTime in Polars.

CodePudding user response:

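Your format string should match the layout of the input. Here the values are seconds, minutes, hours, day, month and year, with no separators.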

df.with_column(pl.col('col').str.strptime(pl.Datetime, fmt='%S%M%H%d%m%Y',strict=False).alias('parsed EventTime'))

gives me

1970-01-01 00:00:00
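
To see why the format has to mirror the input, here is a minimal sketch using Python's standard-library datetime.strptime as a stand-in. It is not the Polars parser, but the directives used here (%S, %M, %H, %d, %m, %Y) mean the same thing in both. The variable names are only for illustration.

```python
from datetime import datetime

raw = "00000001011970"  # sample value from the question

# The input is laid out as seconds, minutes, hours, day, month, year, no separators.
matching_fmt = "%S%M%H%d%m%Y"
parsed = datetime.strptime(raw, matching_fmt)
print(parsed)  # 1970-01-01 00:00:00

# The format from the question expects a leading year and separators,
# so it cannot match this input.
try:
    datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print(mismatch_ok)  # False
```

The standard library raises on the mismatch. In Polars, strict=False turns a value that cannot be parsed into a null instead of raising, so a wrong format fails quietly.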

CodePudding user response:

To add to 4pi's answer in the context of your comment...

Polars is not pandas, so you don't assign columns of an existing DataFrame with df['col'] = ..., and you especially don't do it with df['col'].with_column.

with_column means: keep everything that is already there and add this expression. If the expression has the same name as an existing column, it replaces that column.

That is in contrast to select, which returns only what you ask for.
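
The difference can be sketched with a toy model in plain Python, where a "frame" is just a dict of column name to values. The with_column and select functions below are hypothetical helpers written for this illustration; they are not the Polars API and only mimic the behaviour described above.

```python
# Toy model only: a "frame" is a plain dict of column name -> list of values.
# with_column and select are hypothetical helpers, not the Polars API.

def with_column(frame, name, values):
    """Keep every existing column; add `name`, or overwrite it if present."""
    out = dict(frame)  # shallow copy: the column lists are shared, not duplicated
    out[name] = values
    return out

def select(frame, name, values):
    """Return only the requested column."""
    return {name: values}

frame = {"col": ["00000001011970"], "other": [1]}
parsed = ["1970-01-01 00:00:00"]

added = with_column(frame, "parsed EventTime", parsed)  # new name: column is added
replaced = with_column(frame, "col", parsed)            # existing name: column is replaced
only = select(frame, "col", parsed)                     # everything else is dropped

print(list(added))     # ['col', 'other', 'parsed EventTime']
print(list(replaced))  # ['col', 'other']
print(list(only))      # ['col']
```

Neither helper changes frame itself. Both return a new object, which mirrors why the result of with_column has to be assigned to a name before it can be used.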

You can reassign the whole df like this:

df = df.with_column(pl.col('col').str.strptime(pl.Datetime, fmt='%S%M%H%d%m%Y',strict=False).alias('parsed EventTime'))

Polars, AFAIK, is smart enough not to copy the data when you do this, so although the syntax looks inefficient, it is efficient.

If you are replacing a column, you could use the replace method, i.e.

df.replace('col',df.get_column('col').str.strptime(pl.Datetime, fmt='%S%M%H%d%m%Y',strict=False))

It's not clear from your question if you're trying to add a column called parsed EventTime or if you're trying to replace the existing col.

You can also replace columns with the first syntax...

df=df.select(pl.col('col').str.strptime(pl.Datetime, fmt='%S%M%H%d%m%Y',strict=False))

Since there's only one column in this df, you would get the same result from

df=df.with_column(pl.col('col').str.strptime(pl.Datetime, fmt='%S%M%H%d%m%Y',strict=False))