When applying pandas.to_numeric, the returned dtype is float64 or int64 depending on the data supplied: https://pandas.pydata.org/docs/reference/api/pandas.to_numeric.html
Is there an equivalent way to do this in Polars?
I have seen "How to cast a column with data type List[null] to List[i64] in polars", but I don't want to cast each column individually. I have a couple of string columns that I want to turn numeric; they could hold int or float values.
#code to show casting in pandas.to_numeric
import pandas as pd
df = pd.DataFrame({"col1":["1","2"], "col2":["3.5", "4.6"]})
print("DataFrame:")
print(df)
df[["col1","col2"]]=df[["col1","col2"]].apply(pd.to_numeric)
print(df.dtypes)
CodePudding user response:
Unlike Pandas, Polars is quite picky about datatypes and tends to be rather unaccommodating when it comes to automatic casting. (Among the reasons is performance.)
You can create a feature request for a to_numeric method (but I'm not sure how enthusiastic the response will be).
That said, here are some easy ways to accomplish this.
Create a method
Perhaps the simplest way is to write a function that attempts the cast to integer and, if that raises an exception, falls back to a cast to float. For convenience, you can even attach this function to the Series class itself.
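import polars as pl

def to_numeric(s: pl.Series) -> pl.Series:
    # Try the stricter integer cast first; fall back to float if parsing fails.
    try:
        result = s.cast(pl.Int64)
    # Which exception a failed cast raises depends on the Polars version.
    except (pl.exceptions.ComputeError, pl.exceptions.InvalidOperationError):
        result = s.cast(pl.Float64)
    return result

# Attach the function to Series so it can be called as s.to_numeric().
pl.Series.to_numeric = to_numeric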
Then to use it (here df is a Polars DataFrame holding the string columns):
(
    pl.select(
        s.to_numeric()
        for s in df
    )
)
shape: (2, 2)
┌──────┬──────┐
│ col1 ┆ col2 │
│ ---  ┆ ---  │
│ i64  ┆ f64  │
╞══════╪══════╡
│ 1    ┆ 3.5  │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2    ┆ 4.6  │
└──────┴──────┘
Use the automatic casting of csv parsing
Another method is to write your columns to a CSV file (in a string buffer), and then have read_csv try to infer the types automatically. You may have to tweak the infer_schema_length parameter in some situations.
from io import StringIO
pl.read_csv(StringIO(df.write_csv()))
shape: (2, 2)
┌──────┬──────┐
│ col1 ┆ col2 │
│ ---  ┆ ---  │
│ i64  ┆ f64  │
╞══════╪══════╡
│ 1    ┆ 3.5  │
├╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2    ┆ 4.6  │
└──────┴──────┘