I need help "translating" a Python example to Rust. The Python example was given here
Here is the code snippet I am trying to get working:
use polars::prelude::*;
fn main() {
    let s1 = Series::new("Fruit", &["Apple", "Apple", "Pear"]);
    let s2 = Series::new("Color", &["Red", "Yellow", "Green"]);
    let df = DataFrame::new(vec![s1, s2]).unwrap();
    let df_lazy = df.lazy();
    /*
    This is the PYTHON version I would like to recreate in RUST:
    df_lazy.with_columns([
        # string fmt over multiple expressions
        pl.format("{} has {} color", "Fruit", "Color").alias("fruit_list"),
        # columnar lambda over multiple expressions
        pl.map(["Fruit", "Color"], lambda s: s[0] + " has " + s[1] + " color").alias("fruit_list2"),
    ])
    */
}
I can't even get a simple select to work, and now I am lost.
CodePudding user response:
The LazyFrame has a slightly different interface for .select than the regular DataFrame. It expects an iterable of column expressions, each built with the col() function. You can change your select call to the following:
let selected = df_lazy.select(&[col("Fruit"), col("Color")]);
println!("{:?}", selected.collect());
To get the results:
Ok(shape: (3, 2)
┌───────┬────────┐
│ Fruit ┆ Color  │
│ ---   ┆ ---    │
│ str   ┆ str    │
╞═══════╪════════╡
│ Apple ┆ Red    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ Apple ┆ Yellow │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ Pear  ┆ Green  │
└───────┴────────┘)
You can see more examples of working with the LazyFrame here: https://docs.rs/polars-lazy/latest/polars_lazy/
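As a plain-Rust point of reference (no Polars involved), the row-wise result that the Python pl.format expression in the question describes could be sketched roughly as below. The function name fruit_list and the use of plain slices instead of Polars columns are assumptions made for illustration only; this is not the Polars API, just a way to check what the new column should contain.

```rust
/// Row-wise equivalent of the Python `pl.format("{} has {} color", "Fruit", "Color")`.
/// Hypothetical helper: works on plain slices, not on Polars Series.
fn fruit_list(fruits: &[&str], colors: &[&str]) -> Vec<String> {
    fruits
        .iter()
        .zip(colors.iter())
        // Pair each fruit with the color on the same row and format the pair.
        .map(|(fruit, color)| format!("{} has {} color", fruit, color))
        .collect()
}
```

Calling fruit_list with the two columns from the question gives one formatted string per row, which can be compared against the output of whichever Polars expression you end up using.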
CodePudding user response:
I got a little further ... but ran into another snag:
The Black Box Function example should do the trick, but I can't get it to work:
Err(SchemaMisMatch("Series of dtype: List(Float64) != Struct"))
at this line: let ca = s.struct_()?;
Here is the sample code:
use polars::prelude::*;
fn my_black_box_function(a: f32, b: f32) -> f32 {
    // do something
    a
}
fn apply_multiples(lf: &LazyFrame) -> Result<DataFrame> {
    df![
        "col_a" => [1.0, 2.0, 3.0],
        "col_b" => [3.0, 5.1, 0.3]
    ]?
    .lazy()
    .select([concat_lst(["col_a", "col_b"]).map(
        |s| {
            let ca = s.struct_()?;
            let b = ca.field_by_name("col_a")?;
            let a = ca.field_by_name("col_b")?;
            let a = a.f32()?;
            let b = b.f32()?;
            let out: Float32Chunked = a
                .into_iter()
                .zip(b.into_iter())
                .map(|(opt_a, opt_b)| match (opt_a, opt_b) {
                    (Some(a), Some(b)) => Some(my_black_box_function(a, b)),
                    _ => None,
                })
                .collect();
            Ok(out.into_series())
        },
        GetOutput::from_type(DataType::Float32),
    )])
    .collect()
}
The two series are concatenated as 's':
shape: (3,)
Series: 'col_a' [list]
[
[1.0, 3.0]
[2.0, 5.1]
[3.0, 0.3]
]
but I cannot get a struct out of it with struct_()?!
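The error message and the printed series agree: concat_lst produced a List(Float64) column, so s is a list series and struct_() has nothing to downcast to. The closure presumably needs to receive a struct column instead (the lazy DSL has as_struct for this, though its exact signature depends on the Polars version and feature flags). Note also that the sample data is Float64 while the closure asks for f32. Independently of Polars, the pairwise logic inside the closure can be sketched in plain Rust as below. The name apply_pairwise and the use of slices of Option<f64> are assumptions for illustration, and the black box function is a stand-in.

```rust
// Stand-in for the black box function from the sample code; it ignores `b`.
fn my_black_box_function(a: f64, _b: f64) -> f64 {
    a
}

/// Hypothetical helper: applies the black box function row by row,
/// yielding None whenever either input on that row is missing.
fn apply_pairwise(col_a: &[Option<f64>], col_b: &[Option<f64>]) -> Vec<Option<f64>> {
    col_a
        .iter()
        .zip(col_b.iter())
        .map(|(opt_a, opt_b)| match (opt_a, opt_b) {
            (Some(a), Some(b)) => Some(my_black_box_function(*a, *b)),
            _ => None,
        })
        .collect()
}
```

This mirrors the zip-and-match part of the closure, so once the closure is handed two properly typed columns the same null handling should carry over.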