Combine different values of multiple columns into one column

Time:09-06

I need help "translating" a Python example to Rust. The Python example was given here

Here is the code snippet I am trying to get working:

use polars::prelude::*;

fn main() {
    let s1 = Series::new("Fruit", &["Apple", "Apple", "Pear"]);
    let s2 = Series::new("Color", &["Red", "Yellow", "Green"]);

    let df = DataFrame::new(vec![s1, s2]).unwrap();

    let df_lazy = df.lazy();

    /*

    This is the PYTHON version I'd like to recreate in RUST:

    df_lazy.with_columns([
                    # string fmt over multiple expressions
                    pl.format("{} has {} color", "Fruit", "Color").alias("fruit_list"),
                    # columnar lambda over multiple expressions
                    pl.map(["Fruit", "Color"], lambda s: s[0] + " has " + s[1] + " color").alias("fruit_list2"),
                    ])
     */

}

I can't even get a simple select to work?! Now I am lost.
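
For reference, the row-wise result that both Python expressions aim for can be sketched in plain Rust without any Polars types. This only illustrates the intended output and is not the Polars API: the function name describe_fruits and the string slices standing in for the two columns are assumptions.

```rust
/// Builds one "<fruit> has <color> color" string per row.
/// Plain-Rust stand-in for the two columns; not a Polars API.
fn describe_fruits(fruits: &[&str], colors: &[&str]) -> Vec<String> {
    fruits
        .iter()
        .zip(colors.iter())
        // zip stops at the shorter slice, so mismatched lengths are truncated
        .map(|(fruit, color)| format!("{} has {} color", fruit, color))
        .collect()
}
```

Called with the sample data from the snippet, describe_fruits(&["Apple", "Apple", "Pear"], &["Red", "Yellow", "Green"]) yields "Apple has Red color", "Apple has Yellow color" and "Pear has Green color".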

CodePudding user response:

LazyFrame's .select has a slightly different interface from the one on a regular DataFrame. It expects a collection of column expressions, each built with the col() function. You can change your select call to the following:

let selected = df_lazy.select(&[col("Fruit"), col("Color")]);

println!("{:?}", selected.collect());

To get the results:

Ok(shape: (3, 2)
┌───────┬────────┐
│ Fruit ┆ Color  │
│ ---   ┆ ---    │
│ str   ┆ str    │
╞═══════╪════════╡
│ Apple ┆ Red    │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ Apple ┆ Yellow │
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ Pear  ┆ Green  │
└───────┴────────┘)

You can see more examples of working with the LazyFrame here: https://docs.rs/polars-lazy/latest/polars_lazy/

CodePudding user response:

I got a little further ... but ran into another snag:

The Black Box Function example should do the trick, but I can't get it to work:

Err(SchemaMisMatch("Series of dtype: List(Float64) != Struct"))

at this line: let ca = s.struct_()?;

Here is the sample code:

use polars::prelude::*;

fn my_black_box_function(a: f32, b: f32) -> f32 {
    // do something
    a
}

fn apply_multiples(lf: &LazyFrame) -> Result<DataFrame> {
    df![
        "col_a" => [1.0, 2.0, 3.0],
        "col_b" => [3.0, 5.1, 0.3]
    ]?
    .lazy()
    .select([concat_lst(["col_a", "col_b"]).map(
        |s| {

            let ca = s.struct_()?;

            let b = ca.field_by_name("col_a")?;
            let a = ca.field_by_name("col_b")?;
            let a = a.f32()?;
            let b = b.f32()?;

            let out: Float32Chunked = a
                .into_iter()
                .zip(b.into_iter())
                .map(|(opt_a, opt_b)| match (opt_a, opt_b) {
                    (Some(a), Some(b)) => Some(my_black_box_function(a, b)),
                    _ => None,
                })
                .collect();

            Ok(out.into_series())
        },
        GetOutput::from_type(DataType::Float32),
    )])
    .collect()
}

The two series are concatenated as 's':

shape: (3,)
Series: 'col_a' [list]
[
        [1.0, 3.0]
        [2.0, 5.1]
        [3.0, 0.3]
]

but I cannot get a struct out of it with struct_()?!
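
The error message already hints at the cause: concat_lst yields a List column, and the downcast with struct_() expects a Struct. Independent of how the two columns are packed, the closure only needs a null-aware, element-wise combination of two float columns. Below is a minimal sketch of that step in plain Rust. The name combine_columns and the slices of Option<f64> standing in for the columns are assumptions for illustration, not Polars types. It uses f64 because the literals in df! are f64 values, which matches the Float64 in the error.

```rust
/// Placeholder for the real computation, mirroring the sample above.
fn my_black_box_function(a: f64, _b: f64) -> f64 {
    a
}

/// Applies the function row by row; a missing value on either side
/// produces a missing result, as in the closure from the sample.
fn combine_columns(col_a: &[Option<f64>], col_b: &[Option<f64>]) -> Vec<Option<f64>> {
    col_a
        .iter()
        .zip(col_b.iter())
        .map(|(opt_a, opt_b)| match (opt_a, opt_b) {
            (Some(a), Some(b)) => Some(my_black_box_function(*a, *b)),
            _ => None,
        })
        .collect()
}
```

With col_a = [Some(1.0), Some(2.0), None] and col_b = [Some(3.0), None, Some(0.3)], only the first row has both values present, so only that row produces a result.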
