Polars counting elements in list column

Time:12-08

I have a dataframe with a column b whose elements are lists, and I need to create a column c that counts the number of elements in the list for every row. Here is a toy example in Pandas:

import pandas as pd

df = pd.DataFrame({'a': [1,2,3], 'b':[[1,2,3], [2], [5,0]]})

    a   b
0   1   [1, 2, 3]
1   2   [2]
2   3   [5, 0]

df.assign(c=df['b'].str.len())

    a   b           c
0   1   [1, 2, 3]   3
1   2   [2]         1
2   3   [5, 0]      2

Here is my equivalent in Polars:

import polars as pl

dfp = pl.DataFrame({'a': [1,2,3], 'b':[[1,2,3], [2], [5,0]]})

dfp.with_columns(pl.col('b').apply(lambda x: len(x)).alias('c'))

I've a feeling that .apply(lambda x: len(x)) is not optimal.

Is there a better way to do it in Polars?

CodePudding user response:

You can use .arr to access the list functions -- in this case .lengths()

>>> dfp.with_columns(pl.col("b").arr.lengths().alias("c"))
shape: (3, 3)
┌─────┬───────────┬─────┐
│ a   ┆ b         ┆ c   │
│ --- ┆ ---       ┆ --- │
│ i64 ┆ list[i64] ┆ u32 │
╞═════╪═══════════╪═════╡
│ 1   ┆ [1, 2, 3] ┆ 3   │
├─────┼───────────┼─────┤
│ 2   ┆ [2]       ┆ 1   │
├─────┼───────────┼─────┤
│ 3   ┆ [5, 0]    ┆ 2   │
└─────┴───────────┴─────┘