Spark filter on dataframe with array containing a map


I have a dataframe with schema which has a nested array of map values:

root
 |-- array_of_properties: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- name: string (nullable = true)
 |    |    |-- props: map (nullable = true)
 |    |    |    |-- key: string
 |    |    |    |-- value: string (valueContainsNull = true)

I need to filter on the struct's name and on the values of some specific keys in the map inside the array. I can already filter on the name:

dataframe.filter(array_contains(col("array_of_properties.name"), "somename"))

How do I add AND conditions on the values of two keys in the nested props map (for example, a key named is_enabled whose value is true or false, and a key named source whose value is the string test)?

CodePudding user response:

Use the exists higher-order function:

dataframe.filter("exists(array_of_properties, x -> x.name = 'somename' and x.props['is_enabled'] is true)")