Filtering elements in nested list with json_query

Time:05-19

I'd like to filter out entries from a nested list depending on the presence of a key. Given the following dictionary:

hostvars:
  host1:
    backups:
      - src: somesource
        target: sometarget
      - src: anothersource
  host2:
    backups:
      - src: somesource for host2
        target: fancy target
  host3:
    backups:
      - src: yet another src
      - src: and another one

I'd like to filter out all elements of the backups lists when the target key is not present.

The closest I've come is:

- set_fact:
    data: "{{ hostvars | dict2items | json_query(query) }}"
  vars:
    query: "[?value.backups[?target]]"

which results in

hostvars:
  host1:
    backups:
      - src: somesource
        target: sometarget
      - src: anothersource
  host2:
    backups:
      - src: somesource for host2
        target: fancy target

So I've successfully filtered out host3 which does not have an element containing the target key in the backups list.
However, I'd also like to remove the second element from the backups list of host1 (which also does not contain the target key).

Any pointers are greatly appreciated.

CodePudding user response:

In pure JMESPath, you can rebuild the value property by merging the original value with a filtered backups list, then drop the hosts whose filtered list is empty. The query is:

[].{
  key: key, 
  value: merge(value, {backups: value.backups[?target]})
} | [?value.backups]

So, given the task:

- debug:
    var: hostvars | dict2items | json_query(query) | items2dict
  vars:
    query: >-
      [].{
        key: key,
        value: merge(value, {backups: value.backups[?target]})
      } | [?value.backups]

This yields:

hostvars | dict2items | json_query(query) | items2dict:
  host1:
    backups:
    - src: somesource
      target: sometarget
  host2:
    backups:
    - src: somesource for host2
      target: fancy target
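
The trailing `| [?value.backups]` is what removes host3: JMESPath treats an empty list as false, so any host whose filtered backups list ends up empty is dropped. As a minimal sketch, assuming you would rather keep such hosts with an empty list, the same query can be used without the final filter:

```yaml
- debug:
    var: hostvars | dict2items | json_query(query) | items2dict
  vars:
    # Same projection as above, but without the trailing "| [?value.backups]",
    # so hosts with no matching entries are kept with an empty backups list.
    query: >-
      [].{
        key: key,
        value: merge(value, {backups: value.backups[?target]})
      }
```

With the sample data from the question, host1 and host2 should come out as before, and host3 should remain in the result with `backups: []`.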

CodePudding user response:

For example

    - set_fact:
        data: "{{ hostvars|
                  dict2items|
                  json_query(_query)|
                  selectattr('value')|
                  items2dict }}"
      vars:
        _query: '[].{key: key, value: value.backups[?target]}'
      run_once: true

gives

  data:
    host1:
    - src: somesource
      target: sometarget
    host2:
    - src: somesource for host2
      target: fancy target
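
Note that this result maps each host directly to its filtered list, so the backups key from the original structure is gone. A minimal sketch that keeps the key, assuming the other attributes of each host entry are not needed (they are dropped here, unlike with merge), might look like this:

```yaml
- set_fact:
    data: "{{ hostvars |
              dict2items |
              json_query(_query) |
              selectattr('value.backups') |
              list |
              items2dict }}"
  vars:
    # Wrap the filtered list in a dict so the backups key is preserved.
    # selectattr('value.backups') then drops hosts whose list is empty.
    _query: '[].{key: key, value: {backups: value.backups[?target]}}'
  run_once: true
```

The explicit `list` is only there so the sketch does not rely on the Ansible version unrolling the generator returned by selectattr.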

The next option is to add a default attribute target: None to every entry where it is missing, e.g.

    - set_fact:
        backups: "{{ [{'target': None}]|product(backups)|map('combine') }}"
    - debug:
        var: backups

gives

TASK [debug] ****************************************************
ok: [host1] => 
  backups:
  - src: somesource
    target: sometarget
  - src: anothersource
    target: null
ok: [host2] => 
  backups:
  - src: somesource for host2
    target: fancy target
ok: [host3] => 
  backups:
  - src: yet another src
    target: null
  - src: and another one
    target: null

Then, select 'not null' targets

    - set_fact:
        data: "{{ hostvars|
                  json_query('*.backups')|
                  map('selectattr', 'target')|
                  flatten }}"
      run_once: true
    - debug:
        var: data
      run_once: true

gives

TASK [debug] ****************************************************
ok: [host1] => 
  data:
  - src: somesource
    target: sometarget
  - src: somesource for host2
    target: fancy target
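
If the only goal is to drop entries that lack the key, the defaults may not be needed at all. As a sketch, assuming it is applied to the original backups lists (that is, without the set_fact that adds target: null), the `defined` test can select on the presence of the key:

```yaml
- set_fact:
    data: "{{ hostvars |
              json_query('*.backups') |
              flatten |
              selectattr('target', 'defined') |
              list }}"
  run_once: true
```

Unlike the 'not null' selection above, this keeps an entry whose target is present but empty or null, so the two variants differ once such entries exist.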

To identify the host, add a host attribute to the defaults too, e.g.

    - set_fact:
        backups: "{{ [{'target': None, 'host': inventory_hostname}]|
                     product(backups)|map('combine') }}"

gives the result

TASK [debug] ****************************************************
ok: [host1] => 
  data:
  - host: host1
    src: somesource
    target: sometarget
  - host: host2
    src: somesource for host2
    target: fancy target