I am trying to exclude text-blocks when a certain condition occurs.
The files have this layout:
- name: Sedan
  tags:
    - DIGIT
    - ABC
    - DEF
    - YES
- name: Combi
  tags:
    - DIGIT
    - ABC
    - DEF
    - NO
- nane: SUV
  tags:
    - DIGIT
    - DEF
    - YES
- nane: OTHER
  tags:
    - DIGIT
    - ABC
    - YES
The condition is: ABC && !DEF
So, print only the text-blocks that contain ABC but not DEF.
It should give me this printout:
- nane: OTHER
  tags:
    - DIGIT
    - ABC
    - YES
My first try was something like this:
awk '/^- name:/ { if (found && value) {print value} found=value="" } { value=(value?value ORS:"")$0 } /ABC/ && !/DEF/ { found=1 } END { if (found && value) { print value } }' file
But the above attempt also prints the text-blocks that contain both patterns!
Thanks
CodePudding user response:
Using gnu-awk, you can split the file into records using the first - in each block:
awk -v RS='(^|\n)- ' '/- ABC/ && !/- DEF/ {printf "- %s", $0}' file
- nane: OTHER
  tags:
    - DIGIT
    - ABC
    - YES
Or to make it more precise:
awk -v RS='(^|\n)- ' '
/- ABC(\n|$)/ && !/- DEF(\n|$)/ {printf "- %s", $0}
' file
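For what it's worth, the attempt in the question tests /ABC/ && !/DEF/ against one line at a time, so the "- ABC" line alone sets found in every block that has it. If gawk is not at hand, something along these lines might work with a plain POSIX awk: collect each block and keep one flag per tag. This is only a sketch and not part of the answer above; the function name filter_blocks is made up, and it assumes the tag lines are indented while each block starts with "- " in column one.

```shell
#!/bin/sh
# filter_blocks: hypothetical helper; reads the file named in $1, or stdin.
# Assumes every block starts with "- " in column one and tags are indented.
filter_blocks() {
  awk '
    # Print the collected block only if ABC was seen and DEF was not.
    function flush() { if (has_abc && !has_def) printf "%s", block }

    /^- /    { flush(); block = ""; has_abc = has_def = 0 }  # a new block starts
             { block = block $0 ORS }                        # collect every line
    /- ABC$/ { has_abc = 1 }
    /- DEF$/ { has_def = 1 }
    END      { flush() }                                     # do not lose the last block
  ' "$@"
}

# Cut-down sample in the layout from the question (indentation assumed)
filter_blocks <<'EOF'
- name: Sedan
  tags:
    - ABC
    - DEF
- nane: OTHER
  tags:
    - DIGIT
    - ABC
    - YES
EOF
```

Run like this, only the OTHER block should come out, and on a real file it would be called as filter_blocks file.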
CodePudding user response:
I'm normally not a fan of multiple instances of awk/sed/grep in a pipeline, but this problem seems suited to it. First, insert blank lines as record separators. Then filter. Then remove the blank lines:
awk '/^-/{print ""} 1' input | awk '/ABC/ && !/DEF/' RS= | sed '/^$/d'
Some versions of awk allow a multi-character RS, but this pipeline seems simple enough to use with those implementations of awk that do not support that extension.
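To make the first two stages easier to follow, here they are split into two small functions and run on a tiny sample. Treat it as an illustration only: the function names and the two-block input are made up, and the indentation of the tag lines is assumed.

```shell
#!/bin/sh
# Stage 1: print an empty line before every line that starts a block.
mark_blocks() { awk '/^-/{print ""} 1'; }

# Stage 2: an empty RS puts awk in paragraph mode, so each run of lines
# between blank lines is one record and the patterns see the whole block.
keep_abc_without_def() { awk '/ABC/ && !/DEF/' RS=; }

# Hypothetical two-block sample: A has only ABC, B has ABC and DEF.
printf '%s\n' '- name: A' '  tags:' '    - ABC' \
              '- name: B' '  tags:' '    - ABC' '    - DEF' |
  mark_blocks | keep_abc_without_def
```

Only block A should survive. Note that the patterns are matched against the whole block, so a name containing ABC or DEF would also count.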
But it seems that a better solution would be to convert the YAML to JSON, then filter with jq, and then convert back to YAML.
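A rough sketch of the jq step, assuming the file has already been converted to a JSON array of objects that each carry a tags array. The conversion to and from YAML is left out because it depends on which converter is installed, and the function name filter_json is made up.

```shell
#!/bin/sh
# filter_json: hypothetical helper; reads a JSON array on stdin and keeps
# the objects whose "tags" array contains "ABC" and does not contain "DEF".
filter_json() {
  jq -c '[ .[] | select(any(.tags[]; . == "ABC") and all(.tags[]; . != "DEF")) ]'
}

# Hand-written JSON standing in for the converted YAML (an assumption)
printf '%s\n' '[{"name":"Sedan","tags":["DIGIT","ABC","DEF","YES"]},
                {"name":"OTHER","tags":["DIGIT","ABC","YES"]}]' | filter_json
```

Because the tags are compared as whole strings, a tag such as ABCD would not be mistaken for ABC, which is the main advantage over matching on the raw text.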