jq - Use regex in the json in test function

Time:11-22

I have some JSON that contains regexes:

[
  {
     "name": "For teachers",
     "regex": "^Apple "
  },
  {
     "name": "Long and yellow",
     "regex": "banana$"
  },
  {
     "name": "Cantaloupe",
     "regex": ".*/melon/.*"
  }
]

What I want to do is use the .regex value in the test function, e.g.:

>>> jq '. | select( "path/to/melon/data" | test( .regex ) )' test.json
jq: error (at test.json:14): Cannot index string with string "regex"

I am trying to check whether a string, passed in by any means, matches any of the .regex values in the JSON and, if it does, return the corresponding .name.

In the above test.json, passing a string => output:

  • starting with "Apple " => "For teachers"
  • ending with "banana" => "Long and yellow"
  • containing the text "/melon/" => "Cantaloupe"

If there are multiple matches, then return all the .name values whose .regex matches the passed-in string. So, from the comments: "Apple or Is this /melon/ a banana" => [ "For teachers", "Long and yellow", "Cantaloupe" ]


I was considering trying something like building a sed command but I have not got that far and I think adding what I had was causing confusion rather than clarifying. Leaving it here so the comments make sense.

>>> echo "path/to/melon/data" | sed -E -e 's#^Apple #For teachers #g' -e 's#banana$#long and yellow#g' -e 's#.*/melon/.*#Cantaloupe#g'
Cantaloupe
>>> echo "Is this banana" | sed -E -e 's#^Apple #For teachers#g' -e 's#banana$#long and yellow#g' -e 's#.*/melon/.*#Cantaloupe#g'
Is this long and yellow

I want behaviour similar to this sed, but without needing to construct lots of -e options from jq output.

I am sure I can get something like that to work, so every time the echoed in string matches a .regex it returns the corresponding .name but that is such a hack ... even for me! (Note: This sed is not doing what I want except in the melon case, because it is replacing the text matched rather than responding with the text)

CodePudding user response:

You could use the --raw-input (or -R) option to read in the string, and the --argfile option to read in the regex JSON file:

echo "path/to/melon/data" |
jq -Rr --argfile rs regex.json '
  $rs[] as $r
  | if test($r.regex) then $r.name else empty end
'
Cantaloupe
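
The error in the question comes from the fact that inside `"path/to/melon/data" | test(.regex)` the `.` is already the string, not the object, so `.regex` cannot be looked up. As a minimal sketch of an alternative (not the only way to do it), the string can be passed with --arg and `.regex` bound to a variable before the context changes. The helper name `match_names` and the temporary file are assumptions made up for this illustration; the JSON is the question's test.json.

```shell
# Recreate the question's test.json in a temporary file so the sketch runs on its own.
regex_file=$(mktemp)
cat > "$regex_file" <<'EOF'
[
  {"name": "For teachers", "regex": "^Apple "},
  {"name": "Long and yellow", "regex": "banana$"},
  {"name": "Cantaloupe", "regex": ".*/melon/.*"}
]
EOF

# match_names is a hypothetical helper; $1 is the string to check.
# .regex is bound to $re while "." is still the object, then the string
# supplied via --arg is piped into test($re).
match_names() {
  jq -c --arg s "$1" \
    '[ .[] | select(.regex as $re | $s | test($re)) | .name ]' "$regex_file"
}

match_names "path/to/melon/data"
match_names "Apple or Is this /melon/ a banana"
```

With the data above, this should print `["Cantaloupe"]` and then `["For teachers","Long and yellow","Cantaloupe"]`, i.e. all matching names collected into one array.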

If the string is contained somewhere in your input JSON, you obviously don't need the --raw-input (or -R) option:

jq -r --argfile rs regex.json '
  ...traverse to the string... | $rs[] as $r
  | if test($r.regex) then $r.name else empty end
' input.json
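
To make the second form concrete, here is a sketch with a made-up input document. The `.path` field, the helper name `names_for_path` and the temporary files are assumptions for illustration only; your real traversal will differ. It also uses --slurpfile instead of --argfile, which as far as I know is what the jq manual recommends; --slurpfile wraps the file's contents in an array, hence `$rs[0][]`.

```shell
# Temporary stand-ins for regex.json and input.json.
regex_file=$(mktemp)
input_file=$(mktemp)
cat > "$regex_file" <<'EOF'
[
  {"name": "For teachers", "regex": "^Apple "},
  {"name": "Long and yellow", "regex": "banana$"},
  {"name": "Cantaloupe", "regex": ".*/melon/.*"}
]
EOF
# Hypothetical input: the .path field is an assumption, not from the question.
cat > "$input_file" <<'EOF'
{"path": "path/to/melon/data"}
EOF

# names_for_path is a hypothetical helper: traverse to the string first,
# then test it against every regex and emit the names that match.
names_for_path() {
  jq -r --slurpfile rs "$regex_file" '
    .path | $rs[0][] as $r
    | if test($r.regex) then $r.name else empty end
  ' "$input_file"
}

names_for_path
```

For this input, only the melon regex matches, so the output should be `Cantaloupe`.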

CodePudding user response:

It sounds like you want to apply your array of regexes in turn to your input string. In other words: reducing your input by aggregating the results of a substitution operation.

reduce $regex[0][] as $re (.; gsub($re.regex; $re.name))

Provide $regex via --slurpfile and make sure to read raw input (-R) and write raw output (-r):

$ echo "Is this banana" | jq -Rr --slurpfile regex test.json 'reduce $regex[0][] as $re (.; gsub($re.regex; $re.name))'
Is this Long and yellow

All substitutions are applied:

$ echo "Apple or Is this melon a banana" | jq -Rr --slurpfile regex test.json 'reduce $regex[0][] as $re (.; gsub($re.regex; $re.name))'
For teachersor Is this melon a Long and yellow

Substitutions are also applied to strings that have already been substituted. So if you had "^a"=>"x" and "x"=>"y", then the input "abc" would end up as "ybc":

$ echo "Apple or Is this /melon/ a banana" | jq -Rr --slurpfile regex test.json 'reduce $regex[0][] as $re (.; gsub($re.regex; $re.name))'
Cantaloupe
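
The "^a"=>"x", "x"=>"y" chain mentioned above can be checked in isolation. This is only a sketch: the two rules are written inline instead of being read from a file, and `chain_subst` is a hypothetical helper name.

```shell
# chain_subst is a hypothetical helper: it reads one raw line from stdin and
# applies the two rules in order, each one on the result of the previous one.
chain_subst() {
  jq -Rr '
    reduce ({regex: "^a", name: "x"}, {regex: "x", name: "y"}) as $re
      (.; gsub($re.regex; $re.name))
  '
}

# "abc" becomes "xbc" after the first rule, and "ybc" after the second.
echo "abc" | chain_subst
```

So the order of the entries in the regex file matters with this approach: an earlier replacement can create text that a later regex then matches.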

And if you only want to print the first possible substitution, the following could work (although I'm sure there's a smarter way; this looks way too convoluted):

$ echo "Apple or Is this /melon/ a banana" | jq -Rr --slurpfile regex test.json '
. as $in 
| $regex[0] 
| map(
  . as $re
  | $in
  | select(test($re.regex))
  | gsub($re.regex; $re.name)
) 
| first
'
For teachersor Is this /melon/ a banana
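
One possibly less convoluted variant, as a sketch only: jq's `first(f)` takes a generator and stops after its first output, so the intermediate array from `map` is not needed. The helper name `first_subst` and the temporary file are assumptions for illustration. Note one behavioural difference: if no regex matches, this version prints nothing, whereas `map(...) | first` yields null.

```shell
# Recreate the question's test.json in a temporary file so the sketch runs on its own.
regex_file=$(mktemp)
cat > "$regex_file" <<'EOF'
[
  {"name": "For teachers", "regex": "^Apple "},
  {"name": "Long and yellow", "regex": "banana$"},
  {"name": "Cantaloupe", "regex": ".*/melon/.*"}
]
EOF

# first_subst is a hypothetical helper: it reads one raw line from stdin and
# applies only the first regex (in file order) that matches it.
first_subst() {
  jq -Rr --slurpfile regex "$regex_file" '
    . as $in
    | first(
        $regex[0][] as $re
        | $in
        | select(test($re.regex))
        | gsub($re.regex; $re.name)
      )
  '
}

echo "Apple or Is this /melon/ a banana" | first_subst
echo "Is this banana" | first_subst
```

With the data above, this should print `For teachersor Is this /melon/ a banana` and then `Is this Long and yellow`.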