How to extract a value by searching for two words on different lines and getting the value of the second

Time:01-12

How can I search for a word and, once it is found, save a specific value from the next line in a variable?

The JSON below is only a small part of the file.

Because this file's JSON structure is inconsistent and subject to change over time, this needs to be done with a text-search tool such as grep, sed, or awk.

However, the parameters below will always be the same:

  1. Search for the word next.
  2. Get the line below it.
  3. Extract everything after page_token= up to, but not including, the closing ".
  4. Store it in a variable to be used later.

test.txt:
"link": [
    {
      "relation": "search",
      "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
    },
    {
      "relation": "next",
      "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_@212absa23bababa121212121212121"
    },
]

So the desired output in this case is:

PAGE_TOKEN="121_%_@212absa23bababa121212121212121"

My attempt:

PAGE_TOKEN=$(cat test.txt| grep "next" | sed 's/^.*: *//;q')

No luck.
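
A likely reason the attempt fails: grep "next" passes only the "relation": "next" line to sed, so the url line that follows is never seen and the variable ends up holding "next", instead of the token. The sketch below shows one possible way around that, assuming a grep that supports -A (GNU and BSD grep both do). The temporary file is only there to make the example self-contained.

```shell
#!/bin/sh
# Recreate the sample from the question in a temporary file (illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
"link": [
    {
      "relation": "search",
      "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
    },
    {
      "relation": "next",
      "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_@212absa23bababa121212121212121"
    },
]
EOF

# -A1 makes grep print the matching line plus the one line after it.
# sed -n then prints only the line where the substitution succeeded,
# keeping what sits between "page_token=" and the closing quote.
PAGE_TOKEN=$(grep -A1 '"next"' "$sample" | sed -n 's/.*page_token=\([^"]*\).*/\1/p')

rm -f "$sample"
printf 'PAGE_TOKEN="%s"\n' "$PAGE_TOKEN"
```

Matching on '"next"' with the quotes included is a small precaution so that a url that happens to contain the word next does not trigger the search.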

CodePudding user response:

Presuming your input is valid JSON, one option is to use jq:

cat test.json
[{
        "relation": "search",
        "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
    },
    {
        "relation": "next",
        "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_@212absa23bababa121212121212121"
    }
]

PAGE_TOKEN=$(cat test.json | jq -r '.[] | select(.relation=="next") | .url | gsub(".*=";"")')
echo "$PAGE_TOKEN"
121_%_@212absa23bababa121212121212121

CodePudding user response:

This might work for you (GNU sed):

sed -En '/next/{n;s/.*(page_token=)([^"]*).*/\U\1\E"\2"/p}' file

This is essentially a filtering operation, hence the use of the -n option.

Find a line containing next, fetch the next line, format as required and print the result.
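
The command above prints the whole PAGE_TOKEN="..." line. If the goal is to store only the token in a shell variable, as the question asks, a variation along these lines might do. It is a sketch, assuming sed with -E support and the sample layout from the question; the temporary file is hypothetical and only makes the example runnable on its own.

```shell
#!/bin/sh
# Recreate the sample from the question in a temporary file (illustration only).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
"link": [
    {
      "relation": "search",
      "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
    },
    {
      "relation": "next",
      "url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_@212absa23bababa121212121212121"
    },
]
EOF

# /"next"/  find the line holding the "next" relation
# n         replace the pattern space with the following line (the url line)
# s///p     keep only the text between "page_token=" and the closing quote, then print it
# q         stop after the first match
PAGE_TOKEN=$(sed -En '/"next"/{n;s/.*page_token=([^"]*).*/\1/p;q}' "$tmpfile")

rm -f "$tmpfile"
printf 'PAGE_TOKEN="%s"\n' "$PAGE_TOKEN"
```

Capturing just the value with command substitution avoids having to eval the printed assignment, which would be risky with a token containing characters such as % or @.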
