Search field and display next data to it

Is there an easy way to look up a specific field in the following data, keyed on the @@id: field?

This is the sample data file called sample

@@id: 123 @@name: John Doe @@age: 18 @@Gender: Male

@@id: 345 @@name: Sarah Benson @@age: 20 @@Gender: Female

For example, suppose I want to look up ID 123 and print that person's gender.

Basically, this is the prototype that I want:

#!/bin/bash
# search.sh

# usage: search.sh <id> <field>
# eg: search.sh 123 age

search="$1"

field="$2"

grep "^@@id: ${search}" sample | # FILTER <FIELD>

So when I search for ID 123 like this:

search.sh 123 gender

The output would be

Male

So far, with the code above, I can only grep the one line that matches the ID. I'm not sure what the simplest or fastest way is to get the value that follows the requested field (e.g. age).

CodePudding user response：

1st solution: With your shown samples, please try the following bash script. It assumes you want an exact (case-sensitive) string match.

cat script.bash
#!/bin/bash

search="$1"
field="$2"

awk -v search="$search" -v field="$field" '
match($0,"@@id:[[:space:]]*"search){
  value=""
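  match($0,"@@"field":[[:space:]]*[^@]+")   # locate "@@<field>: <value>" up to the next "@"
  value=substr($0,RSTART,RLENGTH)
  sub(/.*: +/,"",value)                     # drop the "@@<field>: " label
  sub(/[[:space:]]+$/,"",value)             # drop trailing white space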
  print value
}
'  Input_file

2nd solution: If you want to match the strings (values) on each line regardless of case (lower/upper), try the following code.

cat script.bash
#!/bin/bash

search="$1"
field="$2"

awk -v search="$search" -v field="$field" '
match(tolower($0),"@@id:[[:space:]]*"tolower(search)){
  value=""
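  match(tolower($0),"@@"tolower(field)":[[:space:]]*[^@]+")   # case-insensitive field lookup
  value=substr($0,RSTART,RLENGTH)                             # take the span from the original line
  sub(/.*: +/,"",value)                                       # drop the "@@<field>: " label
  sub(/[[:space:]]+$/,"",value)                               # drop trailing white space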
  print value
}
'  Input_file

Explanation: The bash script expects two parameters when it is run and passes them to the awk program as variables. awk then uses the match function to find the id on each line and prints the value of the requested field (e.g. name or Gender).
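
The steps in that explanation can be sketched as a small shell function. This is only an illustration under a few assumptions: the name lookup_field, the argument check, and the third (file) argument are hypothetical and not part of the answer, and [ \t] is used in place of [[:space:]] so that older awk builds accept it. Like the 1st solution, it matches the field name case-sensitively and selects any id that begins with the requested value.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer): print one field for one id.
# usage: lookup_field <id> <field> <file>
lookup_field() {
  if [ "$#" -ne 3 ]; then
    echo "usage: lookup_field <id> <field> <file>" >&2
    return 2
  fi
  awk -v search="$1" -v field="$2" '
    # Select lines whose id begins with the requested value
    # (so a search for 12 would also select id 123).
    match($0, "@@id:[ \t]*" search) {
      # Locate "@@<field>: <value>"; RSTART and RLENGTH describe the matched span.
      if (match($0, "@@" field ":[ \t]*[^@]+")) {
        value = substr($0, RSTART, RLENGTH)
        sub(/^[^:]*:[ \t]*/, "", value)   # drop the "@@<field>:" label
        sub(/[ \t]+$/, "", value)         # drop trailing blanks
        print value
      }
    }
  ' "$3"
}
```

With the sample file from the question, lookup_field 123 Gender sample prints Male, and an unknown field prints nothing rather than an empty line.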

CodePudding user response：

Since you want to extract a part of each line found, different from the part you are matching against, sed or awk would be a better tool than grep. You could pipe the output of grep into one of the others, but that's wasteful because both sed and awk can do the line selection directly. I would do something like this:

#!/bin/bash

search="$1"
field="$2"

sed -n "/^@@id: ${search}"'\>/ { s/.*@@'"${field}"': *//i; s/ *@@.*//; p }' sample

Explanation:

sed is instructed to read file sample, which it will do line by line.
The -n option tells sed to suppress its usual behavior of automatically outputting its pattern space at the end of each cycle, which is an easy way to filter out lines that don't match the search criterion.
The sed expression starts with an address, which in this case is a pattern matching lines by id, according to the script's first argument. It is much like your grep pattern, but I append \>, which matches a word boundary. That way, searches for id 123 will not also match id 1234.
The rest of the sed expression edits out everything in the line except the value of the requested field, with the field name being matched case-insensitively, and prints the result. The editing is accomplished by the two s/// commands, and the p command is of course for "print". These are all enclosed in curly braces ({}) and separated by semicolons (;) to form a single compound command associated with the given address.
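
To try the word-boundary behaviour described above, the same sed command can be wrapped in a function that takes the file as an argument. This is a sketch, not part of the answer: the name sed_lookup and the file parameter are hypothetical, and it assumes GNU sed, because both \> and the i flag on s/// are GNU extensions.

```shell
#!/bin/bash
# Hypothetical wrapper around the sed command above; assumes GNU sed.
# usage: sed_lookup <id> <field> <file>
sed_lookup() {
  # Address: lines starting with "@@id: <id>" followed by a word boundary.
  # 1st s///: delete everything up to and including "@@<field>: " (case-insensitive).
  # 2nd s///: delete the next "@@..." label and everything after it.
  sed -n "/^@@id: $1"'\>/ { s/.*@@'"$2"': *//i; s/ *@@.*//; p }' "$3"
}
```

If the file contained one line for id 123 and another for id 1234, sed_lookup 123 gender sample would print only the value from the 123 line, since the 4 that follows 123 is a word character and so no boundary is found there.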

CodePudding user response：

Assumptions:

'label' fields have format @@<string>:
need to handle case-insensitive searches
'label' fields could be located anywhere in the line (ie, there is no set ordering of 'label' fields)
the 1st input search parameter is always a value associated with the @@id: label
the 2nd input search parameter is to be matched as a whole word (ie, no partial label matching; nam will not match against @@name:)
if there are multiple 'label' fields that match the 2nd input search parameter, we print the value associated with the 1st match found in the line

One awk idea:

awk -v search="${search}" -v field="${field}" '
BEGIN    { field = tolower(field) }
         { n=split($0,arr,"@@|:")                             # split current line on dual delimiters "@@" and ":", place fields into array arr[]

           found_search = 0
           found_field  = 0

           for (i=2;i<=n;i=i+2) {                             # loop through list of label fields
               label=tolower(arr[i])
               value = arr[i+1]
               sub(/^[[:space:]]+/,"",value)                  # strip leading white space
               sub(/[[:space:]]+$/,"",value)                  # strip trailing white space

               if ( label == "id"  &&   value == search ) 
                  found_search = 1
               if ( label == field && ! found_field )
                  found_field  = value
           }

           if ( found_search && found_field )
              print found_field
         }
' sample

Sample input:

$ cat sample
@@id: 123 @@name: John Doe @@age: 18 @@Gender: Male
@@id: 345 @@name: Sarah Benson @@age: 20 @@Gender: Female
@@name:    Archibald P. Granite, III, Ph.D, M.D.    @@age: 20 @@Gender: not specified @@id: 567

Test runs:

search=123 field=gender  => Male
search=123 field=ID      => 123
search=123 field=Age     => 18
search=345 field=name    => Sarah Benson
search=567 field=name    => Archibald P. Granite, III, Ph.D, M.D.
search=567 field=GENDER  => not specified
search=999 field=age     => <no output>
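
The awk idea above reads search and field from shell variables, so it still needs a wrapper to be called the way the question describes. One possible sketch follows. The function name search_field, the optional file argument, and the non-zero exit status when nothing matches are all assumptions added for illustration. It also tracks "field seen" in a separate flag, so a value such as 0 or an empty string is still printed, and uses [ \t] rather than [[:space:]] for older awk builds.

```shell
#!/bin/bash
# Hypothetical wrapper; the name and exit-status convention are assumptions.
# usage: search_field <id> <field> [file]   (file defaults to "sample")
search_field() {
  awk -v search="$1" -v field="$2" '
    BEGIN { field = tolower(field) }
    {
      # After the split, arr[2], arr[4], ... hold labels and
      # arr[3], arr[5], ... hold the corresponding values.
      n = split($0, arr, /@@|:/)
      id_ok = 0; have = 0; result = ""
      for (i = 2; i < n; i += 2) {
        label = tolower(arr[i])
        value = arr[i + 1]
        gsub(/^[ \t]+|[ \t]+$/, "", value)      # trim both ends
        if (label == "id" && value == search) id_ok = 1
        if (label == field && !have) { result = value; have = 1 }
      }
      if (id_ok && have) { print result; hits++ }
    }
    END { exit (hits ? 0 : 1) }                 # status 1 when nothing was printed
  ' "${3:-sample}"
}
```

Called as search_field 123 gender in a directory containing the sample file, this prints Male; search_field 999 age prints nothing and returns status 1, which a calling script can test.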