Collect the data (two parameters) between two keywords (variable string) from ini file

I have a txt.ini file with content (I cannot modify the structure of this file):

[person_0:public]
name=john
groups=0,1,2
age=30

[person_0:private]
married=false
weight=190
height=100

[person_1:public]
name=mark
groups=0,4
age=28

[person_1:private]
married=false
weight=173
height=70

[person_2:public]
name=tony
groups=3,4
age=30

[person_3:private]
married=true
weight=202
height=120

I have a variable "person" that takes one of the values person_0, person_1, person_2 in a loop, and I would like to collect each person's data, such as age and groups, one person at a time.

My idea is to cut out the part between $person:public and $person:private and then collect the values from it,

e.g. with person=person_1 the output should be: groups=0,4 age=28

I prepared the code in bash (persons is a list of person_0, person_1, person_2):

for person in ${persons[@]}; do
    file="txt.ini"
    echo "$person"
    a=$(awk -v a=$person":private" -v b=$person":public" '/a/{found=0} {if(found) print} /b/{found=1}' $file)

    IFS=$'\n' lines=($a)
    IFS='=' read grouplist grouplist_values <<< ${lines[1]}
    IFS='=' read age age_values <<< ${lines[4]}
    echo "Group list = $grouplist_values"
    echo "Age = $age_values"
done

Group list and age are empty. Output:

person_0
Group list =
Age =

person_1
Group list =
Age =

person_2
Group list =
Age =

Expected:

person_0
Group list =0,1,2
Age =30

person_1
Group list =0,4
Age =28

person_2
Group list =3,4
Age =30

I will use this data "per person" in another part of my code. I'm working on files with different number of "persons".

I really don't know what is wrong.

I tried also:

#export person="person_0"
#a=$(awk '/ENVIRON["person"]:private/{found=0} {if(found) print} /ENVIRON["person"]:public/{found=1}' $file)

and

private=$person":private"
public=$person":public"
echo "private=$private"
echo "public=$public"
a=$(awk -v a="$private" -v b="$public" '/a/{found=0} {if(found) print} /b/{found=1}' $config_file)

but output was the same:

person_0
private=person_0:private
public=person_0:public
Group list =
Age =

What is strange to me is that when I hardcode the range to cut, it works properly:

a=$(awk '/person_0:private/{found=0} {if(found) print} /person_0:public/{found=1}' $file)

a=$(awk '/person_1:private/{found=0} {if(found) print} /person_1:public/{found=1}' $file)

Do you have any idea how I can collect this data in a clever way?

CodePudding user response：

Assumptions:

for a given person (eg, person_0) display said person along with the associated (public) fields for groups and age
no indication has been given of what we're supposed to do with this data, so assume, for now, we just need to print to stdout
list of persons to process is in bash array persons[]
the strings :public and :private only show up in the block headers

One awk idea where we use the split() function to parse a line based on different delimiters:

awk '
FNR==NR    { persons[$1]
             next
           }
/:private/ { printme=0 }
/:public/  { printme=0

             split($1,arr,"[]:[]")
             person=arr[2]

             if (person in persons) {
                printme=1
                printf "%s%s\n", pfx, person
                pfx="\n"
             }
           }
printme    { split($1,arr,"=")
             if (arr[1] == "groups") print "Group list =" arr[2]
             if (arr[1] == "age")    print "Age ="        arr[2]
           }
' <(printf "%s\n" "${persons[@]}") txt.ini
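
To see what the split() call above does to a block header, here is a minimal sketch. The sample header comes from the question; the shell variable names header and parts are only illustrative. The separator "[]:[]" is a bracket expression matching any one of ], : or [, so the leading [ and the trailing ] each produce an empty element.

```shell
header='[person_0:public]'

# split() returns the number of elements; arr[2] holds the person name
parts=$(printf '%s\n' "$header" | awk '{
    n = split($1, arr, "[]:[]")
    printf "%d|%s|%s|%s|%s\n", n, arr[1], arr[2], arr[3], arr[4]
}')

echo "$parts"    # fields are joined with | to make the empty ones visible
```

This is why the code reads the person from arr[2] rather than arr[1].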

A variation on this theme using a multi-character input field delimiter:

awk -F"[]:=[]" '
FNR==NR       { persons[$1]
                next
              }
$3=="private" { printme=0 }
$3=="public"  { printme=0
                if ($2 in persons) {
                   printme=1
                   printf "%s%s\n", pfx, $2
                   pfx="\n"
                }
              }
printme && $1=="groups" { print "Group list =" $2 }
printme && $1=="age"    { print "Age ="        $2 }
' <(printf "%s\n" "${persons[@]}") txt.ini

With:

$ typeset -p persons
declare -a persons=([0]="person_0" [1]="person_1" [2]="person_2")

Both sets of awk code generate:

person_0
Group list =0,1,2
Age =30

person_1
Group list =0,4
Age =28

person_2
Group list =3,4
Age =30

NOTE: this could be made more dynamic (public and/or private? different fields?) but that'll entail a bit more coding

CodePudding user response：

Would you please try the following:

awk -v RS='' '                          # split the records on the blank lines
/public/ {                              # "public" record
    split($1, a, /[\[:]/); print a[2]   # extract the "person_xx" substring
    for (i = 2; i <= NF; i++) {         # iterate over the lines of the record
        split($i, a, /=/)
        if (a[1] == "groups") print "Group list =" a[2]
        else if (a[1] == "age") print "Age =" a[2]
    }
    print ""                            # insert a blank line
}
' txt.ini

Output:

person_0
Group list =0,1,2
Age =30

person_1
Group list =0,4
Age =28

person_2
Group list =3,4
Age =30

By setting awk variable RS to the null string, the records are separated by blank lines and the fields are separated by the newline character.
Assuming the desired data are included in the public block, we can parse the fields of the public record one by one.
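
A small sketch of this paragraph mode, using a shortened copy of the question's data fed in with printf (the variable name record_info is only illustrative). Each blank-line-separated block becomes one record, and since the lines contain no spaces, each line becomes one field:

```shell
record_info=$(printf '[person_0:public]\nname=john\nage=30\n\n[person_0:private]\nmarried=false\n' |
    awk -v RS='' '{ print NR ": " NF " fields, first=" $1 }')

echo "$record_info"
```

The header line is always $1, which is why the loops above start at i = 2.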

[Edit]
According to the OP's comment, here is the updated version:

#!/bin/bash

persons=("person_0")                            # list of desired person(s)
for person in "${persons[@]}"; do               # loop over the bash array
    awk -v RS='' -v person="$person" '          # assign awk variables
    $1 ~ person ":public" {                     # "public" record of the person
        split($1, a, /[\[:]/); print a[2]       # extract the "person_xx" substring
        for (i = 2; i <= NF; i++) {             # iterate over the lines of the record
            split($i, a, /=/)
            if (a[1] == "groups") print "Group list =" a[2]
            else if (a[1] == "age") print "Age =" a[2]
        }
    }
    ' txt.ini
    echo                                        # insert a blank line
done

You can put whichever persons you want in the persons array.
The pattern $1 ~ person ":public" tests if the 1st field of the record $1 (e.g. [person_0:public]) matches the awk variable person (passed with the -v option) followed by a string ":public".
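
This is also the likely reason the patterns /a/ and /b/ in the question did not behave as intended: text between slashes is a regex constant, so awk looks for the literal letters and never expands a variable of that name. A minimal sketch of the difference, assuming a shortened copy of the data (the names tag, literal, dynamic and exact are hypothetical):

```shell
person="person_1"
ini=$(printf '[person_0:public]\nage=30\n\n[person_1:public]\nage=28\n')

# Regex constant: /tag/ searches for the three characters "tag", so nothing matches
literal=$(printf '%s\n' "$ini" | awk -v tag="$person:public" '/tag/')

# Dynamic regex: the value of the variable tag is used as the pattern
dynamic=$(printf '%s\n' "$ini" | awk -v tag="$person:public" '$0 ~ tag')

# String comparison: no regex involved, the whole line must equal the header
exact=$(printf '%s\n' "$ini" | awk -v tag="[$person:public]" '$0 == tag')

echo "literal='$literal' dynamic='$dynamic' exact='$exact'"
```

The string comparison is the stricter choice if person names could ever contain regex metacharacters.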

Note that the loop makes awk read the txt.ini file once per element of the persons array. If the txt.ini file is long and/or the persons array has many elements, the loop will be inefficient. Here is another variant:

#!/bin/bash

persons=("person_0" "person_1")         # bash array just for an example
awk -v RS='' -v persons_list="${persons[*]}" '
                                        # persons_list is a blank separated list of persons
BEGIN {
    split(persons_list, a)              # split persons_list back to an array
    for (i in a) persons[a[i]]          # create a new array indexed by person
}
/public/ {                              # "public" record
    split($1, a, /[\[:]/)               # extract the "person_xx" substring
    if (a[2] in persons) {              # if the person exists in the list
        print a[2]
        for (i = 2; i <= NF; i++) {     # iterate over the lines of the record
            split($i, a, /=/)
            if (a[1] == "groups") print "Group list =" a[2]
            else if (a[1] == "age") print "Age =" a[2]
        }
        print ""                        # insert a blank line
    }
}
' txt.ini

Please note that this assumes the person strings do not contain whitespace characters. If they do, use a delimiter that does not occur in the names, such as a comma, when assigning persons_list.
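
A minimal sketch of the comma-delimited variant, assuming bash and names that contain no commas. The variable persons_csv and the checks printed at the end are only illustrative:

```shell
persons=("person_0" "person_1")

# Join the array with commas; IFS is changed only inside the command substitution
persons_csv=$(IFS=,; printf '%s' "${persons[*]}")

found=$(awk -v persons_list="$persons_csv" 'BEGIN {
    n = split(persons_list, a, ",")          # split on commas instead of whitespace
    for (i = 1; i <= n; i++) persons[a[i]]   # create an array indexed by person
    has1 = ("person_1" in persons)
    has2 = ("person_2" in persons)
    print n, has1, has2
}')

echo "$found"    # element count, then 1/0 for person_1 and person_2 membership
```

The rest of the script stays the same; only the split() call in the BEGIN block needs the explicit "," separator.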