Computing the size of an array in a text file in bash

I have a text file that sometimes, but not always, contains an array with a unique name, like this:

unique_array=(1,2,3,4,5,6)

I would like to find the size of the array (6 in the above example) when it exists, and skip it or return -1 if it doesn't exist.

Grepping the file will tell me whether the array exists, but not how to find its size.

The array can span multiple lines, like this:

unique_array=(1,2,3,
   4,5,6,
7,8,9,10)

Some of the elements in the array can be negative, as in:

unique_array=(1,2,-3,
   4,5,6,
7,8,-9,10)

CodePudding user response：

awk -v RS=\) -F, '/unique_array=\(/ {print /[0-9]/?NF:0}' file.txt

-v RS=\) - delimit records by ) instead of newlines
-F, - delimit fields by , instead of whitespace
/unique_array=\(/ - look for a record containing the unique identifier
/[0-9]/?NF:0 - if the record contains a digit, print the number of fields (i.e. commas + 1), otherwise 0
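The question also asks for -1 when the array is missing. A minimal sketch of that, wrapping the same record-separator idea in a function: the names array_size_rs, found and sample.txt are made up for illustration, and it assumes no commas appear between the previous ) in the file and the array name, since every comma in the record is counted.

```shell
#!/usr/bin/env bash
# array_size_rs is a hypothetical helper name; "found" is an illustrative flag.
array_size_rs() {
    awk -v RS=')' -F, '
        # First record containing the identifier: print its size and stop reading.
        /unique_array=\(/ { found = 1; print (/[0-9]/ ? NF : 0); exit }
        # END still runs after exit; report -1 only if nothing matched.
        END { if (!found) print -1 }
    ' "$1"
}

# Sample input (file name chosen for illustration only).
printf '%s\n' 'foo' 'unique_array=(1,2,-3,' '   4,5,6,' '7,8,-9,10)' 'bar' > sample.txt
array_size_rs sample.txt   # prints 10
```

The exit after the first match keeps later parentheses in the file from producing extra output.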

CodePudding user response：

Your specifications are woefully incomplete, but guessing a bit as to what you are actually looking for, try this at least as a starting point.

awk '/^unique_array=\(/ { in_array = 1; n = 0; sub(/^unique_array=\(/, "") }
    in_array && /\)/ { sub(/\).*/, ""); quit = 1 }
    in_array { gsub(/[ \t]/, ""); sub(/,$/, "");  # drop blanks and the trailing comma
      if ($0 != "") n += split($0, arr, ",");
      if (quit) { print n; in_array = quit = n = 0 } }' file

We keep a state variable in_array which tells us whether we are currently in a region which contains the array. This gets set to 1 when we see the beginning of the array, and back to 0 when we see the closing parenthesis. At this point, we remove the closing parenthesis and everything after it, and set a second variable quit to trigger the finishing logic in the next condition. The last condition performs two tasks; it adds the items from this line to the count in n, and then checks if quit is true; if it is, we are at the end of the array, and print the number of elements.

This will simply print nothing if the array was not found. You could embellish the script to set a different exit code or print -1 if you like, but these details seem like unnecessary complications for a simple script.
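For what it's worth, the embellishment mentioned above could be sketched roughly like this. The function name count_unique_array, the found flag and the sample file name are my own inventions; the state machine is the one described above, with an END block that prints -1 and sets a non-zero exit status when the array never shows up. It assumes the array starts at the beginning of a line.

```shell
#!/usr/bin/env bash
# count_unique_array is a hypothetical helper name, not part of the answer above.
# It prints the element count, or prints -1 and returns status 1 if no array is found.
count_unique_array() {
    awk '
        /^unique_array=\(/ { in_array = 1; sub(/^unique_array=\(/, "") }
        in_array && /\)/   { sub(/\).*/, ""); quit = 1 }
        in_array {
            gsub(/[ \t]/, "")   # drop indentation and stray blanks
            sub(/,$/, "")       # drop the comma that ends a continued line
            if ($0 != "") n += split($0, arr, ",")
            if (quit) { found = 1; print n + 0; exit }
        }
        END { if (!found) { print -1; exit 1 } }
    ' "$1"
}

# Sample input (file name chosen for illustration only).
printf '%s\n' 'unique_array=(1,2,-3,' '   4,5,6,' '7,8,-9,10)' > multi.txt
count_unique_array multi.txt || echo "unique_array not found" >&2
```

A caller can then branch on the exit status instead of parsing the -1.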

CodePudding user response：

With GNU grep or similar that support -z and -o options:

grep -zo 'unique_array=([^)]*)' file.txt | tr -dc =, | wc -c

-z - (effectively) treat file as a single line
-o - only output the match
tr -dc =, - strip everything except = and ,
wc -c - count the result

Note: both one- and zero-element arrays will be treated as being size 1. It will print 0 rather than -1 if the array is not found.
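If those two caveats matter, one possible workaround is to capture the match first and inspect it before counting. This is only a sketch: grep_array_size and its local variable names are hypothetical, and it still relies on a grep that supports -z and -o and on the array appearing at most once in the file.

```shell
#!/usr/bin/env bash
# grep_array_size is a hypothetical helper name used for illustration.
grep_array_size() {
    local match body commas
    # -z treats the file as one record; tr then removes the NUL terminator and all whitespace.
    match=$(grep -zo 'unique_array=([^)]*)' "$1" | tr -d '\000 \t\n')
    if [ -z "$match" ]; then echo -1; return; fi   # array not found
    body=${match#*\(}    # strip everything up to and including "("
    body=${body%\)}      # strip the closing ")"
    if [ -z "$body" ]; then echo 0; return; fi     # empty array
    # Size is the number of commas plus one.
    commas=$(printf '%s' "$body" | tr -dc ',' | wc -c)
    echo $((commas + 1))
}

# Sample input (file name chosen for illustration only).
printf '%s\n' 'unique_array=(1,2,-3,' '   4,5,6,' '7,8,-9,10)' > grep_multi.txt
grep_array_size grep_multi.txt   # prints 10
```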

CodePudding user response：

This assumes that the array is always called "unique_array".

Note: replace array.txt with the filename

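if ! grep -q 'unique_array=(' array.txt; then
    echo -1
else
    # Join the file into one line, then keep only the text between "unique_array=(" and the next ")"
    body=$(tr -d ' \t\n' < array.txt | sed 's/.*unique_array=(\([^)]*\)).*/\1/')
    # Split the body on commas into a bash array
    IFS=, read -r -a unique_array <<< "$body"
    echo "${#unique_array[@]}"
fi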

CodePudding user response：

Using sed and declare -a. The test file is like this:

$ cat f
saa

dfsaf

sdgdsag unique_array=(1,2,3,
   4,5,6,
7,8,9,10) sdfgadfg

sdgs
sdgs
sfsaf(sdg)

Testing:

$ declare -a "$(sed -n '/unique_array=(/,/)/s/,/ /gp' f | \
                sed 's/.*\(unique_array\)/\1/;s/).*/)/')"

$ echo ${unique_array[@]}
1 2 3 4 5 6 7 8 9 10

You can then do whatever you want with ${unique_array[@]}; its size is ${#unique_array[@]}.
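To get back to the original question, here is a rough sketch that turns the same sed pipeline into a size lookup. The function name sed_array_size and the file name f_demo are made up. Since the text handed to declare comes straight from the file, the sketch first checks that it contains nothing but digits, minus signs and whitespace between the parentheses, and prints -1 otherwise.

```shell
#!/usr/bin/env bash
# sed_array_size is a hypothetical helper name used for illustration.
sed_array_size() {
    local assignment re
    local -a unique_array=()
    assignment=$(sed -n '/unique_array=(/,/)/s/,/ /gp' "$1" |
                 sed 's/.*\(unique_array\)/\1/;s/).*/)/')
    # Nothing extracted: the array is not in the file.
    if [ -z "$assignment" ]; then echo -1; return; fi
    # Only accept integers separated by whitespace before evaluating the text.
    re='^unique_array=\([-0-9[:space:]]*\)$'
    if ! [[ $assignment =~ $re ]]; then echo -1; return; fi
    declare -a "$assignment"
    echo "${#unique_array[@]}"
}

# Same content as the test file above, under an illustrative name.
printf '%s\n' 'saa' '' 'dfsaf' '' 'sdgdsag unique_array=(1,2,3,' '   4,5,6,' \
    '7,8,9,10) sdfgadfg' '' 'sdgs' 'sdgs' 'sfsaf(sdg)' > f_demo
sed_array_size f_demo   # prints 10
```

One limitation carried over from the sed range: the closing ) is only looked for on lines after the one that opens the array, so a single-line array followed by other lines containing commas will not be extracted cleanly.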