read non-constant number of files in bash (multiple files in bash)

Time:03-23

I have an unknown number of log/text files that I need to read, summing a value from each of them in a Bash script. The problem is my variables reset after each iteration in a loop. I saw a similar problem, but with a fixed number of files (only two).

I managed to find the number of files with ls, and I read each file using cat, but I can't save and add up the values properly. This is what I tried:

logfiles=($(ls -l ~ | grep -o '[^ ] $' | grep ${LOG_NAME}log))
num_of_logs=${#logfiles[@]}
for LOG_NUM in `seq 0 $(($num_of_logs-1))`
    do (
         SPEED=`cat $LOG | grep sum_sent -A7 | grep bits_per_second | tr "," " " | awk '{ SUM += $NF } END { print SUM } '`
         BITS=$(printf '%.0f' ${SPEED})
         GIGABITS=$( bc -l <<< "scale=2; $BITS/1000000000" )
         GIGASUM=$( echo "$GIGABITS + $GIGASUM" | bc )
        )
    done `

Tried: moving the cat command into a separate loop that stores each result at a different index of an array. This gave the same result, and the array came out empty as well.

CodePudding user response:

The problem is my variables reset after each iteration in a loop.

Yes, because you enclose the loop body in parentheses (). Everything within runs in a subshell, and in particular, variable definitions and changes within affect only the subshell, which is new on each iteration. If you want everything to run in the same shell then just omit those.
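
The effect is easy to reproduce in isolation. The following is a minimal sketch, not taken from the original script: the counter names are made up for illustration, and the only thing that differs between the two loops is the pair of parentheses around the body.

```shell
#!/usr/bin/env bash
# Illustrative variable names; they do not appear in the question's script.

# Parentheses around the body start a subshell on every iteration,
# so the increment is lost as soon as the subshell exits.
count_subshell=0
for i in 1 2 3; do
    ( count_subshell=$((count_subshell + 1)) )
done

# Without parentheses the body runs in the current shell,
# so the increment persists across iterations.
count_plain=0
for i in 1 2 3; do
    count_plain=$((count_plain + 1))
done

echo "with parentheses:    $count_subshell"   # prints 0
echo "without parentheses: $count_plain"      # prints 3
```

The first counter stays at 0 because each subshell works on its own copy of the variable.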

Additionally,

  • You don't set an initial value for variable GIGASUM, so it is initially empty (not 0).

  • You have a stray backtick (`) at the end of the last line.

  • The $() form of command substitution is far more readable than backticks. Since you're already assuming a shell that supports that form, you should use it consistently.

  • Where variables' values are not fully under your control, and often even when they are fully under your control, expansion of their values should be quoted.

  • You should not attempt to parse the output of ls. Globbing is sometimes altogether sufficient, and find serves more generally. Combine with the stat command in special cases. It especially makes no sense to parse the output of ls -l to remove everything that plain ls does not have.

  • You never set variable LOG.

  • It's unclear why you are iterating over the number of logs instead of directly over the logfile names.

  • Don't cat a file into another command when you could just specify the file to the second command by name.

  • Use whitespace and line breaks to make your code clearer. There are several places where you can break lines naturally, without escaping the newline.

I'm confident that there are more improvements available, but this covers all of the above:

# Supposing that you actually need the logfiles array for something later,
# after the loop (otherwise, you can put the glob directly into the `for` command):
logfiles=(~/*"${LOG_NAME}"log*)

GIGASUM=0
for LOG in "${logfiles[@]}"; do
    SPEED=$(grep sum_sent -A7 "$LOG" |
      grep bits_per_second |
      tr "," " " |
      awk '{ SUM += $NF } END { print SUM }')
    BITS=$(printf '%.0f' "${SPEED}")
    GIGABITS=$( bc -l <<< "scale=2; $BITS/1000000000" )
    GIGASUM=$( echo "$GIGABITS + $GIGASUM" | bc )
done
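
One case a glob-filled array does not handle by default: when no file matches, Bash leaves the pattern in the array as a literal string, so the loop would run once on a file that does not exist. A minimal sketch of one way around that, assuming Bash, is to enable the nullglob option before filling the array. The scratch directory and file names below are hypothetical stand-ins for the real logs.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, standing in for the real logs.
tmpdir=$(mktemp -d)
touch "$tmpdir/run1.log" "$tmpdir/run2.log"

shopt -s nullglob                 # unmatched globs now expand to nothing

matched=("$tmpdir"/*.log)         # two files match
unmatched=("$tmpdir"/*.nomatch)   # nothing matches, so the array is empty

echo "matched:   ${#matched[@]}"    # prints 2
echo "unmatched: ${#unmatched[@]}"  # prints 0

shopt -u nullglob                 # restore the default behaviour
rm -r "$tmpdir"
```

With an empty array, a `for` loop over "${unmatched[@]}" simply runs zero times and the sum stays at its initial value.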