I am trying to loop through files in a directory to find an animal and its value. The command is supposed to only display the animal and total value. For example:
File1 has:
Monkey 11
Bear 4
File2 has:
Monkey 12
If I wanted the total value of monkeys then I would do:
for f in *; do
total=$(grep $animal $f | cut -d " " -f 2- | paste -sd+ | bc)
done
echo $animal $total
This would return the correct value of:
Monkey 23
However, if there is only one instance of an animal like for example Bear, the variable total doesn't return any value, I only get echoed:
Bear
Why is this the case and how do I fix it?
Note: I'm not allowed to use the find command.
CodePudding user response:
You could use this little awk instead of for, grep, cut, paste and bc:
awk -v animal="Bear" '
$1 == animal { count += $2 }
END { print count + 0 }
' *
CodePudding user response:
Comments on OP's question about why the code behaves as it does:
- total is reset on each pass through the loop, so upon leaving the loop total will have the count from the 'last' file processed
- in the case of Bear the 'last' file processed is File2, and since File2 does not contain any entries for Bear we get total='', which is what's printed by the echo
- if the Bear entry is moved from File1 to File2 then OP's code should print Bear 4
- OP's current code effectively ignores all input files and prints whatever's in the 'last' file (File2 in this case)
OP's current code generates the following:
# Monkey
Monkey 12 # from File2
# Bear
Bear # no match in File2
I'd probably opt for replacing the whole grep/cut/paste/bc pipeline (4x subprocesses) with a single awk call (1x subprocess), reporting 0 when there are no matches:
for animal in Monkey Bear Hippo
do
total=$(awk -v a="${animal}" '$1==a {sum+=$2} END {print sum+0}' *)
echo "${animal} ${total}"
done
This generates:
Monkey 23
Bear 4
Hippo 0
NOTES:
- I'm assuming OP's real code does more than echo the count to stdout, hence the need for the total variable; otherwise we could eliminate the total variable and have awk print the animal/sum pair directly to stdout
- if OP's real code has a parent loop processing a list of animals, it's likely that a single awk call could process all of the animals at once; the objective being to have awk generate the entire set of animal/sum pairs that could then be fed to the looping construct; if this is the case, and OP has some issues implementing a single awk solution, a new question should be asked
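As a rough sketch of the single-awk idea from the last note: the helper name sum_all_animals and the temporary demo files are assumptions for illustration, not part of OP's code. awk totals every animal in one pass, and the shell loop only consumes the resulting pairs.

```shell
# Hypothetical helper: sum the second column per animal across the given files.
sum_all_animals() {
    awk 'NF >= 2 { sum[$1] += $2 }
         END { for (name in sum) print name, sum[name] }' "$@"
}

# Demo on throwaway copies of the sample files from the question.
demo_dir=$(mktemp -d)
printf 'Monkey 11\nBear 4\n' > "$demo_dir/File1"
printf 'Monkey 12\n' > "$demo_dir/File2"

# Feed the animal/sum pairs to the loop; sort is only there for a stable order.
sum_all_animals "$demo_dir"/* | sort | while read -r animal total; do
    echo "${animal} ${total}"
done

rm -r "$demo_dir"
```

Because the loop sits on the right-hand side of a pipe, variables assigned inside it are not visible afterwards; that is fine here since the loop only prints.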
CodePudding user response:
if there is only one instance of an animal like for example Bear, the variable total doesn't return any value, I only get echoed:
Bear
Why is this the case…
$ cut -d ' ' -f 2- <<< 'abc def'
def
$ cut -d ' ' -f 2- <<< 'abc'
abc
…and how do I fix it?
One solution would be the -s option to cut:
-s Suppress lines with no field delimiter characters. Unless specified, lines with no delimiters are passed through unmodified.
$ cut -s -d' ' -f 3 <<< 'abc'
<no output>
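To make the difference concrete for this question, here is a small sketch; the helper name values_only and the sample input are made up for illustration. A line that holds only a name (no space) is passed through by plain cut, which would hand a non-number to bc, while -s drops it.

```shell
# Hypothetical helper: keep only the value column, dropping lines without a space.
values_only() {
    cut -s -d ' ' -f 2-
}

printf 'Bear 4\nBear\n' | cut -d ' ' -f 2-   # prints "4" and then "Bear"
printf 'Bear 4\nBear\n' | values_only        # prints only "4"
```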
CodePudding user response:
Why is this the case
grep outputs nothing, so nothing is propagated through the pipe and an empty string is assigned to total.
Because total is reset on every loop iteration (total=anything without referencing the previous value), it just has the value for the last file.
how do I fix it?
Do not try to do everything at once; do less in each step.
total=0
for f in *; do
count=$(grep "$animal" "$f" | cut -d " " -f 2-)
total=$((total + count)) # reuse total, reference previous value
done
echo "$animal" "$total"
A programmer fluent in shell will most probably jump to AWK for such problems. Remember to check your scripts with shellcheck.
With what you were trying to do, you could do all files at once:
total=$(
{
echo 0 # to have at least nice 0 if animal is not found
grep "$animal" * |
cut -d " " -f 2-
} |
paste -sd+ |
bc
)
CodePudding user response:
With just bash:
declare -A animals=()
for f in *; do
while read -r animal value; do
(( animals[$animal] = ${animals[$animal]:-0} + value ))
done < "$f"
done
declare -p animals
outputs
declare -A animals=([Monkey]="23" [Bear]="4" )
With this approach, you have the totals for all the animals after processing each file exactly once.
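If a script then needs the total for one particular animal, a lookup with a default of 0 covers animals that never appeared. This is only a sketch: the helper name lookup_total is hypothetical, the array is filled by hand with the totals shown above, and associative arrays need bash 4 or newer.

```shell
#!/usr/bin/env bash
# Totals as produced by the loop above (filled in by hand for this sketch).
declare -A animals=([Monkey]=23 [Bear]=4)

# Hypothetical helper: print the total for one animal, or 0 if it was never seen.
lookup_total() {
    echo "${animals[$1]:-0}"
}

for name in Monkey Bear Hippo; do
    echo "$name $(lookup_total "$name")"
done
```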
CodePudding user response:
$ head File*
==> File1 <==
Monkey 11
Bear 4
==> File2 <==
Monkey 12
==> File3 <==
Bear
Monkey
Using awk and bash array
#!/bin/bash
sumAnimals(){
awk '
{ a[$1] += (NF == 1 ? 1 : $2) }  # a name with no value counts as 1
END{
for (i in a ) printf "[%s]=%d\n",i, a[i]
}
' File*
}
# storing all animals in bash array
declare -A animalsArr="( $(sumAnimals) )"
# show array content
declare -p animalsArr
# getting total from array
echo "Monkey: ${animalsArr[Monkey]}"
echo "Bear: ${animalsArr[Bear]}"
Output
declare -A animalsArr=([Bear]="5" [Monkey]="24" )
Monkey: 24
Bear: 5