I've got a file that looks like this:
a 12345
b 3456
c 45678
and I've got a bash array of strings:
mylist=("a" "b")
What I want to do is sum the numbers in the second column, but only for rows where the first-column value ("a" or "b") is present in mylist.
My not-working code:
cat myfile.txt | awk -F'\t' '{BEGIN{sum=0} {if ($1 in ${mylist[@]}) sum+=$2} END{print sum}}'
Expected result is 12345+3456=15801. I understand that the problem is in the if statement, but I can't figure out how to rearrange this code to make it work.
CodePudding user response:
Doing it in pure bash by making the elements of the original array keys in an associative one:
#!/usr/bin/env bash
mylist=(a b)
# Use the elements of the array as the keys in an associative array
declare -A keys
for elem in "${mylist[@]}"; do
    keys[$elem]=1
done
# The integer attribute makes += perform arithmetic addition
declare -i sum=0
# Read the lines on standard input
# For example, ./sum.sh < input.txt
while read -r name num; do
    # If the name is a key in the associative array, add to the sum
    if [[ -v keys[$name] ]]; then
        sum+=$num
    fi
done
printf "%d\n" "$sum"
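The `declare -i` in the script above matters: without the integer attribute, `+=` appends text instead of adding. A minimal sketch of the difference, assuming bash 4.3 or later and using the sample data from the question written to a temporary file (the names `int_sum`, `str_sum` and `tmpfile` are made up for this illustration):

```shell
#!/usr/bin/env bash
# Sample data from the question, written to a throwaway file
tmpfile=$(mktemp)
printf '%s\n' 'a 12345' 'b 3456' 'c 45678' > "$tmpfile"

mylist=(a b)
declare -A keys
for elem in "${mylist[@]}"; do
    keys[$elem]=1
done

declare -i int_sum=0   # integer attribute: += adds numbers
str_sum=""             # plain variable: += appends text

while read -r name num; do
    if [[ -v keys[$name] ]]; then
        int_sum+=$num
        str_sum+=$num
    fi
done < "$tmpfile"
rm -f "$tmpfile"

printf 'integer: %d\n' "$int_sum"   # integer: 15801
printf 'string:  %s\n' "$str_sum"   # string:  123453456
```

If you prefer not to rely on the attribute, `sum=$((sum + num))` gives the same arithmetic result.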
CodePudding user response:
There's no good reason to make awk read the array in the first place. Let join quickly pick out the matching lines -- that's what it's specialized to do.
And if, in real life, your array and input file keys are guaranteed to be sorted as they are in the example, you can take the sort calls out of the code below.
# Cautious code that doesn't assume input sort order
LC_ALL=C join -1 1 -2 1 -o1.2 \
<(LC_ALL=C sort <myfile.txt) \
<(printf '%s\n' "${mylist[@]}" | LC_ALL=C sort) \
| awk '{ sum += $1 } END { print sum }'
...or...
# Fast code that requires both the array and the file to be pre-sorted
join -1 1 -2 1 -o1.2 myfile.txt <(printf '%s\n' "${mylist[@]}") \
| awk '{ sum += $1 } END { print sum }'
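To see what join actually hands to awk, it can help to look at the intermediate output. This is only a sketch under a few assumptions: bash (for the array and process substitution), the question's sample data in a temporary file, and deliberately unsorted inputs so the sort calls do some work. The names `datafile` and `matched` are invented for the example.

```shell
#!/usr/bin/env bash
mylist=(b a)   # deliberately out of order
datafile=$(mktemp)
printf '%s\n' 'c 45678' 'a 12345' 'b 3456' > "$datafile"

# join keeps only keys present in both inputs and, because of -o 1.2,
# prints just the second field of the first input for each match
matched=$(LC_ALL=C join -o 1.2 \
    <(LC_ALL=C sort "$datafile") \
    <(printf '%s\n' "${mylist[@]}" | LC_ALL=C sort))
rm -f "$datafile"

printf '%s\n' "$matched"
# 12345
# 3456

# Summing that column is then a one-liner
printf '%s\n' "$matched" | awk '{ sum += $1 } END { print sum + 0 }'
```

The `sum + 0` in the END block forces a numeric `0` rather than an empty line when nothing matches.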
CodePudding user response:
One method would be:
#!/bin/bash
mylist=(a b)
awk '
FNR==NR { a[$1]; next }
$1 in a { sum += $2 }
END { print sum }
' <(printf '%s\n' "${mylist[@]}") file
Note that when initializing an array in bash, the elements are separated by whitespace, not commas.
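If process substitution is not an option, one possible variation is to pass the list to awk as a single string with `-v` and split it in a BEGIN block. This is a sketch rather than a drop-in replacement: it assumes no array element contains whitespace, and the names `keys`, `parts`, `want` and `result` are made up here.

```shell
#!/usr/bin/env bash
mylist=(a b)

# Join the array elements with spaces (assumes no element contains whitespace)
keys="${mylist[*]}"

result=$(
    printf '%s\n' 'a 12345' 'b 3456' 'c 45678' |
    awk -v keys="$keys" '
        # Turn the space-separated string into the indices of an awk array
        BEGIN { n = split(keys, parts, " "); for (i = 1; i <= n; i++) want[parts[i]] }
        $1 in want { sum += $2 }
        END { print sum + 0 }   # + 0 prints 0 when nothing matched
    '
)

echo "$result"   # 15801
```

Unlike the two-file form, this reads the data from standard input, so it fits at the end of a pipeline.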