I have two arrays, one with a list of domains and another with a list of blacklisted domains, which I want to compare. The idea is that the script is going to do X if the domain isn't blacklisted.
The script works fine if I make an array like this:
DOMAINS=(
'domain1.no 443'
'domain2.no 443'
'domain3.no 443'
'domain4.no 443'
'domain5.no 443'
)
BLACKLIST=(
'domain1.no'
'domain3.no'
'domain5.no'
)
Output:
domain1.no is blacklisted
domain2.no is NOT blacklisted
domain3.no is blacklisted
domain4.no is NOT blacklisted
domain5.no is blacklisted
But if I create the arrays by importing domains from domains.txt/blacklist.txt with mapfile -t then the script doesn't work. Like this:
mapfile -t DOMAINS < domains.txt
mapfile -t BLACKLIST < blacklist.txt
domains.txt contents:
domain1.no 443
domain2.no 443
domain3.no 443
domain4.no 443
domain5.no 443
blacklist.txt contents:
domain1.no
domain3.no
domain5.no
Output:
domain1.no is NOT blacklisted
domain2.no is NOT blacklisted
domain3.no is NOT blacklisted
domain4.no is NOT blacklisted
domain5.no is blacklisted
This is the rest of the script:
function test_function ()
{
    host=$1
    is_blacklisted=0
    for domain in "${BLACKLIST[@]}"; do
        if [[ " $host " == *" $domain "* ]]; then
            is_blacklisted=1
        fi
    done
    if [ $is_blacklisted == 1 ]; then
        printf "%s\n" "$host is blacklisted"
    elif [ $is_blacklisted == 0 ]; then
        printf "%s\n" "$host is NOT blacklisted"
    fi
}
for domain in "${DOMAINS[@]}"; do
    test_function $domain
done
My question is, what is the reason that the comparison doesn't work properly when using the mapfile array?
I'm very, very new to bash scripting (and to this site), so my code might not be too good, and obvious answers will probably not be so obvious to me!
'443' is added to the DOMAINS array for another script that checks SSL, which is why it's there but not used in this script. I wanted to use these .txt files so that I don't have to update each script's array manually and can update the .txt file instead.
If it matters, I'm using Ubuntu/WSL from Microsoft app store.
Answer:
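As discussed in the comments, the most likely issue was a \r (carriage return) character left at the end of each line, which is typical of files saved with Windows (CRLF) line endings. The pattern match then fails because the blacklist entry is really domain1.no\r rather than domain1.no. That would also be consistent with only domain5.no matching, if the last line of blacklist.txt has no line terminator at all. Here’s a possible solution that removes such \r characters. Also, it preprocesses the blacklist into an associative array for more efficient lookups.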
#!/bin/bash
set -euo pipefail
if (($# != 2)); then
    echo "Usage: ${0} <blacklist file> <domains file>"
    exit 1
fi
is_blacklisted() {
    (($# == 2)) # crash on wrong number of arguments
    local -nr linenum_map="$1" # array passed by name reference
    local -ir linenum='linenum_map["$2"]' # integer evaluation
    if ((linenum)); then
        printf '%s is blacklisted on line %d\n' "$2" "$((linenum))"
        return 1 # easier for callers than output parsing
    else
        printf '%s is NOT blacklisted\n' "$2"
    fi
}
readarray -t input < "$1"
declare -Ai blacklist
for i in "${!input[@]}"; do
    ((blacklist["${input[i]%$'\r'}"] = i + 1)) # domain without \r -> line
done
readarray -t input < "$2"
domains=("${input[@]% *}") # remove everything after space (maybe also \r)
for domain in "${domains[@]}"; do
    is_blacklisted 'blacklist' "$domain" || : # don't crash on error
done
It seems to work reasonably with the input examples provided:
$ /tmp/blacklist.sh /tmp/blacklist.txt /tmp/domains.txt
domain1.no is blacklisted on line 1
domain2.no is NOT blacklisted
domain3.no is blacklisted on line 2
domain4.no is NOT blacklisted
domain5.no is blacklisted on line 3
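If you would rather keep your original script and only fix how the arrays are loaded, something like the following might be enough. It is only a sketch: read_lines is a hypothetical helper name (not a bash builtin), the temporary file stands in for your blacklist.txt and is assumed to have CRLF line endings, and the name reference needs bash 4.3 or newer. The first printf uses %q so that any trailing \r becomes visible, which is a quick way to confirm the diagnosis on your own files.

```shell
#!/bin/bash
# Hypothetical helper: read file $2 into the array named by $1,
# dropping one trailing carriage return from every line.
read_lines() {
    local -n _out="$1"            # name reference to the caller's array
    mapfile -t _out < "$2"        # one element per line, newline removed
    _out=("${_out[@]%$'\r'}")     # strip a trailing \r from each element
}

# Throwaway test file with CRLF endings (assumption: mimics blacklist.txt).
# The last line deliberately has no line terminator.
tmp=$(mktemp)
printf 'domain1.no\r\ndomain3.no\r\ndomain5.no' > "$tmp"

# Plain mapfile keeps the \r; %q makes it visible in the output.
mapfile -t raw < "$tmp"
printf '%q\n' "${raw[0]}"

# The helper yields clean entries that compare as expected.
read_lines BLACKLIST "$tmp"
printf '%q\n' "${BLACKLIST[0]}"

rm -f "$tmp"
```

In your script this would replace the two mapfile lines, e.g. read_lines DOMAINS domains.txt and read_lines BLACKLIST blacklist.txt. Converting the files once with a tool such as dos2unix should work as well, but stripping the \r while reading keeps the script robust if the files are edited on Windows again.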