Why, after removing duplicates, the array length is 0?

Time:06-18

I tried printing "unique_words", and it prints "hostname1 hostname2 hostname3", which is correct. However, when I check its size, it is 1 instead of 3. Why does that happen?

#!/bin/bash
#Define the string value
text="hostname1 hostname2 hostname2 hostname3"
RANDOM=$$$(date +%s)
declare -i x=1

# Set space as the delimiter
IFS=' '
#Read the split words into an array based on space delimiter
read -a hostArray <<< "$text"
unique_words=($(echo ${hostArray[@]} | tr ' ' '\n' | sort | uniq))
echo ${#unique_words[@]}

CodePudding user response:

Depending on word splitting and IFS to convert strings to arrays is difficult to do safely, and is best avoided. Consider this (Shellcheck-clean) alternative:

#!/bin/bash -p

words=( hostname1 hostname2 hostname2 hostname3 )

sort_output=$(printf '%s\n' "${words[@]}" | sort -u)
readarray -t unique_words <<<"$sort_output"

declare -p unique_words
  • In Bash code, it's much better to use arrays (like words) instead of strings to hold lists. In general, there are significant difficulties both in looping over lists in strings and in converting them to arrays. Using only arrays is much easier.
  • echo is not a reliable way to output variable data. Use printf instead. See the accepted, and excellent, answer to "Why is printf better than echo?".
  • readarray (aka mapfile) is a reliable and efficient way to convert lines of text to arrays without using word splitting.
  • declare -p is an easy and reliable way to display the value, and attributes, of any Bash variable (including arrays and associative arrays). echo "$var" is broken in general, and the output of printf '%s\n' "$var" can hide important details.
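To illustrate the points above, here is a small sketch with deliberately awkward input. The array contents (an entry containing a space and a bare `*`) are made up for the demonstration and are not from the question. Because nothing here relies on word splitting, the entry with a space stays in one piece and the `*` is not expanded into file names.

```shell
#!/bin/bash
# Hypothetical input: one entry contains a space, another is a glob character.
words=( 'host one' '*' 'host one' 'host2' )

# printf emits one element per line; sort -u removes the duplicate line.
sort_output=$(printf '%s\n' "${words[@]}" | sort -u)

# readarray -t stores each line as one element, with the newline stripped.
readarray -t unique_words <<<"$sort_output"

echo "${#unique_words[@]}"   # 3
declare -p unique_words      # shows each element separately, quoted
```

Note that this approach still treats a newline as the record separator, so it assumes no element contains a newline.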

CodePudding user response:

When you assign the output of uniq to the array unique_words, IFS is still set to a space. However, the output of uniq consists of several lines, i.e. words separated by newline characters, not spaces. Therefore, when you define your array, you get one single multiline string.
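The effect described above can be reproduced in isolation. This is only a sketch: the variable names `with_space_ifs` and `with_default_ifs` are invented for the demonstration, and printf stands in for the sort | uniq pipeline since both produce newline-separated words.

```shell
#!/bin/bash
# With IFS set to a single space, newlines do not split words,
# so the whole command output becomes one array element.
IFS=' '
with_space_ifs=($(printf '%s\n' hostname1 hostname2 hostname3))
echo "${#with_space_ifs[@]}"     # 1

# With the default IFS (space, tab, newline), each line becomes its own element.
IFS=$' \t\n'
with_default_ifs=($(printf '%s\n' hostname1 hostname2 hostname3))
echo "${#with_default_ifs[@]}"   # 3
```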

If you instead set IFS to a newline before building the array,

IFS=$'\n'
unique_words=($(echo ${hostArray[@]} | tr ' ' '\n' | sort | uniq))

you would get 3 array elements.

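Alternatively, you could save the old value of IFS before changing it, and restore it after the read so that the array assignment is split on the default whitespace again: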

oldifs=$IFS
IFS=' '
read ....
IFS=$oldifs
unique_words=....
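A related option, not mentioned above and offered only as a sketch, is to put the IFS assignment on the same line as read. Bash then applies it to that one command only, so there is nothing to save or restore. The -r flag on read and the use of printf and sort -u in place of echo, tr, sort and uniq are my substitutions, not part of the original script.

```shell
#!/bin/bash
text="hostname1 hostname2 hostname2 hostname3"

# The IFS=' ' prefix affects only this read command; the shell's IFS is unchanged.
IFS=' ' read -r -a hostArray <<< "$text"

# IFS still has its default value here, so the newline-separated output
# of sort is split into separate elements.
# Caveat: an unquoted $(...) also undergoes glob expansion, which is harmless
# for plain hostnames but is why readarray is the safer general choice.
unique_words=($(printf '%s\n' "${hostArray[@]}" | sort -u))

echo "${#unique_words[@]}"   # 3
```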