I want to sort two arrays at the same time: wordArray and numArray. Both are global.
These two arrays contain all the words from a text file (without duplicates) and the number of appearances of each word.
Right now I am using Bubble Sort to sort both of them at the same time:
# Bubble Sort function
function bubble_sort {
local max=${#numArray[@]}
size=${#numArray[@]}
while ((max > 0))
do
local i=0
while ((i < max))
do
if [ "$i" != "$(($size-1))" ]
then
if [ ${numArray[$i]} \< ${numArray[$((i + 1))]} ]
then
local temp=${numArray[$i]}
numArray[$i]=${numArray[$((i + 1))]}
numArray[$((i + 1))]=$temp
local temp2=${wordArray[$i]}
wordArray[$i]=${wordArray[$((i + 1))]}
wordArray[$((i + 1))]=$temp2
fi
fi
((i += 1))
done
((max -= 1))
done
}
#Calling Bubble Sort function
bubble_sort "${numArray[@]}" "${wordArray[@]}"
But for some reason it won't sort them properly when the arrays are large.
Does anyone know what's wrong with it, or know another approach to sort the words together with their number of appearances, with or without arrays?
This:
wordArray = (because, maybe, why, the)
numArray = (5, 12, 20, 13)
Must turn to this:
wordArray = (why, the, maybe, because)
numArray = (20, 13, 12, 5)
Someone recommended writing the two arrays side by side in a text file and sorting the file.
How will it work for this input:
1 Arthur
21 Zebra
to turn to this output:
21 Zebra
1 Arthur
CodePudding user response:
Assuming the arrays do not contain tab or newline characters, how about:
#!/bin/bash
wordArray=(why the maybe because)
numArray=(20 13 12 5)
tmp1=$(mktemp tmp.XXXXXX) # file to be sorted
tmp2=$(mktemp tmp.XXXXXX) # sorted result
for (( i = 0; i < ${#wordArray[@]}; i++ )); do
echo "${numArray[i]}"$'\t'"${wordArray[i]}" # write the number and word delimited by a tab character
done > "$tmp1"
sort -nrk1,1 "$tmp1" > "$tmp2" # sort the file by number in descending order
while IFS=$'\t' read -r num word; do # read the lines splitting by the tab character
numArray_sorted+=("$num") # add the number to the array
wordArray_sorted+=("$word") # add the word to the array
done < "$tmp2"
rm -- "$tmp1" # unlink the temp file
rm -- "$tmp2" # same as above
echo "${wordArray_sorted[@]}" # see the result
echo "${numArray_sorted[@]}" # same as above
Output:
why the maybe because
20 13 12 5
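As for the second example in the question (lines such as "1 Arthur"), the same sort options should carry over to space-separated lines, because sort splits fields on blanks by default. A minimal sketch, assuming one count and one word per line; the variable name sorted is only for illustration:

```shell
#!/bin/bash
# Input lines from the question: a count, a space, then a word.
# -n compares the first field as a number, -r reverses to descending order.
sorted=$(printf '%s\n' '1 Arthur' '21 Zebra' | sort -rn -k1,1)
echo "$sorted"
```

This should print "21 Zebra" first and "1 Arthur" second.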
If you prefer not to create temp files, here is a process substitution
version, which avoids writing and reading temp files and should run faster.
#!/bin/bash
wordArray=(why the maybe because)
numArray=(20 13 12 5)
while IFS=$'\t' read -r num word; do
numArray_sorted+=("$num")
wordArray_sorted+=("$word")
done < <(
sort -nrk1,1 < <(
for (( i = 0; i < ${#wordArray[@]}; i++ )); do
echo "${numArray[i]}"$'\t'"${wordArray[i]}"
done
)
)
echo "${wordArray_sorted[@]}"
echo "${numArray_sorted[@]}"
Or simpler (using the suggestion by KamilCuk):
#!/bin/bash
wordArray=(why the maybe because)
numArray=(20 13 12 5)
while IFS=$'\t' read -r num word; do
numArray_sorted+=("$num")
wordArray_sorted+=("$word")
done < <(
paste <(printf "%s\n" "${numArray[@]}") <(printf "%s\n" "${wordArray[@]}") | sort -nrk1,1
)
echo "${wordArray_sorted[@]}"
echo "${numArray_sorted[@]}"
CodePudding user response:
You need numeric sort for the numbers. You can sort an array like this:
mapfile -t wordArray < <(printf '%s\n' "${wordArray[@]}" | sort -n)
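The same numeric-versus-string distinction applies to the comparison inside the question's bubble sort: within [ ], the \< operator compares strings character by character, so 5 is not considered smaller than 12. A small sketch of the difference; the variable names string_result and numeric_result are made up for this illustration:

```shell
#!/bin/bash
# String comparison: "5" sorts after "12" because the character 5 comes after 1.
if [ 5 \< 12 ]; then string_result="5 < 12"; else string_result="5 >= 12"; fi

# Arithmetic comparison: (( )) treats both operands as integers.
if (( 5 < 12 )); then numeric_result="5 < 12"; else numeric_result="5 >= 12"; fi

echo "string:  $string_result"    # string:  5 >= 12
echo "numeric: $numeric_result"   # numeric: 5 < 12
```

In the question's function, replacing the [ ... \< ... ] test with (( numArray[i] < numArray[i + 1] )) should make the comparison numeric.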
But what you actually need is something like:
for num in "${numArray[@]}"; do
echo "$num: ${wordArray[j++]}"
done |
sort -n -k1,1
But, earlier in the process, you should have used only one array, where the word and frequency (or vice versa) are key-value pairs. Then they always have a direct relationship and can be printed with a for loop similar to the one above.
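A rough sketch of that single-array idea, assuming bash 4 or later (needed for associative arrays) and using the sample data from the question. The array name count and the variable sorted are hypothetical names chosen for this example:

```shell
#!/bin/bash
# Associative array (bash 4+): word -> number of appearances.
declare -A count
count[because]=5
count[maybe]=12
count[why]=20
count[the]=13

# Emit "number<TAB>word" for every key, then sort numerically, descending.
sorted=$(
    for word in "${!count[@]}"; do
        printf '%s\t%s\n' "${count[$word]}" "$word"
    done | sort -rn -k1,1
)
echo "$sorted"
```

Since each word is its own key, the count can never drift out of step with the word, which is the risk with two parallel arrays. With this data the lines should come out in the order why, the, maybe, because.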