Accept filename as argument and calculate repeated words along with count

Time:03-26

I need to find the number of repeated characters in a text file, and the filename must be passed as an argument.

Example: test.txt contains:

Zoom

Output should be like:

z 1
o 2
m 1

I need a command that accepts a filename as an argument and lists how many times each character occurs in that file. In my example, test.txt contains the word Zoom, so the output should show how many times each letter is repeated.

My attempt:

vi test.sh

#!/bin/bash
FILE="$1"                    # to pass filename as argument
sort file1.txt | uniq -c     # to count the number of letters

CodePudding user response:

Just a guess?

cat test.txt |
tr '[:upper:]' '[:lower:]' |
fold -w 1 |
sort |
uniq -c |
awk '{print $2, $1}'

Output:

m 1
o 2
z 1
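
The pipeline above reads a hard-coded test.txt, while the question asks for the filename as an argument. A minimal sketch of the same pipeline wrapped in a function follows; the name count_chars, the file check, and the whitespace-stripping step are assumptions added for illustration, and it assumes single-byte (ASCII) text.

```shell
#!/bin/bash
# Sketch: count how often each character occurs in the file passed as $1.
# count_chars is a hypothetical helper name, not part of the answer above.
count_chars() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "file not found: $file" >&2
        return 1
    fi
    tr '[:upper:]' '[:lower:]' < "$file" |   # fold case so Z and z count together
        tr -d '[:space:]' |                  # assumption: ignore spaces and newlines
        fold -w 1 |                          # one character per line
        sort |                               # group identical characters
        uniq -c |                            # prefix each distinct character with its count
        awk '{print $2, $1}'                 # print "char count"
}
```

Saved in test.sh with a final line count_chars "$1", running ./test.sh test.txt on a file containing Zoom should print m 1, o 2, z 1, one pair per line, sorted alphabetically.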

CodePudding user response:

#!/bin/bash
#get the argument for further processing
inputfile="$1"

#check if file exists
if [ -f "$inputfile" ]
then
    #convert file to a usable format:
    #  tr  - convert all characters to lowercase
    #  sed - put each character on a new line (\n in the replacement assumes GNU sed)
    #  >   - output to temporary file
    tr '[:upper:]' '[:lower:]' < "$inputfile" | sed -e 's/\(.\)/\1\n/g' > tmp.txt
    #loop over every character from a-z
    for char in {a..z}
    do
        #count how many times a character occurs
        count=$(grep -c "$char" tmp.txt)
        #print if count > 0
        if [ "$count" -gt "0" ]
        then
            echo "$char $count"
        fi
    done
    rm tmp.txt
else
    echo "file not found!"
    exit 1
fi

CodePudding user response:

Suggesting an awk script that counts all kinds of chars:

awk '
BEGIN{FS = ""}  # make each char a field
{
  for (i = 1; i <= NF; i++) { # iterate over all fields in line
      charsArr[$i]++; # count each field occurrence in array
  }
}
END {
  for (char in charsArr) { # iterate over chars array
    printf("%3d %s\n", charsArr[char], char);  # print the occurrence count and the char
  }
}' input.1.txt |sort -n

Or in one line:

awk '{for(i=1;i<=NF;i++)arr[$i]++}END{for(char in arr)printf("%3d %s\n",arr[char],char)}' FS="" input.1.txt|sort -n
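
The expected output in the question lists letters in order of first appearance (z, o, m), which a sorted result does not preserve. The sketch below is one possible way to keep that order; the function name count_chars_ordered and the arrays order and count are hypothetical. It walks each line with substr instead of relying on an empty FS, lowercases the input, skips spaces and tabs, and assumes single-byte (ASCII) text.

```shell
#!/bin/bash
# Sketch: print "char count" pairs in order of first appearance.
# count_chars_ordered is a hypothetical helper name used for illustration.
count_chars_ordered() {
    awk '
    {
        line = tolower($0)                       # treat Z and z as the same letter
        for (i = 1; i <= length(line); i++) {
            c = substr(line, i, 1)               # i-th character of the line
            if (c == " " || c == "\t") continue  # assumption: ignore blanks
            if (!(c in count)) order[++n] = c    # remember first-seen position
            count[c]++
        }
    }
    END {
        for (k = 1; k <= n; k++) print order[k], count[order[k]]
    }' "$1"
}
```

For a file containing Zoom, count_chars_ordered test.txt should print z 1, o 2, m 1, one pair per line, matching the order shown in the question.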