1. How to use the input not including the first one 2. Using grep and sed to find the pattern entered

Time:10-14

The command I'm writing should take a file as its first argument and count how many times each of the remaining arguments (the patterns) occurs in that file, using grep and sed. Ex:

$ cat file1
oneonetwotwotwothreefourfive

Intended output:

$ ./command file1 one two three
one 2
two 3
three 1

The problem is that the file is not split into lines; it is just one long string of letters. I'm trying to use sed to replace the pattern I'm looking for with "FIND" and break the line after each replacement, continuing until the end of the file. Then I would use grep FIND to select the lines that contain FIND, and finally wc -l to count those lines. However, I cannot find a way to make sed start a new line after each replacement.

Ex:

$ cat file1
oneonetwosixone

Intended output:

FIND
FIND
twosixFIND

Another problem I've been having is how to loop over the rest of the arguments, not including the file.

Failed attempt:

file=$1
for PATTERN in 2 3 4 5 ... N
do
variable=$(sed 's/$PATTERN/find/g' $file | grep FIND $file | wc -l)
echo $PATTERN $variable
exit

Another failed attempt:

file=$1
PATTERN=$($2,$3 ... $N)
for PATTERN in $*
do variable=$(sed 's/$PATTERN/FIND/g' $file | grep FIND $file | wc-1)
echo $PATTERN $variable
exit

Any suggestions and help will be greatly appreciated. Thank you in advance.

CodePudding user response:

Non-portable solution with GNU grep:

file="$1"
shift

for pattern in "$@"; do
    echo "$pattern" $(grep -o -e "$pattern" <"$file" | wc -l)
done
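
On the second part of the question (using the arguments other than the file): the step that does this in the loop above is shift, which drops $1 so that "$@" holds only the patterns. A minimal sketch, where show_args is a hypothetical name used purely for illustration:

```shell
# show_args is a made-up helper; it only prints what it received.
show_args() {
    file="$1"
    shift    # discard the first argument; "$@" now holds only the patterns
    printf 'file=%s\n' "$file"
    for pattern in "$@"; do
        printf 'pattern=%s\n' "$pattern"
    done
}

show_args file1 one two three
# file=file1
# pattern=one
# pattern=two
# pattern=three
```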

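If you want to use sed and your "patterns" are actually fixed strings (which don't contain characters that have special meaning to sed), you could put every match on its own line and count the matching lines. This relies on GNU sed treating \n in the replacement as a newline: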

file="$1"
shift

for pattern in "$@"; do
    echo "$pattern" $(
        sed "s/$pattern/\n&\n/g" "$file" |\
        grep -e "$pattern" | wc -l
    )
done
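
To see what the newline trick does on its own, here is a small sketch using the second sample from the question. It assumes GNU sed (which turns \n in the replacement into a newline) and a pattern without characters special to sed; split_on is a hypothetical helper name, not part of the answer above:

```shell
# split_on is a made-up helper: replace each match of $1 with FIND
# and start a new line right after it. Reads from standard input.
split_on() {
    sed "s/$1/FIND\n/g"
}

printf '%s\n' oneonetwosixone | split_on one
# FIND
# FIND
# twosixFIND
# (followed by one empty line, because the input ends right after the last match)

# Counting the lines that contain FIND gives the number of occurrences.
printf '%s\n' oneonetwosixone | split_on one | grep -c FIND
# 3
```

Note that grep reads the pipe here because it is given no file operand.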

Your code has several issues:

  • you should quote variable expansions to avoid word splitting and globbing (e.g. "$1" instead of $1)
  • don't use ALLCAPS variable names - by convention they are reserved for environment variables and variables set by the shell
  • if you put a string in single-quotes, variable expansion does not happen
  • if you give grep a file, it won't read standard input
  • your for loop has no terminating done
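
The single-quote point is easy to check in isolation. A minimal sketch, where mark_single and mark_double are hypothetical names used only for this comparison:

```shell
pattern=one

# Single quotes: the shell leaves $pattern alone, so sed looks for the
# literal text "$pattern" and finds nothing to replace.
mark_single() { sed 's/$pattern/FIND/g'; }

# Double quotes: the shell expands $pattern to "one" before sed runs.
mark_double() { sed "s/$pattern/FIND/g"; }

printf '%s\n' oneonetwo | mark_single   # prints: oneonetwo
printf '%s\n' oneonetwo | mark_double   # prints: FINDFINDtwo
```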

CodePudding user response:

This might work for you (GNU bash, sed and uniq):

f(){ local file=$1;
     shift;
     local args="$@";
     sed -E 's/'${args// /|}'/\n&\n/g
             s/(\n\S+)\n\S+/\1/g
             s/\n+/\n/g
             s/.(.*)/echo "\1"|uniq -c/e
             s/ *(\S+) (\S+)/\2 \1/mg' $file; }

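Separate the arguments into the file and the remaining arguments.

Join the remaining arguments into an alternation (e.g. one|two|three) and use it in a sed substitution that puts a newline on either side of every match.

Remove the unwanted words (anything that is not one of the arguments) and collapse the repeated newlines.

Run the resulting text through uniq -c by means of the e flag of GNU sed's substitution command, which executes the pattern space as a shell command and replaces it with the output.

Swap the count and the word on each line and print the result.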

CodePudding user response:

"The problem is the file does not have any lines"

Great! So the problem reduces to inserting newlines.

func() {
     file=$1
     shift
     # join the arguments into a GNU BRE alternation: one\|two\|three
     rgx=$(printf "%s\\|" "$@" | sed 's@\\|$@@');
     # put a newline after each matched word
     sed 's/\('"$rgx"'\)/&\n/g' "$file" |
     # count identical lines; it's just standard here
     sort | uniq -c |
     # keep only the requested words - i.e. exclude fourfive
     grep -xf <(printf " *[0-9]\+ %s\n" "$@")
};
func <(echo oneonetwotwotwothreefourfive) one two three

outputs:

  2 one
  1 three
  3 two
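
The output above puts the count first, while the question asked for the word first. One possible way to get that format is to swap the two columns with awk; this is only a sketch, and swap_columns is a hypothetical helper name. The lines stay in the sorted order produced by sort, not in the order of the arguments.

```shell
# swap_columns is a made-up helper: turn "count word" lines into "word count".
swap_columns() {
    awk '{ print $2, $1 }'
}

printf '  2 one\n  1 three\n  3 two\n' | swap_columns
# one 2
# three 1
# two 3
```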