How to grep a particular keyword from a long phrase of text using Unix command?

Time:09-25

I am a bit stuck on this: I can't work out how to search an entire file for a keyword and print the whole word that contains it.

Suppose I want to search for the keyword: file_demo

The only matching word in the data below is file_demo_2021.txt

The output should be: file_demo_2021.txt

Below is the data in my file:

For several generations, stories from Africa have traditionally been passed down by word of mouth. Often, after a hard day’s work, the adults would gather the children together by moonlight, around a village fire and tell stories. This was traditionally called 'Tales by Moonlight'. Usually, the stories are meant to prepare young people for life, and so each story taught a lesson or moral. file_demo_2021.txt In the African folktales, the stories reflect the culture where diverse types of animals abound. The animals and birds are often accorded human attributes, so it is not uncommon to find animals talking, singing, or demonstrating other human characteristics such as greed, jealousy, honesty, etc. The setting in many of the stories exposes the reader to the land form and climate within that region of Africa. References are often made to different seasons such as the 'dry' or 'rainy' season and their various effects on the surrounding vegetation and animal life.

This data contains one file name, and I need to print it as shown below:

Output :

file_demo_2021.txt

CodePudding user response:

GNU AWK solution. Let file.txt contain the sample text from the question, then

awk 'BEGIN{RS="[[:space:]]"}/file_demo/' file.txt

output

file_demo_2021.txt

Explanation: I tell AWK to use any whitespace character as the record separator (RS), so each word becomes its own record. If a record matches file_demo, the default action (printing the record) is taken.
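
The one-word-per-record idea can also be sketched without relying on RS being treated as a regular expression, which POSIX awk does not guarantee for a multi-character RS. The following is only an illustration: it uses tr to put each word on its own line and grep to keep the matching ones. The temporary file and the shortened sample text are assumptions made for the example.

```shell
# Shortened sample input in a temporary file (placeholder data for illustration).
sample=$(mktemp)
printf '%s\n' 'each story taught a lesson or moral. file_demo_2021.txt In the African folktales,' > "$sample"

# Turn every whitespace character into a newline and squeeze repeats,
# giving one word per line; then keep only the lines containing the keyword.
result=$(tr -s '[:space:]' '\n' < "$sample" | grep 'file_demo')
echo "$result"

rm -f "$sample"
```

Like the awk version, this prints the whole whitespace-delimited word, so any punctuation attached to the word is printed too.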

(tested in GNU Awk 5.0.1)

CodePudding user response:

Using GNU sed.

EDIT

Because -z makes sed treat the whole file as a single record, the entire file is read into memory, so this may be inefficient for very large files.

sed -Ez 's/.* ([a-z]*file_demo[^, ]*).*/\1\n/' file.txt

CodePudding user response:

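Just use grep

grep -oP "file_demo\S*\.txt" filename

to find any file_demo*.txt (\S* stops at whitespace, whereas a greedy .* would run on to the last .txt on the line)

grep -oP "file_demo[^\s,]*" filename

to find file_demo followed by anything up to the next word separator (whitespace, comma, or end of line)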
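
If -P (PCRE support) is not available in your grep build, roughly the same result can be had with a POSIX bracket expression. The sketch below wraps it in a small function; find_word is a hypothetical helper name, and the keyword is inserted into the pattern as-is, so regex metacharacters in it are not escaped. The -o option is assumed to be supported, as it is in GNU and BSD grep.

```shell
# find_word is a hypothetical helper, not a standard command.
# Usage: find_word KEYWORD FILE
find_word() {
    # -o prints only the matched text. The bracket expressions extend the
    # match on both sides over characters that are not whitespace or commas.
    grep -o "[^[:space:],]*$1[^[:space:],]*" "$2"
}

# Shortened sample input in a temporary file (placeholder data for illustration).
sample=$(mktemp)
printf '%s\n' 'a lesson or moral. file_demo_2021.txt In the African folktales, the stories' > "$sample"

result=$(find_word 'file_demo' "$sample")
echo "$result"

rm -f "$sample"
```

Because the match also extends to the left, a name such as my_file_demo_2021.txt would be printed in full rather than cut at the keyword.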

CodePudding user response:

With GNU awk using patsplit():

awk 'patsplit($0,vals,/([[:alpha:]]+_){2}[[:digit:]]+[.][[:alpha:]]+/) { for (i in vals) print vals[i] }' file
file_demo_2021.txt
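
patsplit() is a gawk extension. Where it is not available, a plain field loop may be enough, since awk already splits each line on whitespace by default. This is a minimal sketch, not a drop-in replacement: it matches on the literal keyword rather than on the shape of the file name, and the variable name kw and the temporary sample file are assumptions for the example.

```shell
# Shortened sample input in a temporary file (placeholder data for illustration).
sample=$(mktemp)
printf '%s\n' 'a lesson or moral. file_demo_2021.txt In the African folktales, the stories' > "$sample"

# Walk over every whitespace-separated field and print those that contain
# the keyword as a literal substring (index() returns 0 when it is absent).
result=$(awk -v kw='file_demo' '{ for (i = 1; i <= NF; i++) if (index($i, kw)) print $i }' "$sample")
echo "$result"

rm -f "$sample"
```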