Combining all files that contain the same word into a new text file, leaving blank lines between them

Time:01-16

This is my first question here. I have a folder called "materials" that contains 40 text files. I am trying to combine the text files that contain the word "carbon" (in either capitalized or lowercase form) into a single file, leaving a blank line between them. I used "grep -w carbon *" to identify the files that contain the word carbon, but I don't know what to do after this point. I really appreciate all your help!

CodePudding user response:

grep -il carbon materials/*.txt | while IFS= read -r line; do
    echo ">> Adding $line"
    cat "$line" >> result.out
    echo >> result.out
done

Explanation

  • grep searches the files for the string. -i makes the match case-insensitive. -l prints only the names of the files that contain the string
  • the while loop reads those filenames one line at a time
  • cat with >> appends each file's content to result.out
  • echo >> adds a newline to result.out after each file's content
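
The steps above can also be wrapped in a small reusable function. This is only a sketch, not the answerer's code: the function name combine_matching and its argument order are hypothetical, and it adds -w so that only the whole word matches, as in the question's own grep -w command. It assumes filenames contain no newline characters and that the output file is not among the input files.

```shell
# combine_matching WORD OUTFILE FILE...
# Hypothetical helper: concatenates every FILE that contains WORD
# (whole word, any case) into OUTFILE, with an empty line after each.
# Assumes filenames contain no newline characters.
combine_matching() {
    word=$1
    out=$2
    shift 2
    : > "$out"                        # start from an empty output file
    grep -iwl -- "$word" "$@" |
    while IFS= read -r file; do
        cat -- "$file" >> "$out"      # append the matching file
        echo >> "$out"                # then one empty line
    done
}
```

It could be called as: combine_matching carbon result.out materials/*.txt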

Execution

$ ls -1 materials/*.txt
materials/1.txt
materials/2.txt
materials/3.txt

$ grep -i carbon materials/*.txt
materials/1.txt:carbon
materials/2.txt:CARBON

$ grep -il carbon materials/*.txt | while IFS= read -r line; do echo ">> Adding $line"; cat "$line" >> result.out; echo >> result.out; done
>> Adding materials/1.txt
>> Adding materials/2.txt

$ cat result.out
carbon

CARBON

CodePudding user response:

Try this (assuming your filenames don't contain newline characters):

grep -iwl carbon ./* | while IFS= read -r f; do cat "$f"; echo; done > combined

Note that this leaves a trailing blank line at the end of the file combined.
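
If that trailing blank line is unwanted, one option is to print the separator before every file except the first instead of after each file. A minimal sketch under the same assumption about filenames; the function name join_with_blank_lines and the variable first are illustrative only.

```shell
# Reads filenames on stdin, one per line, and writes their contents to
# stdout with an empty line between files but none after the last one.
join_with_blank_lines() {
    first=1
    while IFS= read -r f; do
        if [ "$first" -eq 1 ]; then
            first=0                   # no separator before the first file
        else
            echo                      # separator between two files
        fi
        cat -- "$f"
    done
}
```

It could be used as: grep -iwl carbon ./* | join_with_blank_lines > combined
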
If your filenames may contain newline characters and your shell is bash, use this instead (-Z is a GNU grep option that ends each filename with a NUL byte):

grep -iwlZ carbon ./* | while IFS= read -r -d '' f; do cat "$f"; echo; done > combined
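
Another way to cope with unusual filenames, without relying on -Z or on bash, is to loop over the files directly and test each one with grep -q, so that no filename is ever parsed from grep's output. This is a sketch rather than a drop-in replacement: the function name combine_by_test is hypothetical, and it runs grep once per file, which is slower for large numbers of files.

```shell
# combine_by_test WORD FILE...
# Writes every regular FILE that contains WORD (whole word, any case)
# to stdout, followed by an empty line.
combine_by_test() {
    word=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue       # skip directories and the like
        if grep -iwq -- "$word" "$f"; then
            cat -- "$f"
            echo
        fi
    done
}
```

It could be used as: combine_by_test carbon ./* > combined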