Check if a word from one file exists in another file and print the matching line

Time:12-30

I have a file (file1) containing some specific words, and another file (file2) containing URLs that include those words.

For each word in file1, I would like to print the matching URLs from file2. If a word is not found in file2, it should return "no matching".

I tried Awk and grep, including if conditions, but did not get the expected results.

File1:

abc 
Def
XYZ

File2:

Https://gitlab.private.com/apm-team/mi_abc_linux1.git
Https://gitlab.private.com/apm-team/mi_abc_linux2.git
Https://gitlab.private.com/apm-team/mi_abc_linux3.git
Https://gitlab.private.com/apm-team/mi_xyz_linux1.git
Https://gitlab.private.com/apm-team/mi_xyz_linux2.git
Https://gitlab.private.com/apm-team/mi_def_linux1.git
Https://gitlab.private.com/apm-team/mi_def_linux2.git

Output can be like:

abc:
Https://gitlab.private.com/apm-team/mi_abc_linux1.git
Https://gitlab.private.com/apm-team/mi_abc_linux2.git
Xyz:
Https://gitlab.private.com/apm-team/mi_xyz_linux1.git

Etc..

Tried:

file=/bin/file1.txt

for i in `cat $file1`;
do
a=$i
echo "$a:" | awk '$repos.txt ~ $a {printf $?}'
done

I also tried other approaches, such as an if condition with grep, but had no luck.

CodePudding user response:

Your attempt is pretty far from the mark. Probably learn the basics of the shell and Awk before you proceed.

Here is a simple implementation which avoids reading lines with for.

while IFS='' read -r word; do
    echo "$word:"
    grep -F "$word" File2
done <File1
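
The loop above does not print "no matching" for words without a hit, and grep -F compares case-sensitively, so Def and XYZ from the sample would not match the lowercase URLs. A minimal sketch that adds both, assuming case-insensitive literal matching is what is wanted (the function name match_words is only illustrative, not part of the original answer):

```shell
# match_words WORDS_FILE URLS_FILE
# Hypothetical helper: prints each word followed by the lines that contain it,
# or "no matching" when grep finds nothing.
match_words() {
    # Default IFS makes read trim leading/trailing whitespace from each word;
    # the || test also handles a last line without a trailing newline.
    while read -r word || [ -n "$word" ]; do
        # Skip blank lines: an empty pattern would match every URL.
        [ -n "$word" ] || continue
        printf '%s:\n' "$word"
        # -F: fixed string, -i: ignore case; grep exits non-zero on no match.
        grep -Fi -- "$word" "$2" || echo "no matching"
    done <"$1"
}
```

It would be called as match_words File1 File2. Note that it still runs grep once per word, so File2 is read as many times as there are words.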

A better design is perhaps to prefix the match(es) before each output line, and only loop over the input file once.

awk 'NR==FNR { w[++n] = $0; next }
    { m = "" 
      for (a in w) if ($0 ~ w[a]) m = m (m ? "," : "") w[a]
      if (m) print m ":" $0 }' File1 File2

In brief, we collect the search words in the array w from the first input file. When reading the second input file, we collect matches on all the search words in m; if m is non-empty, we print its value followed by the input line which matched.
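
Note that $0 ~ w[a] treats each word as a regular expression and compares case-sensitively, so Def and XYZ from the sample File1 would not match the lowercase URLs. A hedged variant of the same idea, assuming literal, case-insensitive substring matching is wanted (tag_matches is a hypothetical wrapper name used only for illustration):

```shell
# tag_matches WORDS_FILE URLS_FILE
# Hypothetical wrapper: prefixes each matching URL with the word(s) it contains.
tag_matches() {
    awk 'NR==FNR {
             sub(/[ \t\r]+$/, "")                  # drop trailing whitespace from the word
             if ($0 != "") w[++n] = tolower($0)    # store words lowercased, in file order
             next
         }
         {
             m = ""
             line = tolower($0)
             # index() is a literal substring search, so regex metacharacters are harmless
             for (i = 1; i <= n; i++)
                 if (index(line, w[i])) m = m (m ? "," : "") w[i]
             if (m) print m ":" $0
         }' "$1" "$2"
}
```

Iterating with a numeric index instead of for (a in w) also keeps the prefixes in the order the words appear in the first file.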

CodePudding user response:

You appear to want case-insensitive matching.

An awk solution:

$ cat <<'EOD' >file1
abc
Def
XYZ
missing
EOD
$ cat <<'EOD' >file2
Https://gitlab.private.com/apm-team/mi_abc_linux1.git
Https://gitlab.private.com/apm-team/mi_abc_linux2.git
Https://gitlab.private.com/apm-team/mi_abc_linux3.git
Https://gitlab.private.com/apm-team/mi_xyz_linux1.git
Https://gitlab.private.com/apm-team/mi_xyz_linux2.git
Https://gitlab.private.com/apm-team/mi_def_linux1.git
Https://gitlab.private.com/apm-team/mi_def_linux2.git
EOD
$ awk '
    # create lowercase versions
    {
        lc = tolower($0)
    }

    # loop over lines of file1
    # store search strings in array
    # key is search string, value will be results found
    NR==FNR {
        h[lc]
        next
    }

    # loop over lines of file2
    # if search string found, append line to results
    {
        for (s in h)
            if (lc ~ s)
                h[s] = h[s]"\n"$0
    }

    # loop over search strings and print results
    # if no result, show error message
    END {
        for (s in h)
            print s":"( h[s] ? h[s] : "\nno matching" )
    }
' file1 file2
missing:
no matching
def:
Https://gitlab.private.com/apm-team/mi_def_linux1.git
Https://gitlab.private.com/apm-team/mi_def_linux2.git
abc:
Https://gitlab.private.com/apm-team/mi_abc_linux1.git
Https://gitlab.private.com/apm-team/mi_abc_linux2.git
Https://gitlab.private.com/apm-team/mi_abc_linux3.git
xyz:
Https://gitlab.private.com/apm-team/mi_xyz_linux1.git
Https://gitlab.private.com/apm-team/mi_xyz_linux2.git
$
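
The order of for (s in h) is unspecified in awk, which is why the results above do not follow the order of file1. A sketch of the same approach that keeps the file1 order, assuming that is desirable, and that uses index() so the words are matched literally rather than as regular expressions (report_matches and the array names order, seen and res are illustrative):

```shell
# report_matches WORDS_FILE URLS_FILE
# Hypothetical wrapper around an order-preserving version of the script above.
report_matches() {
    awk '
        { lc = tolower($0) }

        # first file: remember each distinct search string and its position
        NR==FNR {
            if (lc != "" && !(lc in seen)) { seen[lc]; order[++n] = lc }
            next
        }

        # second file: append the line to every search string it contains
        {
            for (i = 1; i <= n; i++)
                if (index(lc, order[i]))
                    res[order[i]] = res[order[i]] "\n" $0
        }

        # print results in the order the words appeared in the first file
        END {
            for (i = 1; i <= n; i++)
                print order[i] ":" (res[order[i]] != "" ? res[order[i]] : "\nno matching")
        }
    ' "$1" "$2"
}
```

It would be called as report_matches file1 file2; like the original, it reads each file only once.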