I would like to understand how to extract all links (starting with www and ending with .com) from a text body such as the one below. A line may contain zero, one, or several links.
$ cat body.txt
text more-text url="http://www.link1.com">textblabla textbla=textblabla url="http://www.link2.com">textblabla textblabla=textblabla textblabla
url="http://www.link3.com"> textblabla textblablabla=bla
Desired output:
www.link1.com
www.link2.com
www.link3.com
CodePudding user response:
Hope this helps:
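myStr='text more-text url="http://www.link1.com">textblabla textbla=textblabla url="http://www.link2.com">textblabla textblabla=textblabla textblabla url="http://www.link3.com"> textblabla textblablabla=bla'
# Bash's =~ uses POSIX ERE, which has no lazy quantifier (.*?); match a dot-free label instead.
re='www\.[^.]*\.com'
# The unquoted expansion splits the string on whitespace; here each word holds at most one link.
for aString in $myStr; do
    if [[ $aString =~ $re ]]; then
        echo "${BASH_REMATCH[0]}"
    fi
done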
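The loop above relies on whitespace splitting, so it reports at most one link per word. If the input comes from a file and several links might sit in the same word, a pure-bash variant could look roughly like the sketch below. The function name `extract_links` is made up for illustration, and the pattern assumes the part between `www.` and `.com` contains no dots.

```bash
#!/usr/bin/env bash
# extract_links: hypothetical helper name, not part of the original answer.
# Reads text on stdin and prints every www.<label>.com match, one per line.
extract_links() {
    local re='www\.[^.]*\.com'
    local line
    # "|| [[ -n $line ]]" also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # Keep matching until the remainder of the line has no more links.
        while [[ $line =~ $re ]]; do
            echo "${BASH_REMATCH[0]}"
            # Drop everything up to and including the first occurrence of the match.
            line=${line#*"${BASH_REMATCH[0]}"}
        done
    done
}
```

It would be called as `extract_links < body.txt`. Each pass prints the leftmost match and then shortens the line, so the loop ends once no link is left.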
CodePudding user response:
Using grep
$ grep -o 'www\.[^.]*\.com' input_file
www.link1.com
www.link2.com
www.link3.com
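For reference, `-o` makes grep print only the matched text, one match per output line, which is why several links on the same input line come out separately. Note that `-o` is an extension available in GNU and BSD grep rather than a POSIX option. A slightly stricter variant of the pattern, sketched below under the assumption that a link never contains whitespace or a double quote, avoids matches that run across unrelated text; the wrapper name `grep_links` is hypothetical.

```bash
#!/usr/bin/env bash
# grep_links: hypothetical wrapper name, not part of the original answer.
# Reads text on stdin and prints each www.<label>.com match on its own line.
grep_links() {
    # [^.[:space:]"]* excludes dots, whitespace and double quotes from the label,
    # so a match cannot span from one word into another.
    grep -o 'www\.[^.[:space:]"]*\.com'
}
```

Usage would be `grep_links < body.txt`.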