make the bash script to be faster

Time:09-16

I have a fairly large list of websites in "file.txt" and want to check whether the words "Hello World!" appear on each site in the list, using a loop and curl.

For example, "file.txt" contains:

blabla.com
blabla2.com
blabla3.com

And this is my code:

#!/bin/bash
put() {
  printf "list : "
  read list
  run=$(cat $list)
}
put
scan_list() {
  for run in $(cat $list);do
    if [[ $(curl -skL ${run}) =~ "Hello World!" ]];then
      printf "${run} Hello World! \n"
    else
      printf "${run} No Hello:( \n"
    fi
  done
}
scan_list

This takes a lot of time. Is there a way to make the checking process faster?

CodePudding user response:

Use xargs.

% tr '\12' '\0' < file.txt | \
    xargs -0 -I {} -t -P 3 ./cme.sh {}
./cme.sh https://aonprd.com/
./cme.sh https://www.d20pfsrd.com/
./cme.sh http://news.com
https://aonprd.com/ No Hello:(
https://www.d20pfsrd.com/ No Hello:(
http://news.com No Hello:(
  • Use tr to convert the newlines in file.txt to nulls (\12 is the octal code for a line feed, \0 is the null byte).
  • Pass the result through xargs with the -0 option so that items are split on nulls.
  • The -I {} option sets the replacement string.
  • The -t option is for debugging; it prints each command before it is run.
  • Finally, the -P 3 option runs up to 3 processes in parallel.
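
To see what these options do without touching the network, the same pipeline can be run with echo standing in for ./cme.sh. This is only a sketch: run_parallel_demo is a made-up function name, the site names are the placeholders from the question, and -t is left out so that only the command output appears.

```shell
#!/bin/sh
# Offline sketch: echo replaces ./cme.sh, so no request is sent.
# run_parallel_demo is a hypothetical name used only for this example.
run_parallel_demo() {
  printf 'blabla.com\nblabla2.com\nblabla3.com\n' |
    tr '\12' '\0' |                        # newline-separated -> null-separated
    xargs -0 -I {} -P 3 echo "checking {}" # one echo per item, up to 3 at once
}
```

Because the three jobs run concurrently, the order of the output lines is not guaranteed.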

The cme.sh script runs curl and checks whether the response contains "Hello World!":

#!/bin/sh
if curl -skL ${1} | grep -q "Hello World!"; then echo "${1} Hello World\!"; exit; fi
echo "${1} No Hello:("

CodePudding user response:

This should be considered a comment on James Risner's answer, not an answer on its own, and will be deleted after being seen by its intended audience.

Taking that same approach, but implementing it without the separate cme.sh would look like:

tr '\12' '\0' < file.txt | \
  xargs -0 -r -n 1 -t -P 3 sh -c '
    if curl -skL "$1" | grep -q "Hello World!"; then
      echo "$1 Hello World!"
      exit
    fi
    echo "$1 No Hello:("
  ' _

Note:

  • The ${1} on the curl command line is now quoted. Either "$1" or "${1}" is equally correct; without quotes, the value is subject to word-splitting and glob expansion.
  • There's no -I given to xargs; instead, we're passing -n 1 to specify only one item per copy of sh. If you wanted a number larger than 1, to amortize the cost of starting new copies of sh, you might change the script to a loop: sh -c 'for arg; do if curl -skL "$arg" | grep ... A for loop with no "in" clause iterates over "$@", so arg is assigned each argument in turn. Mind that you want to keep -n small enough that -P can still split the overall workload into enough batches to keep all of the parallel jobs busy.
  • Passing the -r argument to xargs ensures graceful handling for the case where xargs doesn't find any items in file.txt at all.
  • Escaping a ! in double quotes is only necessary in interactive shells with history expansion enabled. Because our entire script is inside a single-quoted string, it's not necessary here even if the outer shell is interactive; and the inner shell -- the copy of sh started by xargs -- is definitely noninteractive and so has history expansion turned off.
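
The loop variant described in the -n note might look roughly like the following. It is a sketch rather than a tested replacement: scan_batched is a hypothetical name, and the batch size of 5 and the 3 parallel jobs are arbitrary values to tune for the list at hand.

```shell
#!/bin/sh
# Sketch: hand several URLs to each sh process instead of one.
# scan_batched is a hypothetical name; -n 5 and -P 3 are arbitrary choices.
scan_batched() {
  # $1: file with one site per line
  tr '\12' '\0' < "$1" |
    xargs -0 -r -n 5 -P 3 sh -c '
      # "for arg" with no "in" list iterates over "$@", i.e. this batch
      for arg; do
        if curl -skL "$arg" | grep -q "Hello World!"; then
          echo "$arg Hello World!"
        else
          echo "$arg No Hello:("
        fi
      done
    ' _
}

# Usage: scan_batched file.txt
```

Each sh process now handles up to five sites, so fewer shells are started, while -P 3 still allows three batches to run at the same time.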