How to use xargs to execute a Python script in parallel with input taken from a file?

I have a Python script that takes a URL as its first command-line argument:

from urllib.parse import urlparse
import sys
import asyncio

from wapitiCore.main.wapiti import Wapiti, logging


async def scan(url: str):
    wapiti = Wapiti(url)
    wapiti.set_max_scan_time(30)
    wapiti.set_max_links_per_page(20)
    wapiti.set_max_files_per_dir(10)

    wapiti.verbosity(2)
    wapiti.set_color()
    wapiti.set_timeout(20)
    wapiti.set_modules("xss")
    wapiti.set_bug_reporting(False)

    parts = urlparse(url)
    wapiti.set_output_file(f"/tmp/{parts.scheme}_{parts.netloc}.json")
    wapiti.set_report_generator_type("json")

    wapiti.set_attack_options({"timeout": 20, "level": 1})

    stop_event = asyncio.Event()
    await wapiti.init_persister()
    await wapiti.flush_session()
    await wapiti.browse(stop_event, parallelism=64)
    await wapiti.attack(stop_event)

if __name__ == "__main__":
    asyncio.run(scan(sys.argv[1]))

How can I use xargs to run this script on multiple URLs from a file in parallel?

urls.txt

https://jeboekindewinkel.nl/
https://www.codestudyblog.com/

CodePudding user response：

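I believe a bash script along these lines would work. It starts one background job per line of urls.txt, with no limit on how many run at once.

while IFS= read -r line
do
    python scriptName.py "$line" &
done < urls.txt
wait  # block until every background job has finished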
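The loop above starts every job at once, which may be too much for a long URL list. A minimal sketch of one way to bound it, assuming a POSIX shell: launch jobs in batches and wait for each batch before starting the next. The function name run_batched and its argument order are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: run COMMAND once per non-empty line of FILE,
# in batches of at most MAX background jobs.
# Usage: run_batched MAX FILE COMMAND [ARGS...]
run_batched() {
    max=$1
    file=$2
    shift 2
    count=0
    while IFS= read -r line; do
        # Skip blank lines so no job is started without an argument.
        [ -n "$line" ] || continue
        # Detach stdin so a job can never consume the loop's input.
        "$@" "$line" < /dev/null &
        count=$((count + 1))
        if [ "$count" -ge "$max" ]; then
            wait        # the whole batch must finish before the next one starts
            count=0
        fi
    done < "$file"
    wait                # wait for the final, possibly partial, batch
}

# Example, assuming scriptName.py and urls.txt are in the current directory:
# run_batched 4 urls.txt python scriptName.py
```

Batching is simple but not optimal: one slow URL holds up its whole batch. xargs -P keeps the worker slots full instead, which is usually the better fit.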

CodePudding user response：

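This runs your script once per line of urls.txt, in parallel. With -P0, xargs starts as many processes at once as it can; it does not limit them to the number of CPU cores. Pass an explicit number (for example -P4) to cap the concurrency.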

cat urls.txt | xargs -L1 -P0 python script.py

Reference

-P maxprocs
    Parallel mode: run at most maxprocs invocations of utility at once.
    If maxprocs is set to 0, xargs will run as many processes as possible.

-L number
    Call utility for every number non-empty lines read.  A line ending with a
    space continues to the next non-empty line.  If EOF is reached and fewer
    lines have been read than number then utility will be called with the
    available lines.  The -L and -n options are mutually-exclusive; the last
    one given will be used.
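As a sketch of the -P option described above with an explicit cap rather than 0, the wrapper below runs a command once per input item with a bounded number of concurrent processes. The name run_parallel is hypothetical. It uses -n 1 instead of -L1, which assumes each URL contains no whitespace, and it assumes an xargs that supports -P (GNU and BSD both do; POSIX does not require it).

```shell
# Hypothetical helper: run COMMAND once per whitespace-separated item in FILE,
# with at most MAXPROCS invocations running at the same time.
# Usage: run_parallel MAXPROCS FILE COMMAND [ARGS...]
run_parallel() {
    maxprocs=$1
    file=$2
    shift 2
    # -n 1: pass exactly one item to each invocation.
    # -P N: keep at most N invocations running at once.
    xargs -n 1 -P "$maxprocs" "$@" < "$file"
}

# Example, assuming script.py and urls.txt are in the current directory:
# run_parallel 4 urls.txt python script.py
```

Output from parallel invocations can arrive in any order, so the script in the question does well to write each report to its own file.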