How to use xargs to execute python script in parallel fashion that takes input from file?

Time:02-22

I have a Python script that takes a URL as its first command-line argument:

from urllib.parse import urlparse
import sys
import asyncio

from wapitiCore.main.wapiti import Wapiti, logging


async def scan(url: str):
    wapiti = Wapiti(url)
    wapiti.set_max_scan_time(30)
    wapiti.set_max_links_per_page(20)
    wapiti.set_max_files_per_dir(10)

    wapiti.verbosity(2)
    wapiti.set_color()
    wapiti.set_timeout(20)
    wapiti.set_modules("xss")
    wapiti.set_bug_reporting(False)

    parts = urlparse(url)
    wapiti.set_output_file(f"/tmp/{parts.scheme}_{parts.netloc}.json")
    wapiti.set_report_generator_type("json")

    wapiti.set_attack_options({"timeout": 20, "level": 1})

    stop_event = asyncio.Event()
    await wapiti.init_persister()
    await wapiti.flush_session()
    await wapiti.browse(stop_event, parallelism=64)
    await wapiti.attack(stop_event)

if __name__ == "__main__":
    asyncio.run(scan(sys.argv[1]))

How can I use xargs to run this script in parallel on multiple URLs read from a file?

urls.txt

https://jeboekindewinkel.nl/
https://www.codestudyblog.com/

CodePudding user response:

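I believe a Bash script along these lines would work. It starts one background process per URL, with no limit on how many run at once, and then waits for all of them to finish.

# Read urls.txt line by line and launch one background job per URL.
while IFS= read -r line
do
    python scriptName.py "$line" &
done < urls.txt
# Block until every background job has exited.
wait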

CodePudding user response:

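This runs your script once per line of urls.txt, concurrently. Note that -P0 does not tie the number of processes to the number of CPU cores: it lets xargs run as many processes as possible at once. To cap concurrency, pass a number instead, for example -P4.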

cat urls.txt | xargs -L1 -P0 python script.py

Reference, from the xargs man page:

-P maxprocs
    Parallel mode: run at most maxprocs invocations of utility at once.
    If maxprocs is set to 0, xargs will run as many processes as possible.

-L number
    Call utility for every number non-empty lines read.  A line ending with a
    space continues to the next non-empty line.  If EOF is reached and fewer
    lines have been read than number then utility will be called with the
    available lines.  The -L and -n options are mutually-exclusive; the last
    one given will be used.
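
To make the effect of -L1 combined with -P concrete, here is a rough Python sketch of the same idea: one child process per non-empty line, with at most max_procs running at a time. This is only an illustration, not how xargs is implemented. The names read_urls, run_one and run_parallel are made up for this example, and unlike xargs it passes each whole line as a single argument instead of splitting it into words.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def read_urls(lines):
    """Return the non-empty lines, stripped of surrounding whitespace."""
    return [line.strip() for line in lines if line.strip()]


def run_one(command, url):
    """Run `command url` as a child process and return its exit status."""
    return subprocess.run([*command, url]).returncode


def run_parallel(command, urls, max_procs):
    """Run one process per URL, with at most max_procs alive at once.

    Each worker thread blocks on its child process, so the thread pool
    size acts as the concurrency cap (the role -P plays for xargs).
    Exit statuses come back in the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(lambda url: run_one(command, url), urls))
```

Assuming the scanner is saved as script.py, a call such as run_parallel([sys.executable, "script.py"], read_urls(open("urls.txt")), max_procs=4) would behave roughly like xargs -L1 -P4 python script.py fed from urls.txt.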