#!/bin/bash
data_dir=./all
for file_name in "$data_dir"/*
do
echo "$file_name"
python process.py "$file_name"
done
This script processes the files in a directory sequentially, in a 'for' loop. Is it possible to start multiple instances of process.py so that the files are processed concurrently? I want to do this in a shell script.
CodePudding user response:
It's better to use os.listdir and subprocess.Popen to start the new processes from Python itself.
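A minimal sketch of that approach might look like this (assuming, as in the question, that process.py takes a single file path as its argument and the files live in ./all):

import os
import subprocess

data_dir = "./all"

# Launch one process.py per file without waiting for each one to finish.
procs = []
for name in os.listdir(data_dir):
    path = os.path.join(data_dir, name)
    print(path)
    procs.append(subprocess.Popen(["python", "process.py", path]))

# Wait for all of the workers to complete.
for proc in procs:
    proc.wait()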
CodePudding user response:
I have another possibility for you, if still needed. It uses the screen command to create a new detached process with the supplied command.
Here is an example:
#!/bin/bash
data_dir=./all
for file_name in "$data_dir"/*
do
echo "$file_name"
screen -dm python process.py "$file_name"
done
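Note that the loop itself does not wait for the jobs or limit how many run at once; each file simply gets its own detached session. You can check on them with screen's own commands, for example:

screen -ls     # list the detached sessions
screen -r      # reattach to one (add the session name if there is more than one)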
CodePudding user response:
With GNU Parallel, like this:
parallel python process.py {} ::: all/*
It will run N jobs in parallel, where N is the number of CPU cores you have, or you can specify, say, -j4 to run only 4 at a time.
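For example, to cap it at 4 concurrent jobs on the same set of files:

parallel -j4 python process.py {} ::: all/*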
Many, many options for (a small example follows this list):
- logging
- splitting/chunking inputs
- tagging/separating output
- staggering job starts
- massaging input parameters
- fail and retry handling
- distributing jobs and data to other machines
- and so on...
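As a rough sketch of a few of those (option names as I remember them from the parallel man page, so check them before relying on this):

# --joblog writes one line per job with its start time, runtime, exit code and command
# --tag prefixes every output line with the input file it came from
# --delay 0.2 staggers job starts by 0.2 seconds
# --retries 2 re-runs a job up to 2 times if it fails
parallel -j4 --joblog process.log --tag --delay 0.2 --retries 2 python process.py {} ::: all/*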
Try putting [gnu-parallel]
in the StackOverflow search box.