Shell - iterate over content of file but do something only the first x lines


So guys,

I need your help finding the fastest and most fault-tolerant solution to my problem. I have a shell script that executes some functions based on a txt file containing a list of files. The list can contain from 1 file to X files. What I would like to do is iterate over the content of the file and run my functions on only 4 items from the list at a time. Once the functions have been executed for those 4 files, move on to the next 4, and keep going until all the files from the list have been processed.

My code so far is as follows:

#!/bin/bash

number_of_files_in_folder=$(cat list.txt | wc -l)
max_number_of_files_to_process=4
Translated_files=/home/german_translated_files/

while IFS= read -r files
do  
        while [[ $number_of_files_in_folder -gt 0 ]]; do
            i=1
            while [[ $i -le $max_number_of_files_to_process ]]; do
                my_first_function "$files" &                                                  # I execute my translation function for each file, as it can only perform 1 file per execution 
                find /home/german_translator/ -name '*.logs' -exec mv {} $Translated_files \; # As there will be several files generated, I have them copied to another folder
                sed -i "/$files/d" list.txt                                                   # We remove the processed file from within our list.txt file.
                my_second_function                                                            # Without parameters as it will process all the files copied at step 2.
            done
            # here, I want to have all the files processed and don't stop after the first iteration
        done
done < list.txt

Unfortunately, as I am not very good at shell scripting, I do not know how to structure the script so that it does not waste resources and, most importantly, so that it processes everything in the list. Do you have any advice on how to achieve this?

CodePudding user response:

"only 4 items out of the file. Once the functions have been executed for these 4 files, go over to the next 4"

Seems to be quite easy with xargs.

your_function() {
   echo "Do something with $1 $2 $3 $4"
}
export -f your_function

xargs -d '\n' -n 4 bash -c 'your_function "$@"' _ < list.txt
  • xargs -d '\n' - split the input on newlines, so each line of list.txt becomes one argument
  • -n 4 - take four arguments per invocation
  • bash ... - run this command with those 4 arguments
  • _ - the syntax is bash -c <script> $0 $1 $2 etc., so _ fills the $0 slot; see man bash.
  • "$@" - forward all arguments to the function
  • export -f your_function - export the function to the environment so the child bash can pick it up.
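
To get a feel for the batching, here is a hypothetical run (made-up file names; note that -d is a GNU xargs option): a list of six entries yields one call with four arguments and a final call with the remaining two.

printf '%s\n' a.txt b.txt c.txt d.txt e.txt f.txt > list.txt
xargs -d '\n' -n 4 bash -c 'your_function "$@"' _ < list.txt
# Do something with a.txt b.txt c.txt d.txt
# Do something with e.txt f.txt

On the last call $3 and $4 are simply empty, so your_function has to cope with fewer than four arguments.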

"I execute my translation function for each file"

So you execute your translation function once per file, not once per 4 files. If the "translation function" really is per-file, with no state shared between files, consider instead running 4 processes in parallel over the same list with xargs -P 4, as sketched below.
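
Putting this together with the workflow from the question, here is a minimal sketch. It assumes my_first_function and my_second_function are shell functions defined (or sourced) earlier in the script, and it reuses the paths from the question:

#!/bin/bash

# my_first_function and my_second_function are assumed to be defined above.
Translated_files=/home/german_translated_files/
export Translated_files
export -f my_first_function my_second_function

process_batch() {
    # Start the translation for every file of this batch (at most 4) in parallel.
    for file in "$@"; do
        my_first_function "$file" &
    done
    wait    # block until all translations of this batch have finished

    # Collect the generated logs, then post-process the whole batch in one go.
    find /home/german_translator/ -name '*.logs' -exec mv {} "$Translated_files" \;
    my_second_function
}
export -f process_batch

xargs -d '\n' -n 4 bash -c 'process_batch "$@"' _ < list.txt

If there really is no per-batch state, the fully parallel variant xargs -d '\n' -n 1 -P 4 bash -c 'my_first_function "$@"' _ < list.txt keeps 4 translations running at all times instead of waiting for the slowest file of each batch. Either way, the sed -i bookkeeping from the original script becomes unnecessary, since xargs consumes the list itself.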
