How to speed up tail and head in bash

I have a giant text file called stock_messages that looks like this:

H: TSLA
A: id1, 100 
E: id1, 20
F: id2, 250
...
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
...

What I want to do is to create a separate text file with messages for each stock (e.g. AAPL.txt, TSLA.txt, etc).

I wrote a bash script to do that:

start=-1
stock_name=""
# Emit "line_number stock_name" for every header line (H: ...)
grep -n -i '^H' "$file" | awk -F "[:,]" '{print $1, $NF}' | while read -r line; do
  line_number=$(echo $line | awk -F " " '{print $1}')
  if [[ "$start" -gt 0 ]]
  then
    # Copy the previous section: lines start .. line_number-1
    tail -n "+$start" "$file" | head -n "$((line_number - start))" > "./data/${stock_name}.txt"
    echo "saved $stock_name data!"
  fi
  start=$line_number
  stock_name=$(echo $line | awk -F " " '{print $2}')
done

Basically, I'm taking the line numbers of the H lines and using tail and head to extract the lines between them and save each block into a separate file.

The script runs pretty fast initially but it gets really slow very quickly, and I'm not sure why.

Any suggestions would be much appreciated!

CodePudding user response:

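The loop slows down because every iteration's tail and head pipeline reads the file again from the first line up to the current header, so the total work grows roughly quadratically with the file size. If awk is an option, a single pass can write each section to its own file: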

$ awk '/^H:/ {close(stock_message); stock_message=$2".txt"} {print > stock_message}' input_file
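For readability, here is roughly the same command spelled out with comments. This is only a sketch: the function name split_sections and the variable out are made up for illustration, and the guard that skips lines before the first header is an addition that the one-liner above does not have.

```shell
#!/usr/bin/env bash
# split_sections is a hypothetical helper name, not part of the original answer.
# It reads the input once and writes each "H: NAME" section to NAME.txt
# in the current directory.
split_sections() {
  awk '
    /^H:/ {
      if (out != "") close(out)   # close the previous file so open handles do not pile up
      out = $2 ".txt"             # $2 is the symbol after "H:", e.g. TSLA
    }
    out != "" { print > out }     # added guard: ignore lines before the first header
  ' "$1"
}
```

Note that if the same symbol heads more than one section, the later section overwrites the earlier one, because the file is reopened with > after close.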

$ cat AAPL.txt
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
...

$ cat TSLA.txt
H: TSLA
A: id1, 100
E: id1, 20
F: id2, 250
...
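
If awk is not an option, the same single-pass idea can be sketched in plain shell by keeping one output file descriptor open and switching it at every header. This is an assumption-laden sketch, not a tuned solution: split_sections_sh and its two arguments (input file, output directory) are hypothetical names, and a shell read loop is likely to be noticeably slower per line than awk, although it still reads the input only once.

```shell
#!/usr/bin/env bash
# split_sections_sh is a hypothetical helper: split_sections_sh INPUT OUTDIR
split_sections_sh() {
  local input=$1 outdir=$2 line name
  mkdir -p "$outdir"
  exec 3>/dev/null                      # lines before the first header are discarded
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      H:*)
        # Strip the "H:" prefix and any whitespace to get the symbol
        name=$(printf '%s' "${line#H:}" | tr -d '[:space:]')
        exec 3>"$outdir/$name.txt"      # redirect fd 3 to the new section file
        ;;
    esac
    printf '%s\n' "$line" >&3           # every line goes to the current section
  done < "$input"
  exec 3>&-                             # close the last output file
}
```

Called as split_sections_sh stock_messages ./data, this would write ./data/TSLA.txt, ./data/AAPL.txt and so on, matching the layout the question asks for.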