I have a giant text file called stock_messages that looks like this:
H: TSLA
A: id1, 100
E: id1, 20
F: id2, 250
...
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
...
What I want to do is create a separate text file with the messages for each stock (e.g. AAPL.txt, TSLA.txt, etc).
I wrote a bash script to do that:
start=-1
stock_name=""
# List every header line as "<line number> <stock name>"
grep -n -i '^H' "$file" | awk -F "[:,]" '{print $1, $NF}' | while read -r line; do
    line_number=$(echo $line | awk '{print $1}')
    if [[ "$start" -gt 0 ]]
    then
        # Copy the lines from the previous header up to (not including) this one
        tail -n "+$start" "$file" | head -n "$(($line_number-$start))" > "./data/${stock_name}.txt"
        echo "saved $stock_name data!"
    fi
    start=$line_number
    stock_name=$(echo $line | awk '{print $2}')
done
Basically I'm taking the line numbers where the H's are, and using tail and head to pull out the lines in between and save them into a separate file.
The script runs pretty fast initially but it gets really slow very quickly, and I'm not sure why.
Any suggestions would be much appreciated!
CodePudding user response:
If awk is an option:
$ awk '/^H:/ {close(stock_message); stock_message=$2".txt"} {print > stock_message}' input_file
$ cat AAPL.txt
H: AAPL
A: id1, 100
A: id2, 20
E: id1, 80
A: id2, 10
...
$ cat TSLA.txt
H: TSLA
A: id1, 100
E: id1, 20
F: id2, 250
...
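As for why the original script slows down: most likely because every iteration runs tail -n +N on the whole file, and tail has to read from the top of the file to find line N. The further into the file you get, the more each iteration has to re-read, so the total work grows roughly quadratically. The awk one-liner reads the file only once. If you would rather stay in bash, here is a rough single-pass sketch of the same idea. The function name split_stocks and its two arguments (input file, output directory) are made up for illustration, and it assumes every header looks exactly like "H: NAME" and that each stock has a single header block (a repeated header would overwrite the earlier file, same as the awk version).

```shell
#!/usr/bin/env bash
# Sketch only: split_stocks is a hypothetical helper, not part of the original script.
# Usage: split_stocks <input file> <output directory>
split_stocks() {
    local input=$1 outdir=$2 line have_out=0
    mkdir -p "$outdir"
    # Read the input exactly once, line by line
    # (the || part also handles a last line with no trailing newline).
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == "H: "* ]]; then
            # New header: point file descriptor 3 at this stock's file.
            # Assumes the header is exactly "H: NAME".
            exec 3> "$outdir/${line#H: }.txt"
            have_out=1
        fi
        # Lines before the first header are skipped.
        if (( have_out )); then
            printf '%s\n' "$line" >&3
        fi
    done < "$input"
    # Close the last output file, if one was opened.
    if (( have_out )); then
        exec 3>&-
    fi
}
```

You would call it as split_stocks stock_messages ./data. A bash read loop is still usually much slower than awk on a really large file, so treat this as an illustration of the single-pass approach and not as a replacement for the awk command.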