I have a text file as follows:
group
time label1
1 1
2 2
3 3
group
time label2
5 8
6 9
7 10
I want to 1) split the file at each group so that each output file contains just the numbers, 2) remove the group, time, and label header lines, and 3) name each output file after the corresponding 'label'.
Here is what I would like the output to look like:
label1.txt
1 1
2 2
3 3
label2.txt
5 8
6 9
7 10
I have been doing this with awk, but I am running into problems: sometimes files are overwritten with the wrong file name:
awk '!/group /' text.dat > nogroup.txt #removing 'group'
split -dl 5 --additional-suffix=.txt nogroup.txt split.txt
mv graph.dat00.txt label1.txt
mv graph.dat01.txt label2.txt
awk 'NR!=1' label1.txt > label1.txt
awk 'NR!=1' label2.txt > label2.txt
How can I make sure that the file name (label1) is equal to the label in the file?
Thank you!
Answer:
$ awk '$1~/^[0-9]+$/{print > out; next} {close(out); out=$2".txt"}' file
$ head label?.txt
==> label1.txt <==
1 1
2 2
3 3
==> label2.txt <==
5 8
6 9
7 10
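In case it helps to see how that one-liner works, here is the same script spread over several lines with comments. This is only a sketch: it assumes the input looks exactly like the sample in the question, and the scratch directory, the workdir variable, and the input name "file" are just for illustration. The test for a data line is the regex /^[0-9]+$/ (first field is one or more digits).

```shell
# Recreate the sample input from the question in a scratch directory
# (workdir and the file name "file" are illustrative assumptions).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'group' 'time label1' '1 1' '2 2' '3 3' \
              'group' 'time label2' '5 8' '6 9' '7 10' > file

awk '
    # Data line: the first field is all digits, so append the line to
    # the current output file and skip the header rule below.
    $1 ~ /^[0-9]+$/ { print > out; next }

    # Header line ("group" or "time labelN"): close the previous output
    # file and build the next file name from the second field.
    { close(out); out = $2 ".txt" }
' file

cat label1.txt   # prints the three data lines of the first group
```

On a "group" line the second field is empty, so out briefly becomes ".txt", but nothing is ever printed to it because the "time labelN" line that follows resets out before any data line arrives.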
Regarding
awk 'NR!=1' label1.txt > label1.txt
in your code: never write to the same file you're reading. The shell truncates the output file before the command starts to run, so you can end up zapping your input file. Instead do
awk 'NR!=1' label1.txt > tmp && mv tmp label1.txt
or similar.
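A slightly more defensive sketch of that temp-file pattern is below. It uses mktemp instead of a fixed name like tmp, which is my own substitution and not part of the advice above, and it assumes a mktemp that accepts a template argument (GNU and BSD versions do). The scratch directory and sample data are made up for the demonstration.

```shell
# Work in a scratch directory so no real file is touched
# (workdir and the sample contents are illustrative assumptions).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'time label1' '1 1' '2 2' '3 3' > label1.txt

# Write the filtered output to a uniquely named temporary file next to
# the original, and replace the original only if every step succeeded.
tmp=$(mktemp label1.txt.XXXXXX) &&
awk 'NR != 1' label1.txt > "$tmp" &&
mv "$tmp" label1.txt

cat label1.txt   # the header line is gone, the data lines remain
```

Because of the && chain, label1.txt is left untouched if awk fails. One thing to be aware of: the file created by mktemp usually has restrictive permissions, and mv carries those over to label1.txt.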