How to safely extract header information and use it to name the output file with awk?

Time:06-01

I have a text file as follows:

group 
time    label1
1         1
2         2
3         3
group 
time   label2
5        8
6        9
7        10

I want to 1) split the file so that each output file contains only the numeric rows, 2) remove the group, time, and label header lines, and 3) name each output file after the corresponding 'label'.

Here is what I would like the output to look like:

label1.txt

1         1
2         2
3         3

label2.txt

5        8
6        9
7        10

I have been doing this with awk; however, I sometimes run into problems where files are overwritten or end up with the wrong file name:

awk '!/group /' text.dat > nogroup.txt #removing 'group'
 
split -dl 5 --additional-suffix=.txt nogroup.txt split.txt
 
mv graph.dat00.txt label1.txt
mv graph.dat01.txt label2.txt
  
awk 'NR!=1' label1.txt > label1.txt
awk 'NR!=1' label2.txt > label2.txt

How can I make sure that the file name (label1) is equal to the label in the file?

Thank you!

CodePudding user response:

$ awk '$1~/^[0-9]+$/{print > out; next} {close(out); out=$2".txt"}' file

$ head label?.txt
==> label1.txt <==
1         1
2         2
3         3

==> label2.txt <==
5        8
6        9
7        10
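
To make the one-liner above easier to follow, here is a commented sketch of the same idea. It assumes the input is saved as text.dat (the name used in the question) and that every block of numbers is preceded by a "time <label>" line. Unlike the one-liner, it only switches output files on lines whose first field is "time", which is a small variation rather than the exact command above.

```shell
# Recreate the sample input from the question.
cat > text.dat <<'EOF'
group 
time    label1
1         1
2         2
3         3
group 
time   label2
5        8
6        9
7        10
EOF

awk '
    # Data row: the first field is all digits, so write the row
    # to the file named after the most recent label.
    $1 ~ /^[0-9]+$/ { print > out; next }

    # Header row "time <label>": close the previous output file
    # and switch to <label>.txt.
    $1 == "time"    { close(out); out = $2 ".txt" }

    # Any other line (e.g. "group") matches neither rule and is dropped.
' text.dat
```

Because the output name is taken from the header line itself, the file name cannot drift out of step with the label, even when the groups have different numbers of rows.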

Regarding awk 'NR!=1' label1.txt > label1.txt in your code: never write to the same file you are reading. The shell truncates the output file before the command starts to run, so you can end up zapping your input file. Instead, write to a temporary file and then rename it, e.g. awk 'NR!=1' label1.txt > tmp && mv tmp label1.txt or similar.
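
As a minimal sketch of that temporary-file pattern: the sample contents of label1.txt and the temporary name label1.txt.tmp below are made up for illustration. The temporary file sits next to the original so that mv is a plain rename.

```shell
# Hypothetical sample file: one header line followed by two data rows.
printf 'time label1\n1 1\n2 2\n' > label1.txt

# Write the filtered output to a temporary file first, then replace
# the original only if awk exited successfully.
tmp=label1.txt.tmp
awk 'NR != 1' label1.txt > "$tmp" && mv "$tmp" label1.txt

cat label1.txt
```

The && matters here: if awk fails, mv is skipped and the original file is left untouched.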
