How to insert a column at the start of a txt file using awk?

Time:10-14

How can I use awk to insert a column at the start of a txt file, numbered from 1 to 2059 (the number of rows in my file)? I know the command will be something like this:

awk '{$1=" "}1' File

I'm not sure what to put between the quotation marks to get 1 to 2059.

I also want to include a header in the header row, so technically 1 should only go in the second row.

Input:

ID            Heading1
RQ1293939     -7.0494
RG293I32SJ    -903.6868
RQ19238983    -0899977
rq747585950   988349303

Desired output:

FID  ID            Heading1
1    RQ1293939     -7.0494
2    RG293I32SJ    -903.6868
3    RQ19238983    -0899977
4    rq747585950   988349303

So I need to insert an FID column with 1 to 2059 running down the first column.

CodePudding user response:

What you show does not work: it just replaces the first field ($1) with a space and prints the result. If you do not have empty lines, try:

awk 'NR==1 {print "FID\t" $0; next} {print NR-1 "\t" $0}' File

Explanations:

  • NR is the awk variable that counts the records (the lines, in our case), starting from 1. So NR==1 is a condition that holds only when awk processes the first line. In this case the action block says to print FID, a tab (\t), and the original line ($0), and then to move on to the next line.
  • The second action block is executed only if the first one has not been executed (due to the final next statement). It prints NR-1, that is the line number minus one, a tab, and the original line.
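
As a quick check of the two action blocks described above, here is a minimal sketch that feeds a few sample lines to the same awk program through a pipe instead of a file. The tab-separated sample data and the variable name out are assumptions made for illustration only.

```shell
# Sample input (assumed): a header row followed by two data rows.
out=$(printf 'ID\tHeading1\nRQ1293939\t-7.0494\nRG293I32SJ\t-903.6868\n' |
  awk 'NR==1 {print "FID\t" $0; next} {print NR-1 "\t" $0}')

# The header line is prefixed with FID; data lines are numbered from 1.
printf '%s\n' "$out"
```

With a real file, drop the printf and pass the file name as the last argument to awk, as in the command above.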

If you have empty lines and you want to skip them, we will need a counter variable to keep track of the current non-empty line number:

awk 'NR==1 {print "FID\t" $0; next} NF==0 {print; next} {print ++cnt "\t" $0}' File

Explanations:

  • NF is the awk variable that counts the fields in a record (the space-separated words, in our case). So NF==0 is a condition that holds only on empty lines (or lines that contain only spaces). In this case the action block says to print the empty line and move to the next.
  • The last action block is executed only if neither of the other two has been executed (due to their final next statement). It increments the cnt variable, prints it, prints a tab, and prints the original line.
  • Uninitialized awk variables (like cnt in our example) take value 0 when they are used for the first time as a number. ++cnt increments variable cnt before its value is used by the print command. So the first time this block is executed cnt takes value 1 before being printed. Note that cnt++ would increment after the printing.
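
The following sketch illustrates the counter approach on input that contains an empty line, assuming the counter is written with the pre-increment form ++cnt. The sample data and the variable names out and inc are made up for illustration.

```shell
# Sample input (assumed): header, one data row, an empty line, another data row.
out=$(printf 'ID\tHeading1\nRQ1293939\t-7.0494\n\nRG293I32SJ\t-903.6868\n' |
  awk 'NR==1 {print "FID\t" $0; next} NF==0 {print; next} {print ++cnt "\t" $0}')

# The empty line is passed through unnumbered, so the last row gets 2, not 3.
printf '%s\n' "$out"

# Pre- versus post-increment on uninitialized variables:
# ++a yields 1, b++ yields 0 and only then sets b to 1.
inc=$(awk 'BEGIN { print ++a; print b++; print b }')
printf '%s\n' "$inc"
```

If the empty lines should be dropped from the output rather than kept, replacing {print; next} with {next} in the NF==0 block would do that.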

CodePudding user response:

Assuming you don't really have a blank row between your header line and the rest of your data:

awk '{print (NR>1 ? NR-1 : "FID"), $0}' file

Use awk -v OFS='\t' '...' file if you want the output to be tab-separated or pipe it to column -t if you want it visually tabular.
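
A small sketch of the tab-separated variant, using made-up space-separated sample data and an illustrative variable name out. Note that OFS only separates the new first column from the rest: $0 is printed unchanged, so the spaces inside the original line stay as they are.

```shell
# Sample input (assumed): space-separated header and one data row.
out=$(printf 'ID Heading1\nRQ1293939 -7.0494\n' |
  awk -v OFS='\t' '{print (NR>1 ? NR-1 : "FID"), $0}')

# A tab follows FID and 1; the original lines keep their single spaces.
printf '%s\n' "$out"
```

Piping the output to column -t instead, where that utility is installed, realigns all columns for display.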
