How can I append a string on a specific column on the lines that match a condition on a txt file usi

Time:10-29

I have a text file with a bunch of serial numbers that are supposed to be 16 characters long. Some of the records were damaged and are only 13 characters long. I want to add 3 zeros at the beginning of every record that is 13 characters long.

Note: The serial numbers don't start at the beginning of the line; they all start at column 15 of every line.

My file currently looks like this:

1:6822:26: :A:0000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:0000000999994: :MAT:13L324 : :01: : : :: : : :
1:6822:26: :A:0000000999995: :CAT:P13WFB : :01: : : ::0529483697940: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170892: :AZDG-3 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170893: :AZDG-4 :0000003999:01:0000000000: : :: : : :

And the output should be:

1:6822:26: :A:0000000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:0000000000999994: :MAT:13L324 : :01: : : :: : : :
1:6822:26: :A:0000000000999995: :CAT:P13WFB : :01: : : ::0529483697940: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170892: :AZDG-3 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170893: :AZDG-4 :0000003999:01:0000000000: : :: : : :

This is the code I made to get the records that are shortened:

    #!/bin/bash
    i=1
    for OUTPUT in $(cut -c15-30 file.txt)
    do
       if [[ ${#OUTPUT} == 13 ]]
       then
              echo "$OUTPUT"
              echo "$i"
              i=$((i + 1))
       fi
    done

The text file has more than 50,000 records, so I can't change them manually.

CodePudding user response:

This sed one-liner should do the job:

sed 's/^\(.\{14\}\)\([0-9]\{13\}[^0-9]\)/\1000\2/' file

This assumes serial numbers consist of decimal digits only and trusts that they all start at the column 15 of every line.
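As a quick sanity check, the one-liner can be run against two lines copied from the question, one damaged and one intact. This is only a sketch under the same assumptions as above (digits only, serial number starting at column 15); the variable names `sample` and `padded` are made up for the demonstration.

```shell
# Two sample lines from the question: one damaged (13 digits), one intact (16 digits).
sample='1:6822:26: :A:0000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :'

# \(.\{14\}\) captures the 14 characters that precede the serial number.
# \([0-9]\{13\}[^0-9]\) matches exactly 13 digits followed by a non-digit,
# so a 16-digit serial number does not match and is left alone.
padded=$(printf '%s\n' "$sample" | sed 's/^\(.\{14\}\)\([0-9]\{13\}[^0-9]\)/\1000\2/')

printf '%s\n' "$padded"
```

Only the first line should change; the intact line passes through untouched because its 14th digit is followed by another digit.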

Or, an awk solution:

awk 'BEGIN { FS=OFS=":" } length($6) == 13 { $6 = "000" $6 } 1 ' file

This one only checks if the length of the sixth field is 13 and trusts that sixth field is the serial number field.
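If the length check alone feels too trusting, the two conditions can be combined. The following is a minimal sketch, not part of the original answer, that pads the sixth field only when it is 13 characters long and contains nothing but digits. The second sample line (serial `ABC0000999993`) is a hypothetical record invented to show a field that is skipped.

```shell
# Line 1: damaged record from the question.
# Line 2: hypothetical record whose 13-character field contains letters.
# Line 3: intact record from the question.
sample='1:6822:26: :A:0000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:ABC0000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :'

# Pad only if field 6 has length 13 AND has no non-digit character.
# The negated bracket expression avoids relying on {13} interval support.
checked=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = OFS = ":" }
length($6) == 13 && $6 !~ /[^0-9]/ { $6 = "000" $6 }
1
')

printf '%s\n' "$checked"
```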

CodePudding user response:

One awk idea that replaces all of OP's current code:

awk '
BEGIN         { FS=OFS=":" }                # set input/output field delimiter to ":"
length($6)<16 { $6=sprintf("%016d",$6) }    # if length of 6th field < 16 then left-pad the field with 0's to length of 16
1                                           # print current line
' file.txt

This generates:

1:6822:26: :A:0000000000999993:DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:0000000000999994:MAT:13L324 : :01: : : :: : : :
1:6822:26: :A:0000000000999995:CAT:P13WFB : :01: : : ::0529483697940: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170892: :AZDG-3 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170893: :AZDG-4 :0000003999:01:0000000000: : :: : : :
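One caveat with a `d` conversion in `sprintf` is that it treats the field as a number, so a serial number containing a non-digit character would be reduced to its numeric prefix. If that is a concern, the padding can be done on the string instead. This is a sketch of that variation, not the answer's own method; the second sample line (serial `00000009999A3`) is a hypothetical record used only to illustrate the difference.

```shell
# Line 1: damaged record from the question.
# Line 2: hypothetical record with a letter inside the serial number.
sample='1:6822:26: :A:0000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:00000009999A3: :DIS:14516E : :01: : : ::0529483733710: : :'

# Prepend "0" one character at a time until field 6 is 16 characters long.
# The field is never converted to a number, so letters survive unchanged.
padded_str=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = OFS = ":" }
{ while (length($6) < 16) $6 = "0" $6 }
1
')

printf '%s\n' "$padded_str"
```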

CodePudding user response:

I took the liberty to tack a : on ...

$ awk '{if(length($2)<19){$2=gensub(/^(:.:)/,"\\1000","1",$2)":"}}1' file.txt 
1:6822:26: :A:0000000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:0000000000999994: :MAT:13L324 : :01: : : :: : : :
1:6822:26: :A:0000000000999995: :CAT:P13WFB : :01: : : ::0529483697940: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170892: :AZDG-3 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170893: :AZDG-4 :0000003999:01:0000000000: : :: : : :

If that's not what you want, use this: awk '{if(length($2)<19){$2=gensub(/^(:.:)/,"\\1000","1",$2)}}1' file.txt

CodePudding user response:

Another alternative

awk -v{O,}FS=: '{$6=gensub(" ", "0", "g", sprintf("%16s", gensub(" ", "", "g", $6)))}1' file.txt

result

1:6822:26: :A:0000000000999993:DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:0000000000999994:MAT:13L324 : :01: : : :: : : :
1:6822:26: :A:0000000000999995:CAT:P13WFB : :01: : : ::0529483697940: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170892: :AZDG-3 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170893: :AZDG-4 :0000003999:01:0000000000: : :: : : :

CodePudding user response:

Because your question is tagged with bash, here is a pure bash version as an object of study.

# enable extended globbing so the +(0) pattern below is recognized
shopt -s extglob

# init array arr
arr=();

# read current row with field separator : from file to array arr
while IFS=":" read -r -a arr; do

  # remove leading zeros to avoid problem with octal numbers in bash
  # and then pad leading zeros
  printf -v 'arr[5]' "%016d" "${arr[5]##+(0)}";

  # output array arr with field separator :
  for i in "${arr[@]}"; do
    printf '%s:' "$i";
  done;
  printf '\n';

done < file

Output:

1:6822:26: :A:0000000000999993: :DIS:14516E : :01: : : ::0529483733710: : :
1:6822:26: :A:0000000000999994: :MAT:13L324 : :01: : : :: : : :
1:6822:26: :A:0000000000999995: :CAT:P13WFB : :01: : : ::0529483697940: : :
1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170892: :AZDG-3 :0000003999:01:0000000000: : :: : : :
1:6822:26: :3:0000000000170893: :AZDG-4 :0000003999:01:0000000000: : :: : : :

The tool of choice is certainly awk.
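All of the commands above print to standard output and leave the file itself unchanged. One common way to write the result back is to send the output to a temporary file and rename it over the original only if awk succeeded. The sketch below demonstrates that on a throwaway copy; the directory from `mktemp -d`, the `padded.XXXXXX` template and the variable names are assumptions made for the demo, and in real use `file` would point at the actual data file (ideally after taking a backup).

```shell
# Demo setup: a throwaway file holding two lines from the question.
workdir=$(mktemp -d) || exit 1
file="$workdir/file.txt"
printf '%s\n' \
  '1:6822:26: :A:0000000999993: :DIS:14516E : :01: : : ::0529483733710: : :' \
  '1:6822:26: :3:0000000000170891: :AZDG-2 :0000003999:01:0000000000: : :: : : :' \
  > "$file"

# Write the padded records to a temporary file in the same directory,
# then replace the original only if awk exited successfully.
tmp=$(mktemp "$workdir/padded.XXXXXX") || exit 1
if awk 'BEGIN { FS = OFS = ":" } length($6) == 13 { $6 = "000" $6 } 1' "$file" > "$tmp"
then
  mv "$tmp" "$file"
else
  rm -f "$tmp"
fi

cat "$file"
```

Keeping the temporary file in the same directory as the target means the final `mv` is a rename rather than a copy across file systems.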
