Bash finding when specific condition is true using sed with different delimiter

Time:11-23

Similar to the question "Bash replace string where specific condition is true", I want to replace part of a line based on a condition, but I need a different delimiter because the text I am replacing contains /. The condition is also on the second column rather than the first.

For example my data includes:

Location Ref Alt GT1 GT2
1_100004338 T C 0/0 0/0
1_100004339 C T 0/0 0/1
1_100004343 A G 1/1 0/0

I want to replace all occurrences of 0/0 with 2, but only on lines where Ref (column 2) is C, so that the result is:

Location Ref Alt GT1 GT2
1_100004338 T C 0/0 0/0
1_100004339 C T 2 0/1
1_100004343 A G 1/1 0/0

I have tried the following command:

sed " ^"C" s "0/0" "2" g" file

and get the error

sed: -e expression #1, char 2: unknown command: `^'

I am not sure it would have given me what I wanted even if it had worked, since C is in the second column and not at the beginning of the line. I have also tried other approaches, such as awk with a BEGIN block, which is way too slow for a file as big as mine.

Any help would be appreciated and thanks in advance.

CodePudding user response:

This might work for you (GNU sed):

sed '/^\S\+ C /s#0/0#2#g' file

If the second column is C, replace all occurrences of 0/0 with 2. The # characters serve as the delimiter of the s command, so the / in 0/0 does not need escaping.
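Because \S and \+ are GNU extensions, a more portable form of the same idea might look like the sketch below. It assumes the columns are separated by single spaces, as in the sample data, and replace_if_ref_c is a hypothetical helper name used only for illustration.

```shell
# replace_if_ref_c is a hypothetical helper name, not part of the answer above.
# Address /^[^ ][^ ]* C /: a first field of one or more non-space characters,
# one space, then C as the entire second field (assumes single-space separators).
# s#0/0#2#g: # is the delimiter, so the / in 0/0 needs no escaping.
replace_if_ref_c() {
    sed '/^[^ ][^ ]* C /s#0/0#2#g' "$@"
}

# Usage on the sample data from the question:
printf '%s\n' \
    'Location Ref Alt GT1 GT2' \
    '1_100004338 T C 0/0 0/0' \
    '1_100004339 C T 0/0 0/1' \
    '1_100004343 A G 1/1 0/0' |
    replace_if_ref_c
```

With no arguments the helper reads standard input; given a file name, it reads that file instead.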

CodePudding user response:

awk would be more suitable for this.

$ awk '$2=="C" {$4=2}1' input_file
Location Ref Alt GT1 GT2
1_100004338 T C 0/0 0/0
1_100004339 C T 2 0/1
1_100004343 A G 1/1 0/0

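If column 2 ($2) is the string C, then column 4 ($4) is set to 2. Note that this overwrites $4 whatever it contained and leaves the other genotype columns untouched.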
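If every 0/0 on a matching line should become 2, as the question asks, a variant that checks each genotype field might look like the following sketch. It assumes the genotype columns start at column 4, as in the sample data, and replace_gt_if_ref_c is a hypothetical helper name.

```shell
# replace_gt_if_ref_c is a hypothetical helper name used for illustration.
# Assumption: genotype columns start at column 4, as in the sample data.
replace_gt_if_ref_c() {
    awk '
        $2 == "C" {
            # Replace only fields that are exactly 0/0, so a value that
            # merely contains 0/0 as a substring is left alone.
            for (i = 4; i <= NF; i++)
                if ($i == "0/0")
                    $i = "2"
        }
        1   # print every line, modified or not
    ' "$@"
}

# Usage on the sample data from the question:
printf '%s\n' \
    'Location Ref Alt GT1 GT2' \
    '1_100004338 T C 0/0 0/0' \
    '1_100004339 C T 0/0 0/1' \
    '1_100004343 A G 1/1 0/0' |
    replace_gt_if_ref_c
```

Assigning to a field makes awk rebuild the line with single spaces between fields, which matches the sample data but would normalize any other spacing.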

If sed is a must, you can try this.

$ sed '/^[^ ]* C [A-Z]/ {s|0/0|2|g}' input_file
Location Ref Alt GT1 GT2
1_100004338 T C 0/0 0/0
1_100004339 C T 2 0/1
1_100004343 A G 1/1 0/0