I have a string that looks like this:
my_sting="AC=1;AN=249706;AF=4.00471e-06;rf_tp_probability=8.55653e-01;"
It is based on a column in my data:
REF ALT QUAL FILTER INFO
1 C A 3817.77 PASS AN=2;AF=4.00471e06;rf_tp_probability=8.55653
2 C G 3817.77 PASS AN=3;AF=5;rf_tp_probability=8.55653
I want to extract only the part that starts with AF= and ends with the value AF is set to. For example, here: AF=4.00471e-06
I tried this:
print(str_extract_all(my_sting, "AF=.+;"))
[[1]]
[1] "AF=4.00471e-06;rf_tp_probability=8.55653e-01;"
But it returned everything up to the end of the string instead of just AF=4.00471e-06. Is there any way to fix this? Thank you.
CodePudding user response:
You can write the pattern using a negated character class, [^;]+, which matches one or more characters other than a semicolon:
library(stringr)
my_sting="AC=1;AN=249706;AF=4.00471e-06;rf_tp_probability=8.55653e-01;"
print(str_extract_all(my_sting, "AF=[^;]+"))
Output
[[1]]
[1] "AF=4.00471e-06"
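Since the string comes from the INFO column, the same pattern can be applied to the whole column at once. This is a minimal sketch, assuming the data sits in a data frame (here a hypothetical `df` rebuilt from the two rows shown in the question; the column names `AF_field` and `AF` are made up for illustration) and that the numeric value is wanted as well:

```r
library(stringr)

# Hypothetical data frame rebuilt from the two rows in the question
df <- data.frame(
  REF  = c("C", "C"),
  ALT  = c("A", "G"),
  INFO = c("AN=2;AF=4.00471e06;rf_tp_probability=8.55653",
           "AN=3;AF=5;rf_tp_probability=8.55653"),
  stringsAsFactors = FALSE
)

# str_extract() is vectorised: one result per row, NA where nothing matches
df$AF_field <- str_extract(df$INFO, "AF=[^;]+")

# A lookbehind keeps only the value after "AF=", which can then be parsed
df$AF <- as.numeric(str_extract(df$INFO, "(?<=AF=)[^;]+"))

print(df[, c("AF_field", "AF")])
```

str_extract() returns the first match per string as a character vector, which is usually easier to store in a column than the list returned by str_extract_all(). Note that "AF=" would also match at the end of a longer key (for example a hypothetical MAF= field) if one appears earlier in the string, so the pattern may need tightening for real INFO columns.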
CodePudding user response:
Another option: use a lookahead for "followed by ;", i.e. (?=;), together with a lazy .*? so the match stops before the first semicolon:
library(stringr)
my_sting="AC=1;AN=249706;AF=4.00471e-06;rf_tp_probability=8.55653e-01;"
str_extract(my_sting, "AF=.*?(?=;)")
#> [1] "AF=4.00471e-06"
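To see why the attempt in the question overshoots, it may help to compare the variants side by side. The sketch below assumes stringr is installed; the variable names (`x`, `greedy`, `lazy`, `ahead`, `negated`) are only illustrative:

```r
library(stringr)

x <- "AC=1;AN=249706;AF=4.00471e-06;rf_tp_probability=8.55653e-01;"

greedy  <- str_extract(x, "AF=.+;")       # .+ is greedy: runs to the LAST ";"
lazy    <- str_extract(x, "AF=.+?;")      # .+? is lazy: stops at the first ";" but includes it
ahead   <- str_extract(x, "AF=.*?(?=;)")  # lookahead checks for ";" without consuming it
negated <- str_extract(x, "AF=[^;]+")     # [^;]+ can never cross a ";"

print(c(greedy = greedy, lazy = lazy, ahead = ahead, negated = negated))
```

The last two patterns both give "AF=4.00471e-06". They differ when AF is the final field and has no trailing semicolon: the negated class still matches there, while the lookahead version needs a ";" to follow.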