How do I read into a .txt and extract a certain string corresponding to a found string?

Time:12-10

A folder contains a README.txt and several DICOM files named emr_000x.sx (where x is a number). The README.txt has several lines, one of which contains the characters "xyz" along with the corresponding emr_000x.sx file name.

I would like to: read into the .txt, identify which line contains "xyz", and extract the emr_000x.sx from that line only. For reference, the line in the .txt is formatted in this way:

A:emr_000x.sx,  B:00001, C:number, D(characters)string_string_number_**xyz**_number_number

I think grep might be helpful, but I am not familiar enough with bash scripting to write this myself. Does anyone know how to solve this? Many thanks!

CodePudding user response:

You can use awk to match fields in your comma-separated file:

awk -F, '$4 ~ "xyz" {sub(/^A:/, "", $1); print $1}' README.txt
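
To try the awk approach without touching real data, something like the following could be run first. It is only a sketch: the two sample lines, the temporary directory, and the variable name result are made up for illustration and simply follow the line format given in the question.

```shell
# Hypothetical sample data in the format described in the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/README.txt" <<'EOF'
A:emr_0001.sx,  B:00001, C:12, D(abc)foo_bar_1_abc_2_3
A:emr_0002.sx,  B:00002, C:34, D(abc)foo_bar_1_xyz_2_3
EOF

# Split each line on commas. When the 4th field contains "xyz",
# strip the leading "A:" from the 1st field and print what is left.
result=$(awk -F, '$4 ~ "xyz" {sub(/^A:/, "", $1); print $1}' "$tmpdir/README.txt")
echo "$result"

# Clean up the throwaway directory.
rm -r "$tmpdir"
```

With this sample input, only the second line has "xyz" in its fourth field, so the command prints emr_0002.sx.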

CodePudding user response:

I like sed for this sort of thing.

 sed -nE '/xyz/{ s/^.*A:([^,]+),.*/\1/; p; }' README.txt

This says, "On lines where you see xyz, replace the whole line with the run of non-comma characters between A: and the next comma, then print the line."

-n is no printing unless I say so. (p means print.) -E just means to use Extended regexes.

/xyz/{...} means "on lines where you see xyz do the stuff between the curlies."
s/^.*A:([^,]+),.*/\1/ will substitute the matched part (which should be the whole line) with just the part between the parens.
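
The pieces described above can be checked on a small throwaway file. This is a sketch under assumptions: the sample lines, the temporary directory, and the variable name extracted are invented for illustration, and the capture group is written as ([^,]+), meaning one or more non-comma characters.

```shell
# Hypothetical sample data in the format described in the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/README.txt" <<'EOF'
A:emr_0001.sx,  B:00001, C:12, D(abc)foo_bar_1_abc_2_3
A:emr_0002.sx,  B:00002, C:34, D(abc)foo_bar_1_xyz_2_3
EOF

# -n: print nothing unless told to; -E: extended regular expressions.
# On lines containing "xyz", replace the whole line with the text
# captured between "A:" and the first comma, then print it.
extracted=$(sed -nE '/xyz/{ s/^.*A:([^,]+),.*/\1/; p; }' "$tmpdir/README.txt")
echo "$extracted"

# Clean up the throwaway directory.
rm -r "$tmpdir"
```

With this sample input the first line is skipped because it does not contain "xyz", and the command prints emr_0002.sx.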
