How do I read a .txt file and extract the string that corresponds to a matched string?

A folder contains a README.txt and several DICOM files named emr_000x.sx (where x is a number). README.txt has several lines, one of which contains the characters "xyz" together with the corresponding emr_000x.sx filename.

I would like to read the .txt, identify which line contains "xyz", and extract the emr_000x.sx from that line only. For reference, the line in the .txt is formatted like this:

A:emr_000x.sx,  B:00001, C:number, D(characters)string_string_number_**xyz**_number_number

I think grep might be helpful, but I am not familiar enough with bash scripting to write this myself. Does anyone know how to solve this? Many thanks!

CodePudding user response：

You can use awk to match on the comma-separated fields of your file:

awk -F, '$4 ~ "xyz" {sub(/^A:/, "", $1); print $1}' README.txt
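
To see how the fields line up, here is a small self-contained sketch. The two sample lines are invented to follow the format in the question (the real README.txt will differ), and it assumes "xyz" always lands in the fourth comma-separated field.

```shell
# Throwaway directory with a made-up README.txt in the question's format.
tmpdir=$(mktemp -d)
cat > "$tmpdir/README.txt" <<'EOF'
A:emr_0001.sx,  B:00001, C:12, D(abc)foo_bar_1_abc_2_3
A:emr_0002.sx,  B:00002, C:34, D(abc)foo_bar_1_xyz_2_3
EOF

# -F, splits each line on commas. When field 4 contains xyz,
# strip the leading "A:" from field 1 and print what is left.
result=$(awk -F, '$4 ~ /xyz/ {sub(/^A:/, "", $1); print $1}' "$tmpdir/README.txt")
echo "$result"

rm -rf "$tmpdir"
```

If "xyz" could show up in a different field, matching on the whole line with $0 ~ /xyz/ instead of $4 is probably the safer test.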

CodePudding user response：

I like sed for this sort of thing.

 sed -nE '/xyz/{ s/^.*A:([^,]+),.*/\1/; p; }' README.txt

This says, "On lines where you see xyz, replace the whole line with the non-comma characters between A: and the next comma, then print the line."

-n is no printing unless I say so. (p means print.) -E just means to use Extended regexes.

/xyz/{...} means "on lines where you see xyz do the stuff between the curlies."
s/^.*A:([^,]+),.*/\1/ will substitute the matched part (which should be the whole line) with just the part between the parens.
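
If you want the filename in a variable for later use, something along these lines should work. This is only a sketch: the sample lines are made up to match the format in the question, and the variable names readme and dicom are just placeholders.

```shell
# Throwaway sample file; the contents are invented to match the question's format.
readme=$(mktemp)
printf '%s\n' \
  'A:emr_0001.sx,  B:00001, C:12, D(abc)foo_bar_1_abc_2_3' \
  'A:emr_0002.sx,  B:00002, C:34, D(abc)foo_bar_1_xyz_2_3' > "$readme"

# On lines containing xyz, keep only the text between "A:" and the first comma.
dicom=$(sed -nE '/xyz/{ s/^.*A:([^,]+),.*/\1/; p; }' "$readme")

# An empty result means no line contained xyz.
if [ -n "$dicom" ]; then
  echo "matched file: $dicom"
else
  echo "no line containing xyz" >&2
fi

rm -f "$readme"
```

Note that if more than one line contains xyz, the variable will hold several filenames separated by newlines.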