How to fetch a number from a file and assign it to a variable

I have a file that contains an HTML response, as shown below:

<d:ChangeType>SSCR</d:ChangeType>
<d:Status>Success</d:Status>
<d:ShortDescription>API </d:ShortDescription>
<d:CycleTypeId>8000006005</d:CycleTypeId>
<d:RfcNumber>1200000910</d:RfcNumber>
<d:ExtrefNumber>API External</d:ExtrefNumber>

The requirement is to fetch the number between <d:RfcNumber> and </d:RfcNumber> (1200000910 in this example) and assign it to a variable.

I'm trying to use sed like:

sed 's/1200000910.*//' test2.html

but it does not give me the expected result.

Any help on this will be appreciated.

CodePudding user response：

1st solution: Since you are using sed, I assume you are working in a shell, so you could use awk here. Set the field separators to <d:RfcNumber> and <\\/d:RfcNumber> for all lines; in the main program, check whether the number of fields is greater than 2 and, if so, print the 2nd field and exit.

var=$(awk -F'<d:RfcNumber>|<\\/d:RfcNumber>' 'NF>2{print $2;exit}' Input_file)
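To see why NF>2 picks out the right line, here is a small sketch against the sample from the question. The temp file stands in for Input_file and is only an assumption for the demo; the closing tag is written without the backslashes, which should behave the same way in a field-separator regex.

```shell
# Recreate the sample from the question in a temp file (stand-in for Input_file).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<d:ChangeType>SSCR</d:ChangeType>
<d:Status>Success</d:Status>
<d:ShortDescription>API </d:ShortDescription>
<d:CycleTypeId>8000006005</d:CycleTypeId>
<d:RfcNumber>1200000910</d:RfcNumber>
<d:ExtrefNumber>API External</d:ExtrefNumber>
EOF

# Either tag acts as a field separator, so the RfcNumber line splits into
# three fields: empty, the number, empty. Every other line stays at one field.
var=$(awk -F'<d:RfcNumber>|</d:RfcNumber>' 'NF>2{print $2; exit}' "$tmp")
rm -f "$tmp"

echo "$var"
```

The exit stops awk at the first matching line, so a later duplicate tag would not end up in the variable.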

2nd solution: Use GNU awk's match function, whose third argument stores the capture groups in an array, to get the value between the tags.

var=$(awk 'match($0,/^<d:RfcNumber>([^<]*)<\/d:RfcNumber>/,arr){print arr[1];exit}' Input_file)

3rd solution: Or, with sed, try the following code, which uses GNU sed's -E option to enable ERE (extended regular expressions).

var=$(sed -E -n 's/^<d:RfcNumber>([^<]*)<\/d:RfcNumber>/\1/p' Input_file)
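If GNU sed is not available, or the tag might be indented in the real file, something along these lines may be worth trying. It is only a sketch: get_rfc is a hypothetical helper name, the temp file is demo data, and it assumes the value between the tags is all digits.

```shell
# Hypothetical helper: print the first RfcNumber found in the file given as $1.
get_rfc() {
    # Basic regular expressions only: allow leading blanks, capture the digits
    # between the tags, print matching lines; head keeps only the first match.
    sed -n 's/^[[:space:]]*<d:RfcNumber>\([0-9][0-9]*\)<\/d:RfcNumber>.*/\1/p' "$1" | head -n 1
}

# Demo data with an indented tag (an assumption, not the asker's real file).
tmp=$(mktemp)
printf '%s\n' '<d:Status>Success</d:Status>' '  <d:RfcNumber>1200000910</d:RfcNumber>' > "$tmp"

var=$(get_rfc "$tmp")
rm -f "$tmp"

# Check that something was actually captured before using the variable.
if [ -n "$var" ]; then
    echo "RfcNumber is $var"
else
    echo "RfcNumber not found" >&2
fi
```

Testing the variable for emptiness before using it guards against the case where the tag is missing from the response.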

CodePudding user response：

...the split way (PowerShell):

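# Read your file content (gc is an alias for Get-Content)
$response = (gc C:\tmp\testdata.txt)
# Splitting every line on the tags puts the number at index 5 for this sample
$rfc = ($response -split "<d:RfcNumber>|</d:RfcNumber>")[5]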

or regex (but this could probably be optimized):

#Read your file content
$response = (gc C:\tmp\testdata.txt)
($response | select-string "<d:RfcNumber>\d{10}").matches.groups.value -replace "<d:RfcNumber>"

or treat it as XML:

[xml]$xml = '<root>' + (($response -replace "d:") -join $null) + '</root>'
$xml.root.RfcNumber