Missing character in sed grouping

I am trying to get an IP from a string by using sed grouping and I can't figure out what is wrong with this.

input:

echo "instream(10.20.213.11@40266):" | sed -E 's/.*([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/'

output:

0.20.213.11

Why is the first number missing?

CodePudding user response:

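The leading .* is greedy: it consumes as much text as it can while still letting the rest of the pattern match, so it swallows the 1 of 10 and leaves only the 0 for the first [0-9]+. You can replace .* with .*[^0-9], which requires a non-digit character immediately before the digit-matching part of the regex: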

sed -E 's/.*[^0-9]([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/'

This works because the IP address in your string is not at the start of the string, so there is always a non-digit character in front of it.

See the online demo:

#!/bin/bash
echo "instream(10.20.213.11@40266):" | \
  sed -E 's/.*[^0-9]([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/'
# => 10.20.213.11
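
To see why the original pattern dropped a digit, it can help to capture what the leading .* actually consumed. The following is only an illustrative sketch: it wraps the prefix in an extra group (so the IP candidate moves to Group 2) and prints both groups in brackets. The variable names greedy and fixed are made up for this example.

```shell
#!/bin/bash
input='instream(10.20.213.11@40266):'

# Group 1 is the prefix eaten by the leading part, Group 2 is the IP candidate.
# Original pattern: the greedy .* is free to take the first digit of "10".
greedy=$(printf '%s\n' "$input" |
  sed -E 's/(.*)([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/[\1] [\2]/')

# Fixed pattern: the prefix must end in a non-digit, so it stops at "(".
fixed=$(printf '%s\n' "$input" |
  sed -E 's/(.*[^0-9])([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/[\1] [\2]/')

echo "$greedy"   # [instream(1] [0.20.213.11]
echo "$fixed"    # [instream(] [10.20.213.11]
```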

If your IP addresses can be at the string start, you can use

sed -E 's/(.*[^0-9]|^)([0-9]+(\.[0-9]+){3}).*/\2/'

Here, (.*[^0-9]|^) matches either any text up to the right-most non-digit character or the start of the string. The IP address pattern now lands in Group 2, hence the \2 in the replacement (RHS).
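
As a quick check of both cases, here is a minimal sketch that runs the same command on a string that starts with the IP and on the string from the question. The function name extract_ip and the first test string are assumptions made for this example only.

```shell
#!/bin/bash
# extract_ip is a hypothetical helper name, used only for this illustration.
extract_ip() {
  sed -E 's/(.*[^0-9]|^)([0-9]+(\.[0-9]+){3}).*/\2/'
}

# IP at the very start: Group 1 matches the empty string via the ^ alternative.
at_start=$(echo "10.20.213.11@40266" | extract_ip)

# IP in the middle: Group 1 matches "instream(" via the .*[^0-9] alternative.
in_middle=$(echo "instream(10.20.213.11@40266):" | extract_ip)

echo "$at_start"    # 10.20.213.11
echo "$in_middle"   # 10.20.213.11
```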

If your sed supports word boundaries (\b, as GNU sed does), you can use them instead:

sed -E 's/.*\b([0-9]+(\.[0-9]+){3})\b.*/\1/'

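
Since \b is an extension rather than part of POSIX ERE, a script that has to run on different systems could probe for it first. The sketch below is one possible way to do that, not a definitive recipe: the probe string and the function name extract_ip are invented for this example, and the fallback is the (.*[^0-9]|^) form shown earlier.

```shell
#!/bin/bash
# Hypothetical helper: use \b when the local sed honours it, otherwise fall
# back to the portable (.*[^0-9]|^) pattern.
extract_ip() {
  # Probe: with working word boundaries, "a b" becomes "a X".
  if [ "$(printf 'a b\n' | sed -E 's/\bb/X/' 2>/dev/null)" = "a X" ]; then
    sed -E 's/.*\b([0-9]+(\.[0-9]+){3})\b.*/\1/'
  else
    sed -E 's/(.*[^0-9]|^)([0-9]+(\.[0-9]+){3}).*/\2/'
  fi
}

ip=$(echo "instream(10.20.213.11@40266):" | extract_ip)
echo "$ip"   # 10.20.213.11
```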