Replace only the domain part with awk in the second column

When these lines are given in a file:

docker://docker.io/some-repo/my-image:v1.1.2 docker://docker.io/some-repo/my-image:v1.1.2
docker://ghcr.io/some-repo/my-image:v1.1.2 docker://ghcr.io/some-repo/my-image:v1.1.2
docker://ecr.aws/some-repo/my-image:v1.1.2 docker://ecr.aws/some-repo/my-image:v1.1.2

If I execute

awk -F: -v var2="example.local/" 'BEGIN{FS=OFS=" "} {gsub(/(\/\/.*\.[a-z{2,6}]*\/)/, var2, $2)} 1'

it results in

docker://docker.io/some-repo/my-image:v1.1.2 docker:example.local/some-repo/my-image:v1.1.2
docker://ghcr.io/some-repo/my-image:v1.1.2 docker:example.local/some-repo/my-image:v1.1.2
docker://ecr.aws/some-repo/my-image:v1.1.2 docker:example.local/some-repo/my-image:v1.1.2

How can I replace only the domain part in the 2nd column (in this case docker.io, ghcr.io, or ecr.aws) in a generic way?

CodePudding user response：

With your shown samples, please try the following GNU awk solution. If the // is needed in the output before the new value, change the regex to ^(.*docker:\/\/)[^/]*\/(.*) in both solutions.

awk -v var="example.local/" '
match($0,/^(.*docker:)\/\/[^/]*\/(.*)/,arr){
  print arr[1] var arr[2]
}
'  Input_file

Explanation: This uses GNU awk's match function with the regex ^(.*docker:)\/\/[^/]*\/(.*). The regex creates 2 capturing groups: the 1st holds everything from the start of the line through the last docker:, and the 2nd holds everything after the first / that follows the docker.io|ghcr.io|ecr.aws value, up to the end of the line. The // and the domain sit outside both groups, so they are dropped. The three-argument form of match (a GNU awk extension) stores the captured groups in an array (named arr here). When the match succeeds, the code prints arr[1], then var (the new value), then arr[2].

Bonus solution: If you are OK with a Perl one-liner, you may try the following. It uses lazy matching in its regex to get the exact match, creates 2 capturing groups, and substitutes only the required parts.

perl -pe 's/^(.*?\s+docker:)\/\/.*?\/(.*)$/$1\/example.local\/$2/' Input_file

CodePudding user response：

You might try:

awk -v domain2=example.com '{sub("://[^/]*", "://" domain2, $2)} 1'
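The one-liner above comes without an explanation, so here is a minimal sketch of how it can be read and used. The pattern "://[^/]*" matches the scheme separator plus everything up to the next /, which is the host. The function name replace_domain and the localhost:5000 sample line are assumptions added for illustration; they are not part of the answer.

```shell
# replace_domain is a hypothetical helper name, not part of the answer above.
# Usage: replace_domain NEW_HOST < input
replace_domain() {
  awk -v domain2="$1" '
    # "://[^/]*" matches "://" plus everything up to the next "/" (the host).
    # sub() replaces only the first match, and only inside field 2.
    { sub("://[^/]*", "://" domain2, $2) }
    1   # print every line, modified or not
  '
}

# Demo on one line from the question and one assumed line with a port.
replace_domain example.local <<'EOF'
docker://docker.io/some-repo/my-image:v1.1.2 docker://docker.io/some-repo/my-image:v1.1.2
docker://localhost:5000/some-repo/my-image:v1.1.2 docker://localhost:5000/some-repo/my-image:v1.1.2
EOF
```

Because the pattern stops at the first / rather than looking for a dot and a top-level domain, it should also cover hosts with a port or without a dot. Note that a new value containing & or a backslash would need escaping, since both are special in the replacement string of sub.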

CodePudding user response：

For the example data, perhaps a pattern without matching // would be acceptable.

As you replace only 1 occurrence, you might use sub to match the first occurrence of the pattern.

For example

awk -v var2="example.local/" '{sub(/[^./]+\.[a-z]{2,6}\//, var2, $2)}1' file

The pattern matches:

[^./]+ Match 1+ chars other than / or .
\.[a-z]{2,6} Match a dot followed by 2-6 chars a-z
\/ Match /

Output

docker://docker.io/some-repo/my-image:v1.1.2 docker://example.local/some-repo/my-image:v1.1.2
docker://ghcr.io/some-repo/my-image:v1.1.2 docker://example.local/some-repo/my-image:v1.1.2
docker://ecr.aws/some-repo/my-image:v1.1.2 docker://example.local/some-repo/my-image:v1.1.2

Else you can match the leading docker:// at the start of the string, and use that in the replacement:

awk -v var2="example.local/" '{sub(/^docker:\/\/[^./]+\.[a-z]{2,6}\//, "docker://" var2, $2)}1' file
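If the shape of the domain is not known in advance, a regex-free variant is also possible. The sketch below swaps the sub call for index and substr: it finds "://" in the second column, treats everything up to the next / as the host, and rebuilds the field around the new value. The function name replace_host is hypothetical, and unlike var2 above the new value is passed without a trailing slash.

```shell
# replace_host is a hypothetical helper name used only for this sketch.
# Usage: replace_host NEW_HOST < input
replace_host() {
  awk -v host="$1" '
  {
    p = index($2, "://")              # position of the scheme separator in field 2
    if (p > 0) {
      rest = substr($2, p + 3)        # text after "://"
      q = index(rest, "/")            # the first "/" ends the host part
      if (q > 0)
        $2 = substr($2, 1, p + 2) host substr(rest, q)
    }
    print                             # lines without a match pass through unchanged
  }'
}

replace_host example.local <<'EOF'
docker://ecr.aws/some-repo/my-image:v1.1.2 docker://ecr.aws/some-repo/my-image:v1.1.2
EOF
```

Since no regex or replacement string is involved, characters such as & in the new value need no escaping. Assigning to $2 makes awk rebuild the line with single spaces between fields, which matches the sample input.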