What is the proper regex for capturing everything after "String" and between two delimiters?

Time:02-05

Details={
  AwsEc2SecurityGroup={GroupName=m.com-rds, OwnerId=123, VpcId=vpc-123, 
    IpPermissions=[{FromPort=3306, ToPort=3306, IpProtocol=tcp, IpRanges=[{CidrIp=1.1.1.1/32}, {CidrIp=2.2.2.2/32}, {CidrIp=0.0.0.0/0}, {CidrIp=3.3.3.3/32}], 
    UserIdGroupPairs=[{UserId=123, GroupId=sg-123abc}]}], IpPermissionsEgress=[{IpProtocol=-1, IpRanges=[{CidrIp=0.0.0.0/0}]}], GroupId=sg-123abc}}, 
    Region=us-east-1, Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]
}

I want to capture exactly arn:aws:ec2:us-east-1:123:security-group/sg-123abc in this example. Generically, I want to capture the value of Id regardless of placement. My current solution is /Details={.*Id=(.*\w)/, but this only works if it's the last object in the data. How can I take into account the following potential scenario:

Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]

CodePudding user response:

You can use look-behind to check that there is the Id= prefix, and then match anything that is not a space, comma or closing brace:

(?<=\bId=)[^,}\s]*
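As a minimal sketch, here is the look-behind pattern applied in Python. The question has no language tagged, so Python is an assumption, and the `sample` string is a shortened, single-line version of the data from the question with the extra `Thing=123abc` field added. The names `sample`, `ID_VALUE` and `id_value` are only illustrative.

```python
import re

# Shortened version of the data from the question, with Id followed by another field.
sample = (
    "Details={AwsEc2SecurityGroup={GroupName=m.com-rds, OwnerId=123, VpcId=vpc-123, "
    "GroupId=sg-123abc}}, Region=us-east-1, "
    "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]}"
)

# The look-behind requires "Id=" preceded by a word boundary, so OwnerId=,
# VpcId= and GroupId= are skipped. The match is the value itself (group 0).
ID_VALUE = re.compile(r"(?<=\bId=)[^,}\s]*")

match = ID_VALUE.search(sample)
id_value = match.group(0) if match else None
print(id_value)
```

Because the prefix sits inside the look-behind, the whole match is the value and no capture group is needed.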

CodePudding user response:

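Your pattern uses .* twice. The first .* matches to the end of the line or string (depending on whether the dot matches a newline) and then backtracks to the last position where Id=(.*\w) can match. The .* inside the group is greedy as well, so the capture runs up to the last word character that follows, which is why it only works when Id is the last item.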
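The backtracking behaviour can be reproduced with a short sketch. Python is an assumption here (no language is tagged), the `{` is escaped only to be explicit, and the two input strings are shortened stand-ins for the data in the question.

```python
import re

# The pattern from the question; both .* are greedy.
GREEDY = re.compile(r"Details=\{.*Id=(.*\w)")

# Id is the last field: the capture ends at the last word character, as intended.
id_last = ("Details={GroupId=sg-123abc}}, Region=us-east-1, "
           "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]}")

# Id is followed by another field: the capture runs on into Thing=123abc.
id_mid = ("Details={GroupId=sg-123abc}}, Region=us-east-1, "
          "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]}")

captured_last = GREEDY.search(id_last).group(1)
captured_mid = GREEDY.search(id_mid).group(1)
print(captured_last)
print(captured_mid)
```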

If you want to use a capture group, you can make the format and the allowed characters a bit more specific:

\bId=(\w+(?:[:\/-]\w+)+)

The pattern in parts

  • \b A word boundary to prevent a partial word match
  • Id= Match literally
  • ( Capture group 1
    • \w+ Match 1 or more word characters
    • (?:[:\/-]\w+)+ Repeat 1 or more times: one of : / - followed by 1 or more word characters
  • ) Close group 1
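A possible way to use the capture-group pattern, sketched in Python (an assumption, since no language is tagged). The helper name `extract_id` and the test strings are hypothetical. The forward slash is left unescaped because Python patterns have no delimiters.

```python
import re

# Word boundary, literal "Id=", then word characters joined by : / or -
ID_PATTERN = re.compile(r"\bId=(\w+(?:[:/-]\w+)+)")

def extract_id(text):
    """Return the value of the first standalone Id= field, or None if absent."""
    match = ID_PATTERN.search(text)
    return match.group(1) if match else None

id_last = "Region=us-east-1, Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]"
id_mid = "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]"

print(extract_id(id_last))
print(extract_id(id_mid))
print(extract_id("OwnerId=123, GroupId=sg-123abc"))  # no standalone Id= field
```

The value is in group 1 whether Id is the last field or is followed by more fields, because the repeated group stops at the first character that is not a word character or one of the three separators.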


Or if you know that it starts with Id=arn:

\bId=(arn:[\w:\/-]+)


Note that you only have to escape the forward slash as \/ when the delimiters of the regex are forward slashes, but there is no language tagged.
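To illustrate the point about escaping, here is a small sketch in Python (again an assumption, as no language is tagged). Python patterns are plain strings without delimiters, so the escaped and unescaped forms of the arn variant should behave the same. The variable names are illustrative.

```python
import re

# Same pattern, with and without the escaped forward slash.
escaped = re.compile(r"\bId=(arn:[\w:\/-]+)")
unescaped = re.compile(r"\bId=(arn:[\w:/-]+)")

text = "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]"

arn_escaped = escaped.search(text).group(1)
arn_unescaped = unescaped.search(text).group(1)
print(arn_escaped)
print(arn_escaped == arn_unescaped)
```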
