How to match or not match several prefixes?

Time:12-16

I have the following data (a subset of possible log4j responders if someone is interested)

ap://167.172.44.255:1389/LegitimateJavaCla
ap://167.172.44.255:1389/La
ap://167.99.32.139:1389/Basic/ReverseShell/167.99.32.139/99
ldap://x.x.x.x.61k2ev3252274o2ek77941q85t0r9444o.interact.sh/ok6ll9m
ldap://c6ps4rekeidcvgqlsmsgcg37qdoyyknz4.interact.sh/a
ldap://c6ps4rekeidcvgqlsmsgcg37x9ayymcak.interact.sh/a
ldap://c6ps4ipurnhssm2608l0cg37chyyykyhk.interact.sh/a
ldap://c6ps4ipurnhssm2608l0cg37pdyyykbug.interact.sh/a
91fd9fef8958.bingsearchlib.com:39356/
550f7e1deaed.bingsearchlib.com:39356/a
2174d47e8d04.bingsearchlib.com:39356/a
da6d408517b9.bingsearchlib.com:39356/a
5463610592ef.bingsearchlib.com:39356/a

I would like to keep the FQDN only (the host and domain) or the IP - so I tried (\S*)?(:\/\/)?(?<interesting>.*)(:)?\/ (see https://regex101.com/r/dusRR5/1)

The idea was:

  • (\S*)? → optionally match some letters (ldap, ...)
  • (:\/\/)? → optionally match ://
  • (?<interesting>.*) → match anything and name the group interesting
  • (:)? → ... but stop at : if there is one
  • \/ → ... otherwise stop at /

The expected result is

167.172.44.255
167.99.32.139
x.x.x.x.61k2ev3252274o2ek77941q85t0r9444o.interact.sh
c6ps4rekeidcvgqlsmsgcg37qdoyyknz4.interact.sh
c6ps4rekeidcvgqlsmsgcg37x9ayymcak.interact.sh
(...)

But it does not work and my very limited knowledge of regex does not help.

CodePudding user response:

Modified a bit:

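^(?:\S*:\/\/)?(\S*?)[:\/]

The capturing group contains what you are interested in: the optional scheme is matched by a non-capturing group, so it stays out of the result. The key is to use a lazy quantifier (*?) together with the start-of-line anchor (^), so the capture stops at the first : or /.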
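As a quick check of the lazy approach, here is a minimal Python sketch. It assumes the capturing group wraps only the host part, with the optional scheme in a non-capturing group. The names `HOST_RE` and `extract_host` are made up for illustration, and the sample lines are taken from the question.

```python
import re

# Optional scheme (non-capturing), then a lazy capture of the host,
# which ends at the first ':' or '/'. No need to escape '/' in Python.
HOST_RE = re.compile(r"^(?:\S*://)?(\S*?)[:/]")

def extract_host(line):
    """Return the host or IP at the start of line, or None if nothing matches."""
    m = HOST_RE.match(line)
    return m.group(1) if m else None

samples = [
    "ap://167.172.44.255:1389/La",
    "ldap://c6ps4rekeidcvgqlsmsgcg37qdoyyknz4.interact.sh/a",
    "91fd9fef8958.bingsearchlib.com:39356/",
]

for s in samples:
    print(extract_host(s))
```

Note that this pattern requires a : or / after the host, so a bare host name with no port and no path would not match.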


CodePudding user response:

You can use

^(?:[a-zA-Z0-9]+:\/\/)?(?<interesting>[^:\/]+)

Details:

  • ^ - start of string
  • (?:[a-zA-Z0-9]+:\/\/)? - an optional occurrence of one or more letters/digits followed by ://
  • (?<interesting>[^:\/]+) - Group "interesting": one or more chars other than : and /.

Remember that you do not have to escape / if you define your regex with a string literal (as in Python, or C#, or using constructor notations in JavaScript/Ruby/etc.).
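For instance, a rough Python version of this pattern could look like the sketch below. Python's `re` module spells named groups `(?P<name>...)`, and `/` is left unescaped inside the raw string. The helper names `PATTERN` and `host_of` are hypothetical, the sample lines come from the question, and the de-duplication step is an assumption based on the expected output listing each host once.

```python
import re

# Optional "letters/digits://" prefix, then one or more chars that are
# neither ':' nor '/', captured in the named group "interesting".
PATTERN = re.compile(r"^(?:[a-zA-Z0-9]+://)?(?P<interesting>[^:/]+)")

def host_of(line):
    """Return the FQDN or IP at the start of line, or None for an empty match."""
    m = PATTERN.match(line)
    return m.group("interesting") if m else None

lines = [
    "ap://167.172.44.255:1389/LegitimateJavaCla",
    "ap://167.172.44.255:1389/La",
    "ldap://c6ps4ipurnhssm2608l0cg37pdyyykbug.interact.sh/a",
    "550f7e1deaed.bingsearchlib.com:39356/a",
]

# dict.fromkeys drops duplicates while keeping first-seen order.
hosts = list(dict.fromkeys(host_of(line) for line in lines))
print(hosts)
```

Unlike a pattern that requires a trailing : or /, this one also accepts a bare host with no port or path.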
