Java String replace all using regex with lookahead

Time:02-17

I am trying to derive a normalized URI from each incoming HTTP request and print it in the logs, so that we can compute stats and other data per normalized URI.

To normalize, I'm trying to do String replace using regex on the requestURI:

String str = "/v1/profile/abc13abc/13abc/cDe12/abc-bla/text_tw/HELLO/test/random/2234";
str = str.replaceAll("/([a-zA-Z]*[\\d|\\-|_]+[a-zA-Z]*)|([0-9]+)","/x");

This results in

/x/profile/x/x/x/x/x/HELLO/test/random/x

I want the result to be the following (v1 must not be replaced):

/v1/profile/x/x/x/x/x/HELLO/test/random/x

I tried using a negative lookahead:

str.replaceAll("/(?!v1)([a-zA-Z]*[\\d|\\-|_]+[a-zA-Z]*)|([0-9]+)","/x");

But that does not help. Any pointers are appreciated.

Thanks

CodePudding user response:

Use

/(?:(?!v[1-4])[a-zA-Z]*[0-9_-]+[a-zA-Z]*|[0-9]+)

See regex proof.
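
As a minimal sketch of how this pattern could be applied in Java (written with its `+` quantifiers; the class name `UriNormalizer` and the method `normalize` are hypothetical names, not part of the question), the pattern can be compiled once and reused for every request:

```java
import java.util.regex.Pattern;

public class UriNormalizer {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    // Matches '/' followed by optional letters, one or more of digit/'_'/'-',
    // then optional letters, unless the segment starts with v1..v4.
    private static final Pattern VARIABLE_SEGMENT =
            Pattern.compile("/(?:(?!v[1-4])[a-zA-Z]*[0-9_-]+[a-zA-Z]*|[0-9]+)");

    public static String normalize(String requestUri) {
        return VARIABLE_SEGMENT.matcher(requestUri).replaceAll("/x");
    }

    public static void main(String[] args) {
        String uri = "/v1/profile/abc13abc/13abc/cDe12/abc-bla/text_tw/HELLO/test/random/2234";
        System.out.println(normalize(uri));
        // prints /v1/profile/x/x/x/x/x/HELLO/test/random/x
    }
}
```

Because Java strings are immutable, the result of the replacement has to be used or assigned; calling `replaceAll` and discarding the return value leaves the original string unchanged.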

EXPLANATION

--------------------------------------------------------------------------------
  /                        '/'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      v                        'v'
--------------------------------------------------------------------------------
      [1-4]                    any character of: '1' to '4'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
    [a-zA-Z]*                any character of: 'a' to 'z', 'A' to 'Z'
                             (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    [0-9_-]+                 any character of: '0' to '9', '_', '-'
                             (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    [a-zA-Z]*                any character of: 'a' to 'z', 'A' to 'Z'
                             (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    [0-9]+                   any character of: '0' to '9' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of grouping
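
For comparison, here is a short sketch of why the attempt in the question does not work. In that pattern the `|` sits at the top level, so the second alternative `([0-9]+)` is neither preceded by `/` nor guarded by the lookahead, and it matches the bare `1` inside `v1`. Wrapping both alternatives in a single group after the `/` avoids this. The class and constant names below are illustrative only.

```java
public class AlternationScopeDemo {
    // The attempt from the question: '|' splits the whole pattern in two,
    // so "[0-9]+" can match digits anywhere, including the "1" in "v1".
    public static final String TOP_LEVEL_ALTERNATION =
            "/(?!v1)([a-zA-Z]*[\\d|\\-|_]+[a-zA-Z]*)|([0-9]+)";

    // Both alternatives grouped after the '/', so each must start a segment.
    public static final String GROUPED_ALTERNATION =
            "/(?:(?!v1)[a-zA-Z]*[0-9_-]+[a-zA-Z]*|[0-9]+)";

    public static String replace(String uri, String regex) {
        return uri.replaceAll(regex, "/x");
    }

    public static void main(String[] args) {
        String uri = "/v1/profile/abc13abc/13abc/cDe12/abc-bla/text_tw/HELLO/test/random/2234";
        System.out.println(replace(uri, TOP_LEVEL_ALTERNATION));
        // prints /v/x/profile/x/x/x/x/x/HELLO/test/random/x
        System.out.println(replace(uri, GROUPED_ALTERNATION));
        // prints /v1/profile/x/x/x/x/x/HELLO/test/random/x
    }
}
```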