I am trying to get a normalized URI from the incoming HTTP request to print in the logs, so that we can compute stats and other data per normalized URI.
To normalize, I'm doing a regex replace on the requestURI:
String str = "/v1/profile/abc13abc/13abc/cDe12/abc-bla/text_tw/HELLO/test/random/2234";
str.replaceAll("/([a-zA-Z]*[\\d|\\-|_]+[a-zA-Z]*)|([0-9]+)","/x");
This results in
/x/profile/x/x/x/x/x/HELLO/test/random/x
I want to get this result instead (v1 should not be replaced):
/v1/profile/x/x/x/x/x/HELLO/test/random/x
I tried using a negative lookahead:
str.replaceAll("/(?!v1)([a-zA-Z]*[\\d|\\-|_]+[a-zA-Z]*)|([0-9]+)","/x");
But that does not help. Any clue is appreciated.
Thanks
CodePudding user response:
Use
/(?:(?!v[1-4])[a-zA-Z]*[0-9_-]+[a-zA-Z]*|[0-9]+)
See regex proof.
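A minimal Java sketch of how this pattern might be applied, written with the `+` quantifiers that the explanation describes ("1 or more times"). The class and method names (`UriNormalizer`, `normalize`) are hypothetical and not part of the question.

```java
import java.util.regex.Pattern;

class UriNormalizer {
    // Matches "/" followed by either
    //   letters, then one or more of digit/'_'/'-', then letters
    //   (unless the text after "/" starts with v1..v4), or
    //   a run of digits.
    private static final Pattern VARIABLE_SEGMENT =
            Pattern.compile("/(?:(?!v[1-4])[a-zA-Z]*[0-9_-]+[a-zA-Z]*|[0-9]+)");

    // Hypothetical helper: returns the URI with every match replaced by "/x".
    static String normalize(String requestUri) {
        // Strings are immutable, so the replaced value has to be returned or assigned.
        return VARIABLE_SEGMENT.matcher(requestUri).replaceAll("/x");
    }

    public static void main(String[] args) {
        String str = "/v1/profile/abc13abc/13abc/cDe12/abc-bla/text_tw/HELLO/test/random/2234";
        System.out.println(normalize(str));
        // prints /v1/profile/x/x/x/x/x/HELLO/test/random/x
    }
}
```

Note that the lookahead `(?!v[1-4])` skips any segment that merely starts with `v1` to `v4` (for example `/v12` or `/v1abc`), which may or may not be what you want.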
EXPLANATION
--------------------------------------------------------------------------------
  /                        '/'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    (?!                    look ahead to see if there is not:
--------------------------------------------------------------------------------
      v                    'v'
--------------------------------------------------------------------------------
      [1-4]                any character of: '1' to '4'
--------------------------------------------------------------------------------
    )                      end of look-ahead
--------------------------------------------------------------------------------
    [a-zA-Z]*              any character of: 'a' to 'z', 'A' to 'Z'
                           (0 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
    [0-9_-]+               any character of: '0' to '9', '_', '-'
                           (1 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
    [a-zA-Z]*              any character of: 'a' to 'z', 'A' to 'Z'
                           (0 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
   |                       OR
--------------------------------------------------------------------------------
    [0-9]+                 any character of: '0' to '9' (1 or more
                           times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  )                        end of grouping
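
For comparison, here is a small sketch of why the lookahead attempt from the question likely did not help, assuming it was meant to use `+` quantifiers. In that pattern the `|([0-9]+)` alternative sits outside the leading `/`, so the `1` in `v1` can still match on its own. Grouping both alternatives after the `/` avoids this. The class and method names are hypothetical.

```java
class LookaheadScopeDemo {
    // The attempt from the question: the alternation splits the whole pattern,
    // so "[0-9]+" can match digits anywhere, without a preceding "/".
    static String ungrouped(String uri) {
        return uri.replaceAll("/(?!v1)([a-zA-Z]*[\\d|\\-|_]+[a-zA-Z]*)|([0-9]+)", "/x");
    }

    // The pattern from this answer: both alternatives are inside a
    // non-capturing group that follows the "/".
    static String grouped(String uri) {
        return uri.replaceAll("/(?:(?!v[1-4])[a-zA-Z]*[0-9_-]+[a-zA-Z]*|[0-9]+)", "/x");
    }

    public static void main(String[] args) {
        String uri = "/v1/profile/13abc";
        System.out.println(ungrouped(uri)); // prints /v/x/profile/x
        System.out.println(grouped(uri));   // prints /v1/profile/x
    }
}
```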