I want to remove everything except #, newlines, and complete bracket pairs that contain only numbers.
Sample input:
# (1296) {20} [529] [1496] [411]
# (MONDAY ) (1296)
# (646) {20} (BEACH 7) [20 Mtrs] { 03 Foot }
# {19} [455] [721] (1296) (SUNDAY ) [2741] (MONDAY (WEDNESDAY {20}
# {19} (1296)
Code which does not work:
$re = '/(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|[^][(){}@#]+/m';
$result = preg_replace($re, '', $input);
Incorrect output:
#(1296){20}[529][1496][411]
#(1296)
#(646){20}(BEACH 7)[20 Mtrs]{ 03 Foot }
#{19}[455][721](1296)[2741](({20}
#{19}(1296)
Desired output:
#(1296) {20} [529] [1496] [411]
#(1296)
#(646) {20}
#{19} [455] [721] (1296) [2741] {20}
#{19} (1296)
CodePudding user response:
You could match bracket pairs that contain at least 1 digit (and nothing but digits and spaces), then skip that match.
Then match any character except a newline or # and replace it with an empty string.
(?:\[\h*\d[\h\d]*]|\(\h*\d[\h\d]*\)|{\h*\d[\h\d]*})\h*(*SKIP)(*F)|[^#\n]
Explanation
(?:
: Non-capturing group
\[\h*\d[\h\d]*]
: Match at least 1 digit between square brackets, where \h matches horizontal whitespace characters (no newlines)
|
: Or
\(\h*\d[\h\d]*\)
: Match at least 1 digit between parentheses
|
: Or
{\h*\d[\h\d]*}
: Match at least 1 digit between curly braces
)
: Close the non-capturing group
\h*
: Match optional horizontal whitespace after the closing bracket
(*SKIP)(*F)
: Skip and fail the match (to leave it untouched in the output)
|
: Or
[^#\n]
: Match any character except # or a newline
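As a minimal sketch, the pattern above could be applied in PHP as follows. The function name keepNumericBrackets is hypothetical, and the final trim of trailing spaces is an added assumption that is not part of the answer: the skipped \h* leaves one space at the end of the third sample line.

```php
<?php
// Hypothetical helper name; the first pattern is the one from the answer above.
function keepNumericBrackets(string $input): string
{
    // Bracket pairs holding only digits and spaces (plus trailing spaces) are
    // skipped; any other character except # and newline is matched and removed.
    $re = '/(?:\[\h*\d[\h\d]*]|\(\h*\d[\h\d]*\)|{\h*\d[\h\d]*})\h*(*SKIP)(*F)|[^#\n]/';
    $result = preg_replace($re, '', $input);

    // Assumption, not in the answer: strip whitespace left at the end of a line.
    return preg_replace('/\h+$/m', '', $result);
}

$input = implode("\n", [
    '# (1296) {20} [529] [1496] [411]',
    '# (MONDAY ) (1296)',
    '# (646) {20} (BEACH 7) [20 Mtrs] { 03 Foot }',
    '# {19} [455] [721] (1296) (SUNDAY ) [2741] (MONDAY (WEDNESDAY {20}',
    '# {19} (1296)',
]);

echo keepNumericBrackets($input), "\n";
// Prints:
// #(1296) {20} [529] [1496] [411]
// #(1296)
// #(646) {20}
// #{19} [455] [721] (1296) [2741] {20}
// #{19} (1296)
```

Without the second preg_replace call, the third line would end with a single space after {20}.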
CodePudding user response:
You may match using this regex:
(?:(\()|({)|\[)[\h\d]*+([^])}\s\d])(?(1)[^()]*\)|(?(2)[^{}]*}|[^][]*]))\h*|(?<=#)\h+|\([^\s)]+\h+
and replace with an empty string.
RegEx Details:
(?:(\()|({)|\[)[\h\d]*+([^])}\s\d])(?(1)[^()]*\)|(?(2)[^{}]*}|[^][]*]))
: Match (...) or {...} or [...] if they contain at least one character that is neither a digit nor whitespace
\h*
: Match 0 or more horizontal whitespace
|
: OR
(?<=#)\h+
: Match 1+ horizontal whitespace after #
|
: OR
\([^\s)]+\h+
: Match ( and 1+ of non-whitespace text followed by 1+ horizontal whitespace (an unclosed group such as "(MONDAY ")
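A short PHP sketch of this approach, assuming the quantifiers in the pattern read *+ and + (the + signs appear to have been lost when the page was copied). The function name stripNonNumericBrackets is hypothetical. Unlike the first answer, this pattern matches the text to delete directly, so no (*SKIP)(*F) is needed.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above,
// with the possessive *+ and the + quantifiers assumed as described.
function stripNonNumericBrackets(string $input): string
{
    $re = '/(?:(\()|({)|\[)[\h\d]*+([^])}\s\d])(?(1)[^()]*\)|(?(2)[^{}]*}|[^][]*]))\h*'
        . '|(?<=#)\h+'      // whitespace directly after #
        . '|\([^\s)]+\h+/'; // unclosed "(WORD " fragments
    return preg_replace($re, '', $input);
}

$input = implode("\n", [
    '# (1296) {20} [529] [1496] [411]',
    '# (MONDAY ) (1296)',
    '# (646) {20} (BEACH 7) [20 Mtrs] { 03 Foot }',
    '# {19} [455] [721] (1296) (SUNDAY ) [2741] (MONDAY (WEDNESDAY {20}',
    '# {19} (1296)',
]);

$raw = stripNonNumericBrackets($input);

// The space after a kept group is not removed, so the third line ends in one
// trailing space; trimming it here is an extra step, not part of the answer.
echo preg_replace('/\h+$/m', '', $raw), "\n";
```

On the sample input this should produce the same lines as the desired output in the question once the trailing space is trimmed.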