I have a string that could contain letters, numbers, special characters, or a pattern (such as */my-variable/*).
I basically want to ignore the special characters and any patterns (there could be several), then check that whatever text remains is usable. My goal is to mark the string as valid or not: as long as it has some normal words, it's fine, but if it contains only patterns and special characters, it's not.
This is for PHP (if that's necessary information). I wanted to avoid multiple preg_replace calls, stay efficient, and keep it to one line that returns the alphanumeric characters I'm looking for.
Here's an example string:
Thank You!1!11 | )(^%& */person-first_name/* For Being Awesome */person-c235/* - Number 39658!? $450 | And Some moretextstuff
The regex I've got so far:
[\s\w\d]{1,}|(\*\/[^\/\*]*\/\*)
I'm using regex101.com. It produces some decent matches, but I can't figure out how to exclude the patterns. I probably shouldn't have that | "or" in there. If necessary, I may have to add another exclusion group for special characters, but they seem to be ignored well by the [\s\w\d] part.
CodePudding user response:
Using PHP, you might use a pattern like this to exclude the */..../* patterns and any non-word characters except whitespace:
(?:\*/.*?/\*|[^\w\s]+)(*SKIP)(*F)|\w+
The pattern in parts:
(?:            Non-capturing group for the alternatives
\*/.*?/\*      Match from */ to /*, non-greedy so it stops at the first occurrence
|              Or
[^\w\s]+       Match one or more non-word characters, excluding whitespace
)              Close the non-capturing group
(*SKIP)(*F)    Discard what was matched so far and continue searching after it
|              Or
\w+            Match one or more word characters
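To tie this back to the valid/invalid check in the question, here is a minimal sketch of how the pattern might be used from PHP. It assumes the + quantifiers that the part-by-part description calls for, and it wraps the pattern in ~ delimiters so the / characters need no escaping. The names WORD_PATTERN, extractWords, and hasUsableText are hypothetical helpers for illustration, not part of any library.

```php
<?php
// The answer's pattern wrapped in '~' delimiters (an assumption made here
// so that '/' does not have to be escaped inside the pattern).
const WORD_PATTERN = '~(?:\*/.*?/\*|[^\w\s]+)(*SKIP)(*F)|\w+~';

/**
 * Hypothetical helper: return every run of word characters that lies
 * outside a placeholder such as a slash-star delimited variable name.
 */
function extractWords(string $input): array
{
    preg_match_all(WORD_PATTERN, $input, $matches);
    return $matches[0]; // full matches only; the pattern has no capture groups
}

/**
 * Hypothetical helper: a string counts as valid when at least one word
 * survives. preg_match stops at the first match, so no replacing is needed.
 */
function hasUsableText(string $input): bool
{
    return preg_match(WORD_PATTERN, $input) === 1;
}

$example = 'Thank You!1!11 | )(^%& */person-first_name/* For Being Awesome '
         . '*/person-c235/* - Number 39658!? $450 | And Some moretextstuff';

print_r(extractWords($example));
var_dump(hasUsableText($example));                         // bool(true)
var_dump(hasUsableText('*/person-first_name/* !!! | $'));  // bool(false)
```

For the validity check alone, a single preg_match call is enough, which keeps it to the one-liner the question asks for. Note that under this sketch a string containing only digits (for example 450) also counts as valid, since \w matches digits and underscores as well as letters; tighten the last alternative if that is not wanted.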