I need to write a regex pattern that will remove everything from my text except letters, hyphens (-), slashes (/) (e.g., '[^a-zA-Z-/]'), and numbers in combination with a hyphen ('5-', '-123'). Single numbers or numbers in combination with other characters should be removed, so '9-SomeWord' and 'SomeWord-34' must be kept, but '456ml', '23', or '56%' should be removed.
What should be the regex pattern?
CodePudding user response:
Try
r'[^\w/-]+|_|(?<![\d-])\d+(?!\d*-)'
See regex101 for testing and further details.
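As a rough illustration, the pattern could be applied with Python's `re.sub` as sketched below. This assumes `+` quantifiers after the first character class and after `\d`; the function name `clean` is made up for this example and is not part of the original answer.

```python
import re

# Assumed form of the pattern, with `+` quantifiers:
#   [^\w/-]+            runs of anything that is not a word char, slash or hyphen
#   _                   underscore (part of \w, so it is removed separately)
#   (?<![\d-])\d+(?!\d*-)   digit runs not adjacent to a hyphen on either side
PATTERN = re.compile(r'[^\w/-]+|_|(?<![\d-])\d+(?!\d*-)')


def clean(text: str) -> str:
    """Hypothetical helper: delete every match of PATTERN from text."""
    return PATTERN.sub('', text)


for sample in ['9-SomeWord', 'SomeWord-34', '5-', '-123', '23', '56%', '456ml']:
    print(repr(sample), '->', repr(clean(sample)))
```

With this pattern, '9-SomeWord', 'SomeWord-34', '5-' and '-123' come back unchanged, while '23' and '56%' become empty strings. For '456ml' only the digits are stripped, so 'ml' remains, because letters are always kept. In Python 3, `\w` also matches non-ASCII letters; passing `flags=re.ASCII` to `re.compile` would restrict it to `[a-zA-Z0-9_]`.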