I am trying to write a Python script with a regex that matches everything which has two - and one . but I also want to exclude two strings from it. They are NIST-Privacy-v1.1 and NIST-CSF-v1.1
Here is my sample data:
NIST-Privacy-v1.1
NIST-CSF-v1.1
AWS-CIS-v1.4-1.10
SOC-2-CC6.8
NIST-800-53rev5
NIST-800-53rev5-CM-3(1)
NIST-800-53rev5-AU-12(4)
SOC-2-CC6.1
NISTPrivacyFramework-v1.0
NISTPrivacyFramework-v1.0-PR.AC-P1
NIST-800-53rev5-AC-1
NIST-800-53rev5-IA-1
I started with a very simple regex which matches what I need but doesn't exclude the two strings. Can you help me identify the exclusion part?
regex:
.*-.*-.*[.|\-].*
Desired output:
AWS-CIS-v1.4-1.10
SOC-2-CC6.8
NIST-800-53rev5-CM-3(1)
NIST-800-53rev5-AU-12(4)
SOC-2-CC6.1
NISTPrivacyFramework-v1.0-PR.AC-P1
NIST-800-53rev5-AC-1
NIST-800-53rev5-IA-1
CodePudding user response:
^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1).*-.*-.*[.-].*$
Output:
AWS-CIS-v1.4-1.10
SOC-2-CC6.8
NIST-800-53rev5-CM-3(1)
NIST-800-53rev5-AU-12(4)
SOC-2-CC6.1
NISTPrivacyFramework-v1.0-PR.AC-P1
NIST-800-53rev5-AC-1
NIST-800-53rev5-IA-1
Demo: https://regex101.com/r/gM9e44/1
^
=> Given pattern must start from the beginning of the line
[.-]
=> "-" or "."
^(?!NIST-Privacy-v1\.1)
=> It must not start with "NIST-Privacy-v1.1"
^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1)
=> It must not start with "NIST-Privacy-v1.1" or "NIST-CSF-v1.1"
$
=> Given pattern must finish at the end of the line
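Since the question asks for a Python script, here is a minimal sketch of how this pattern could be applied with the standard re module. The names samples, pattern and matches are just illustrative, and the list is the sample data copied from the question.

```python
import re

# Sample identifiers copied from the question.
samples = [
    "NIST-Privacy-v1.1",
    "NIST-CSF-v1.1",
    "AWS-CIS-v1.4-1.10",
    "SOC-2-CC6.8",
    "NIST-800-53rev5",
    "NIST-800-53rev5-CM-3(1)",
    "NIST-800-53rev5-AU-12(4)",
    "SOC-2-CC6.1",
    "NISTPrivacyFramework-v1.0",
    "NISTPrivacyFramework-v1.0-PR.AC-P1",
    "NIST-800-53rev5-AC-1",
    "NIST-800-53rev5-IA-1",
]

# Two negative lookaheads reject the unwanted prefixes; the rest requires
# two hyphens followed later by a third "." or "-".
pattern = re.compile(r"^(?!NIST-Privacy-v1\.1)(?!NIST-CSF-v1\.1).*-.*-.*[.-].*$")

# Keep only the identifiers the pattern accepts, in their original order.
matches = [s for s in samples if pattern.match(s)]

for m in matches:
    print(m)
```

Note that the lookaheads here are not anchored with $, so any string that merely starts with NIST-Privacy-v1.1 or NIST-CSF-v1.1 is rejected as well, which is what the "must not start with" wording above describes.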
CodePudding user response:
You may use this regex for your job:
^(?!NIST-(?:CSF|Privacy)-v1\.1$)(?:[^-]*-){2}.*[.-].*
RegEx Breakup:
^
: Start
(?!NIST-(?:CSF|Privacy)-v1\.1$)
: Negative lookahead to fail the match when the input is NIST-Privacy-v1.1 or NIST-CSF-v1.1
(?:[^-]*-){2}
: Match 0 or more non-hyphen characters followed by a hyphen. Repeat this group 2 times
.*[.-]
: Match any text followed by a dot or hyphen
.*
: Match 0 or more of any character
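A rough sketch of using this regex from Python follows. The helper filter_ids and the identifier NIST-CSF-v1.1-PR.AC-1 are hypothetical and only there for illustration; the sample list is taken from the question.

```python
import re

# The $ inside the lookahead means only the two exact strings are rejected.
PATTERN = re.compile(r"^(?!NIST-(?:CSF|Privacy)-v1\.1$)(?:[^-]*-){2}.*[.-].*")


def filter_ids(ids):
    """Return the identifiers accepted by PATTERN, preserving input order."""
    return [s for s in ids if PATTERN.match(s)]


# Sample identifiers copied from the question.
samples = [
    "NIST-Privacy-v1.1",
    "NIST-CSF-v1.1",
    "AWS-CIS-v1.4-1.10",
    "SOC-2-CC6.8",
    "NIST-800-53rev5",
    "NIST-800-53rev5-CM-3(1)",
    "NIST-800-53rev5-AU-12(4)",
    "SOC-2-CC6.1",
    "NISTPrivacyFramework-v1.0",
    "NISTPrivacyFramework-v1.0-PR.AC-P1",
    "NIST-800-53rev5-AC-1",
    "NIST-800-53rev5-IA-1",
]

selected = filter_ids(samples)
print("\n".join(selected))

# Hypothetical identifier (not in the question) that extends an excluded string.
extended = "NIST-CSF-v1.1-PR.AC-1"
print(bool(PATTERN.match(extended)))
```

Because the lookahead ends with $, a longer identifier such as the hypothetical one above is still accepted, whereas the unanchored lookaheads in the earlier answer would reject it. Which behaviour is preferable depends on whether such longer identifiers should be kept.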