How do I separate a string by one of the separators but not by all of them?

Time:02-22

She is Mary.  Ella es Maria.
She is Mary."    Ella es Maria."
She is Mary." Ella es Maria."
Hello!.     Hola!.
Hello.!     Hola.!
Hello!     Hola!
How are you?.    Como estas?.
How are you?    Como estas?

How can I use one of these as the separator: test whether the string can be split by one of them and, as soon as one works, stop testing the rest? The candidate separators are .! , .? , ." , ?" , ?. , ? , !" , !. , ! and . (any one of them, not all of them).

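line = "She is Mary.  Ella es Maria."  # one sample line from above

match = line.split(".", 1)  # splits only on ".", so the other separators are missed
n_sense = match[1].strip()

print(n_sense)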

Is there a way to test whether or not I can split it with a conditional? Or should I use regex?

If this works, print(n_sense) should output the following for the lines above:

Ella es Maria.
Ella es Maria."
Ella es Maria."
Hola!.
Hola.!
Hola!
Como estas?.
Como estas?

CodePudding user response:

Try:

import re

tests = [
    "She is Mary.  Ella es Maria.",
    'She is Mary."    Ella es Maria."',
    "Hello!.     Hola!.",
    "Hello.!     Hola.!",
    "Hello!     Hola!",
    "How are you?.    Como estas?.",
    "How are you?    Como estas?",
]

for test in tests:
    # one of . ! ? optionally followed by . ! ? or ", then one or more whitespace characters
    x = re.split(r'[.!?][.!?"]?\s+', test)[1]
    print(x)

Prints:

Ella es Maria.
Ella es Maria."
Hola!.
Hola.!
Hola!
Como estas?.
Como estas?
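
On the "conditional" part of the question: re.split returns a one-element list when the pattern is not found, so indexing [1] raises IndexError on a line with no separator. A minimal sketch of a guarded version follows, assuming the separator is always followed by whitespace as in the samples. The names SEPARATOR and second_sentence are made up for illustration and are not from the original answer.

```python
import re

# Hypothetical name: one of . ! ? optionally followed by . ! ? or ",
# then one or more whitespace characters.
SEPARATOR = re.compile(r'[.!?][.!?"]?\s+')


def second_sentence(line):
    """Return the text after the first separator, or None if none is found."""
    # maxsplit=1 stops after the first separator that matches.
    parts = SEPARATOR.split(line, maxsplit=1)
    if len(parts) < 2:
        return None  # no separator in this line
    return parts[1].strip()


print(second_sentence('She is Mary." Ella es Maria."'))
print(second_sentence("no separator here"))
```

The trailing punctuation of the second sentence is kept because nothing follows it, so the whitespace part of the pattern cannot match there.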