Regular expression in Python to split a string based on characters that begin with @ and end with :?

Time:12-24

I have strings that look like this:

sentences = "@en:The dog went for a walk@es:El perro fue de paseo" 

Desired output:

splitted = ['The dog went for a walk', 'El perro fue de paseo']

Current code:

splitted = re.split("^@:$", sentences)  

So, I'd like to split the sentences on markers that begin with an at symbol (@) and end with a colon (:), since that is how every language is encoded, e.g. @en:, @es:, @fr:, @nl:, etc.

CodePudding user response:

You can split on everything from @ up to the next :, using a negated character class so that neither of those characters is matched in between.

There might be empty entries in the result, which you can filter out.

@[^@:]*:


import re
sentences = "@en:The dog went for a walk@es:El perro fue de paseo"
splitted = [s for s in re.split("@[^@:]*:", sentences) if s]

print(splitted)

Output

['The dog went for a walk', 'El perro fue de paseo']
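
To see where the empty entry mentioned above comes from, here is a small sketch that prints the result before and after filtering. The variable names raw_parts and non_empty are only illustrative and are not part of the answer above.

```python
import re

sentences = "@en:The dog went for a walk@es:El perro fue de paseo"

# The string starts with a separator ("@en:"), so re.split reports an
# empty string for the (non-existent) text in front of it.
raw_parts = re.split(r"@[^@:]*:", sentences)
print(raw_parts)  # ['', 'The dog went for a walk', 'El perro fue de paseo']

# Keeping only truthy items drops that empty string.
non_empty = [part for part in raw_parts if part]
print(non_empty)  # ['The dog went for a walk', 'El perro fue de paseo']
```

Filtering with "if part" also removes empty strings that would appear if two markers sat directly next to each other, which may or may not be what you want.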

CodePudding user response:

Try this code. It matches @, then one or more letters, then a colon:

import re
sentences = "@en:The dog went for a walk@es:El perro fue de paseo" 
splitted = re.split(r"@[a-zA-Z]+:", sentences)
print(splitted[1:])  # the first item is an empty string, so skip it

CodePudding user response:

You need this regex: @[^@:]+:

First, @ matches a literal @.

Next, [^@:]+ matches one or more characters that are not @ or :.

Finally, : matches a literal :.

import re
sentences = "@en:The dog went for a walk@es:El perro fue de paseo"
splitted = re.split("@[^@:]+:", sentences)
print(splitted[1:])

output:

['The dog went for a walk', 'El perro fue de paseo']
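
As a quick check of the three parts described above, the sketch below lists what the pattern actually matches and shows why the answer slices off the first element. The names pattern, separators and parts are chosen here for illustration only.

```python
import re

sentences = "@en:The dog went for a walk@es:El perro fue de paseo"
pattern = r"@[^@:]+:"

# findall returns every separator the pattern matches.
separators = re.findall(pattern, sentences)
print(separators)  # ['@en:', '@es:']

# split returns the text between those separators; because the string
# begins with one, the first item is an empty string.
parts = re.split(pattern, sentences)
print(parts)      # ['', 'The dog went for a walk', 'El perro fue de paseo']
print(parts[1:])  # ['The dog went for a walk', 'El perro fue de paseo']
```

Slicing with [1:] assumes the string always starts with a language marker. If it might not, filtering out empty strings is the safer choice.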