How can I extract the text in front of and behind a designated special token (in this case -, not @)? If more than two parts are connected by -, I want to extract those too (in the example, Bill-Gates-Foundation).
e.g. from 'Meinda@Bill-Gates-Foundation@drug-delivery' -> ['Bill-Gates-Foundation', 'drug-delivery']
I tried p = re.compile('@(\D+)\*(\D+)')
but that was not what I wanted.
CodePudding user response:
You can exclude the @ char from the match and repeat the - part 1 or more times:
@([^\s@-]+(?:-[^\s@-]+)+)
Explanation
@ Match literally
( Capture group 1 (returned by re.findall)
[^\s@-]+ Match 1 or more non-whitespace chars except - and @
(?:-[^\s@-]+)+ Repeat 1 or more times: match - and again 1 or more non-whitespace chars except - and @
) Close group 1
import re
pattern = r"@([^\s@-]+(?:-[^\s@-]+)+)"
s = r"Meinda@Bill-Gates-Foundation@drug-delivery"
print(re.findall(pattern, s))
Output
['Bill-Gates-Foundation', 'drug-delivery']
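To see what the trailing repetition does, here is a minimal sketch that also feeds in a segment without any hyphen. The helper name extract_hyphenated and the second input string are made up for illustration; the pattern is the one described above, with the + quantifiers written out.

```python
import re

# Hypothetical helper; the pattern requires at least one "-" inside each @-segment.
HYPHENATED = re.compile(r"@([^\s@-]+(?:-[^\s@-]+)+)")

def extract_hyphenated(text):
    """Return every @-prefixed segment that contains at least one hyphen."""
    return HYPHENATED.findall(text)

print(extract_hyphenated("Meinda@Bill-Gates-Foundation@drug-delivery"))
# A segment without a hyphen ("plain") is skipped.
print(extract_hyphenated("Meinda@plain@drug-delivery"))
```

So a segment such as @plain is simply left out, which seems to be what the question asks for.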
CodePudding user response:
To extract the front and back of a designated special token (in this case, -, not @), you can use a regular expression with the re module.
Here is an example of how you can use a regular expression to extract the front and back of the - token in a given string:
import re
# The input string
string = 'Meinda@Bill-Gates-Foundation@drug-delivery'
# Use a regular expression to extract the front and back of the '-' token
p = re.compile(r'@([\w-]+)@([\w-]+)')
matches = p.findall(string=string)
# Print the matches
print(matches)
This code will print the following output:
[('Bill-Gates-Foundation', 'drug-delivery')]
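Note that two capture groups give a list with one tuple, and the pattern expects exactly two @ segments. If a flat list for any number of segments is preferred, a single-group variant might look like the sketch below. The name extract_segments and the third segment in the input are assumptions for illustration, and [\w-]+ also accepts segments that contain no hyphen at all.

```python
import re

# Hypothetical variant: one capture group, so findall returns a flat list.
SEGMENT = re.compile(r"@([\w-]+)")

def extract_segments(text):
    """Return every run of word characters and hyphens that follows an @."""
    return SEGMENT.findall(text)

print(extract_segments("Meinda@Bill-Gates-Foundation@drug-delivery"))
print(extract_segments("Meinda@Bill-Gates-Foundation@drug-delivery@third-item"))
```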
CodePudding user response:
@ahmet-buğra-buĞa gave an answer with regex.
If you don't have to use regex, an easier way is to just use split.
test_str = "Meinda@Bill-Gates-Foundation@drug-delivery"
test_str.split("@")[1:]
This outputs
['Bill-Gates-Foundation', 'drug-delivery']
You can make it a function like so
def get_list_of_strings_after_first(original_str, token_to_split_on):
    # Split on the given token and drop the part before its first occurrence
    return original_str.split(token_to_split_on)[1:]
get_list_of_strings_after_first("Meinda@Bill-Gates-Foundation@drug-delivery", "@")
This gives the same output
['Bill-Gates-Foundation', 'drug-delivery']
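Plain split keeps every part after the first @, whether or not it contains a -. If only the hyphenated parts are wanted, as in the question, one possible sketch is to filter after splitting. The function name and its default arguments are assumptions for illustration.

```python
def split_and_keep_hyphenated(original_str, separator="@", token="-"):
    """Split on the separator, drop the first part, keep parts containing the token."""
    parts = original_str.split(separator)[1:]
    return [part for part in parts if token in part]

print(split_and_keep_hyphenated("Meinda@Bill-Gates-Foundation@drug-delivery"))
# "plain" has no hyphen, so it is filtered out.
print(split_and_keep_hyphenated("Meinda@plain@drug-delivery"))
```

Unlike the regex, this also keeps parts that merely start or end with -, so it is a looser check.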