I have been thinking about this for a while and decided to ask for your help. For instance, I have a string such as "abcdefggarfse" or "abcdeefgh". I would like to split these strings at the point where a letter is doubled:
"abcdefggarfse" -> "abcdefg" and "garfse"
"abcdeefgh" -> "abcde" and "efgh"
Thanks a lot!
CodePudding user response:
s1 = "abcdefggarfse"
s2 = "abcdeefgh"
s3 = "abcdefgggarffse"
s4 = "abcdeeefgh"

def split_string(string):
    tokens = []
    base_delimiter = 0  # start index of the piece currently being built
    for i in range(len(string) - 1):
        # Split between two identical neighbouring characters.
        if string[i] == string[i + 1]:
            tokens.append(string[base_delimiter:i + 1])
            base_delimiter = i + 1
    # Whatever follows the last split point is the final piece.
    tokens.append(string[base_delimiter:])
    return tokens

if __name__ == '__main__':
    l = split_string(s1)
    print(l)
    l = split_string(s2)
    print(l)
    l = split_string(s3)
    print(l)
    l = split_string(s4)
    print(l)
This produces:
['abcdefg', 'garfse']
['abcde', 'efgh']
['abcdefg', 'g', 'garf', 'fse']
['abcde', 'e', 'efgh']
I don't know if this is the expected behaviour for three or more repetitions, but this approach also handles several doubled letters in the same string.
CodePudding user response:
Iterate through the string and find the index where a letter repeats. Then you can simply use a slice operation.
a = "abcdefggarfse"
pos = len(a)  # fallback if no doubled letter is found: pos2 will be empty
for i in range(0, len(a) - 1):
    if a[i] == a[i + 1]:
        pos = i + 1
        break
pos1, pos2 = a[0:pos], a[pos:]
Output:
pos1 = 'abcdefg'
pos2 = 'garfse'
CodePudding user response:
This functionality is not available in the built-in split method, so the simplest option is to use a for loop to find the doubled characters and slicing to get the output strings.
Assuming that the string needs to be split exactly once (into two output strings), the following will do the job:
input_ = "abcdefggarfse"
for i in range(len(input_) - 1):
    if input_[i] == input_[i + 1]:
        output1 = input_[:i + 1]
        output2 = input_[i + 1:]
        break
print(output1)
print(output2)
Output is:
abcdefg
garfse
You may need to modify the code to put the output into a list, handle multiple splits or strings with no splits, etc.
Another option is to use regular expressions, but if you have not used them before, then the above approach is the simplest.
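For reference, here is a minimal sketch of what the regular-expression route could look like. The function name split_on_doubles is made up for this illustration, and the pattern assumes you want a split between every pair of identical neighbouring characters (so a tripled letter yields a one-letter piece in the middle). Note that "." does not match a newline by default.

```python
import re

def split_on_doubles(text):
    """Split text between every pair of identical adjacent characters.

    Hypothetical helper for illustration only.
    """
    pieces = []
    start = 0
    # (.)(?=\1) matches one character that is immediately followed by a
    # copy of itself; the lookahead does not consume the second copy.
    for match in re.finditer(r"(.)(?=\1)", text):
        pieces.append(text[start:match.end()])
        start = match.end()
    # The remainder after the last split point is the final piece.
    pieces.append(text[start:])
    return pieces

print(split_on_doubles("abcdefggarfse"))
print(split_on_doubles("abcdeefgh"))
```

A string with no doubled letter comes back as a one-element list, so the caller can check the length of the result to see whether a split happened.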
CodePudding user response:
You can create a function that loops over the word and keeps track of the previous character seen.
def split_rep(word):
    prev = None
    for idx, char in enumerate(word):
        if char == prev:
            return word[:idx], word[idx:]
        else:
            prev = char
    return word, None
>>> split_rep("abcdefggarfse")
('abcdefg', 'garfse')
>>> split_rep("abcdeefgh")
('abcde', 'efgh')
>>> split_rep("abcdefgh")
('abcdefgh', None)