How do I remove substrings from a string if repeated more than once?

Time:11-08

I am writing a program in which the user enters several substrings, and the code concatenates them based on their overlapping characters. The program works fine for two strings; however, when I enter three strings, part of the result is repeated. Here is my code: `

y = int(input("How many strings do you want to enter?: "))
String_List = []
final_str = ""
for i in range(y):
    x = input("Enter a string: ")
    String_List.append(x)   
for o in String_List:
    for p in String_List:
        if o != p:
            if o in p:
                if p not in final_str:
                    final_str += str(p)
            elif p in o:
                if o not in final_str:
                    final_str += str(o)
            for n in range(2,len(p)):
                if o[-n:] == p[:n]:
                    if o and p not in final_str:
                        p = p[n:]
                        final_str += str(o + p)
            for j in range(2,len(p)):
                if o[:j] == p[-j:]:
                    if o and p not in final_str:
                        o = o[j:]
                        final_str += str(p + o)
            else:
                continue
        else: 
            continue
print(final_str)

`

To better explain the problem: I entered three substrings, A1B2, 1B2C3, and C3D4E5, and got the output A1B2C31B2C3D4E5. The repeated section (shown in bold in the original post) is what I don't want; the result I expect is A1B2C3D4E5.

CodePudding user response:

I'm assuming you want to join successive strings wherever the end of one matches the start of the next.

Merging on the shortest (non-null) match:

strings = ['A1B2', '1B2C3', 'C3D4E5']

out = strings[0]

for s in strings[1:]:
    # try overlaps from 1 character up to the full length of the shorter string
    for i in range(1, min(len(out), len(s)) + 1):
        if out[-i:] == s[:i]:
            out += s[i:]
            break

print(out)

Output:

A1B2C3D4E5

Merging on the longest match:

strings = ['A1B2', '1B2C3C3', 'C3C3D4E5'] # note the duplicated C3C3

out = strings[0]

for s in strings[1:]:
    # try the longest possible overlap first, down to a single character
    for i in range(min(len(out), len(s)), 0, -1):
        if out[-i:] == s[:i]:
            out += s[i:]
            break

print(out)

Output: A1B2C3C3D4E5

Handling non-matches:

When a string shares no overlap with the result so far, I append it after a space for clarity.

strings = ['A1B2', '1B2C3', 'C3D4E5', 'F6G7', 'G7H8']

out = strings[0]

for s in strings[1:]:
    for i in range(min(len(out), len(s)), 0, -1):
        if out[-i:] == s[:i]:
            out += s[i:]
            break
    else:
        # no overlap found: keep the string, separated by a space
        out += ' ' + s

print(out)

Output: A1B2C3D4E5 F6G7H8
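
If you need this in more than one place, the longest-match loop with the non-match fallback could be wrapped in a function. This is only a sketch: the name `merge_overlapping` and the `sep` parameter are made up for illustration and are not part of the snippets above.

```python
def merge_overlapping(strings, sep=" "):
    """Join successive strings on their longest common end/start overlap.

    Hypothetical helper: a string that shares no overlap with the text
    built so far is appended after `sep`.
    """
    if not strings:
        return ""
    out = strings[0]
    for s in strings[1:]:
        # Try the longest possible overlap first, down to one character.
        for i in range(min(len(out), len(s)), 0, -1):
            if out[-i:] == s[:i]:
                out += s[i:]
                break
        else:
            # No overlap found: keep the string, separated by `sep`.
            out += sep + s
    return out


print(merge_overlapping(['A1B2', '1B2C3', 'C3D4E5']))                  # A1B2C3D4E5
print(merge_overlapping(['A1B2', '1B2C3', 'C3D4E5', 'F6G7', 'G7H8']))  # A1B2C3D4E5 F6G7H8
```

Passing `sep=""` would concatenate non-overlapping strings directly instead of separating them.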

CodePudding user response:

Your code is quite complex to follow. If I understand your goal correctly, there is a simpler solution: concatenate all the inputs, then use a regular expression to collapse any substring that is immediately repeated.

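import re

n = input("how many ")
fs = ''
for i in range(int(n)):
    s = input(f"enter string {i + 1}: ")
    fs += s  # plain concatenation; overlaps become adjacent repeats

# (.+?) captures the shortest run that is immediately repeated (\1+),
# and the whole repeated run is replaced by a single copy
print(re.sub(r"(.+?)\1+", r"\1", fs))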

Observations


input:

how many 3

enter string 1: A1B2
enter string 2: 1B2C3
enter string 3: C3D4E5

output:

A1B2C3D4E5
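
One caveat worth checking before relying on this: the pattern collapses every adjacent repeat in the concatenated string, not only the overlap between two inputs, so a repeat inside a single input is removed as well. A small sketch to illustrate this (the function name `collapse_repeats` is just for this example):

```python
import re


def collapse_repeats(parts):
    """Concatenate parts, then collapse adjacent repeated substrings."""
    return re.sub(r"(.+?)\1+", r"\1", "".join(parts))


print(collapse_repeats(['A1B2', '1B2C3', 'C3D4E5']))  # A1B2C3D4E5
print(collapse_repeats(['AAB', 'BC']))                # ABC (the doubled A is lost too)
```

If the inputs can legitimately contain repeated characters, comparing the end of one string with the start of the next is probably the safer approach.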