So I have a few thousand strings in a format like this:
"something - something else (another thing) [even more things]"
I need to remove the parentheses and square brackets, along with their contents, but any other part of the string could contain square brackets or parentheses too. There could also be more square brackets or parentheses inside the square brackets, but the parentheses can only contain square brackets, not more parentheses. The number of spaces is also different for each string. The only constant is that the square brackets and parentheses I want to remove are always at the end of the string. How would I remove these without changing anything else in the string, to get the output string:
"something - something else"
Edit: Just to clarify, the length of the string and the number of words can always be different; it's just always the same "shape". Basically it's:
"some unknown string" "-" "some unknown string" "(some unknown string)" "[some unknown string]"
CodePudding user response:
Do you specifically need to find those brackets and cut the string there? Or is it enough to simply slice away the end of the string? If the latter, you could do it with string slicing in Python like so:
`
s = "something - something else (another thing) [even more things]"
str_to_cut = "something - something else"
print("Original string: " + s)
print(len(str_to_cut))  # 26
# keep only the first len(str_to_cut) characters
res_str = s[:len(str_to_cut)]
print("String after removal of characters: " + res_str)
`
CodePudding user response:
If your input does not contain parentheses inside your square brackets, you could just use rfind() to find the position of the last "(" occurring in the string. Then, slicing the initial string up to that position (minus one to account for the space) will give you the desired output:
s = "something - something else (another thing) [even more things]"
print(repr(s[:s.rfind("(") - 1]))
# 'something - something else'
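If the square brackets can themselves contain parentheses, as the question allows, rfind("(") may land inside the bracket group. One possible alternative is to walk backward from the end of the string and strip balanced trailing groups. This is only a minimal sketch: `strip_trailing_groups` is a hypothetical helper name, it assumes the trailing brackets are balanced, and it removes every bracketed group at the end rather than exactly two.

```python
def strip_trailing_groups(s: str) -> str:
    """Remove balanced (...) / [...] groups from the end of s (hypothetical helper)."""
    pairs = {")": "(", "]": "["}
    openers = set(pairs.values())
    end = len(s.rstrip())
    while end > 0 and s[end - 1] in pairs:
        stack = []
        i = end - 1
        # Walk backward until the closing bracket at position end - 1 is matched.
        while i >= 0:
            ch = s[i]
            if ch in pairs:
                stack.append(pairs[ch])
            elif ch in openers:
                if not stack or stack[-1] != ch:
                    return s[:end]  # mismatched brackets: stop stripping here
                stack.pop()
                if not stack:
                    break
            i -= 1
        else:
            return s[:end]  # no matching opener found: leave the rest as-is
        # Drop the matched group and any whitespace before it.
        end = len(s[:i].rstrip())
    return s[:end]


s = "something - something else (another thing) [even (more) things]"
print(repr(strip_trailing_groups(s)))
# 'something - something else'
```

Brackets that appear earlier in the string are left untouched, because stripping stops as soon as the string no longer ends with a closing bracket.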