I have a long list containing many sublists, each consisting of 2 "values", for instance
test=[["AAAGG1","AAAAA22"],["GGGGA1","AAGGA"],["GGGGG23","GGAGA6"]]
What I want is to remove the trailing digits from each value. To do that, I tried using a pretty long function:
def remove_numbers(index,newlist):
    for com in index:
        for dup in com:
            if "1" in dup:
                newlist.append(dup.replace("1",""))
            elif "2" in dup:
                newlist.append(dup.replace("2",""))
            elif "3" in dup:
                newlist.append(dup.replace("3",""))
            elif "4" in dup:
                newlist.append(dup.replace("4",""))
            elif "5" in dup:
                newlist.append(dup.replace("5",""))
            elif "6" in dup:
                newlist.append(dup.replace("6",""))
            elif "7" in dup:
                newlist.append(dup.replace("7",""))
            elif "8" in dup:
                newlist.append(dup.replace("8",""))
            elif "9" in dup:
                newlist.append(dup.replace("9",""))
            else:
                newlist.append(dup)
I created an empty list and called the function:
emptytest=[]
testfunction=remove_numbers(test,emptytest)
When I print emptytest, the output is the following:
['AAAGG', 'AAAAA', 'GGGGA', 'AAGGA', 'GGGGG3', 'GGAGA']
There are two problems. The result is now a single flat list, but I need the sublists to remain intact. Also, when a value ends in two different digits (such as "GGGGG23"), only one of them is removed.
Does anybody know of a solution for this?
Sorry if this is a simple question. I am not that experienced with Python yet, and I couldn't find a suitable solution on the web or in an existing forum.
CodePudding user response:
Use a regex to strip the digits instead of checking for each digit by hand. The whole thing can be done with the two lines below.
import re
processed = [[re.sub(r"\d+$","",n) for n in t] for t in test]
print(processed)
This gives the result:
[['AAAGG', 'AAAAA'], ['GGGGA', 'AAGGA'], ['GGGGG', 'GGAGA']]
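Here we used the regex "\d+$", which matches one or more digits at the end of the string. If such a match is found, it is replaced with an empty string. The nested list comprehension builds a new sublist for each original sublist, so the structure stays intact.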
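If you want to reuse this, the same idea can be wrapped in a small function. The following is only a sketch: the names TRAILING_DIGITS and strip_trailing_digits are made up for illustration, and it assumes every element of every sublist is a string. It also shows a regex-free variant using str.rstrip, which behaves the same way for the ASCII digits 0-9.

```python
import re

# Compile once: one or more digits anchored at the end of the string.
TRAILING_DIGITS = re.compile(r"\d+$")

def strip_trailing_digits(nested):
    """Return a new list of sublists with trailing digits removed from each string."""
    return [[TRAILING_DIGITS.sub("", s) for s in sub] for sub in nested]

test = [["AAAGG1", "AAAAA22"], ["GGGGA1", "AAGGA"], ["GGGGG23", "GGAGA6"]]

processed = strip_trailing_digits(test)
print(processed)
# [['AAAGG', 'AAAAA'], ['GGGGA', 'AAGGA'], ['GGGGG', 'GGAGA']]

# Digits that are not at the end of the string are left alone.
print(strip_trailing_digits([["A1GG7", "AAGGA"]]))
# [['A1GG', 'AAGGA']]

# Regex-free alternative: rstrip removes any of the listed characters
# from the right-hand end until it hits a character not in the set.
no_regex = [[s.rstrip("0123456789") for s in sub] for sub in test]
print(no_regex == processed)
# True
```

Both versions return a new nested list and leave test unchanged, unlike the original function, which appended every value to one shared flat list.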