How to filter string?

Time:12-11

I have a list of strings that I need to filter in Python.

list=["ssssssssss Address: xxxxxxxxxx",
       "sssssssss, xxxxxxxxxxx",
       "sssssssssssssss, xxxxxxxxxxxxx",
       "xxxxxxxxxxxx,",
       "xxxxxx",
       "www"]

Desired output:

list=["ssssssssss Address: xxxxxxxxxx",
       "sssssssss, xxxxxxxxxxx",
       "sssssssssssssss, xxxxxxxxxxxxx",
       "xxxxxxxxxxxx,",
       "xxxxxx",
       "www"]

My code

import re

regex = re.compile("[^a-zA-Z0-9!@#$&()\\-`. ,/\"]+")
for i in list:
    # Replace each run of disallowed characters with a space, then collapse whitespace.
    print(" ".join(regex.sub(' ', i).split()))

My output

"ssssssssss Address: xxxxxxxxxx",
       "sssssssss, xxxxxxxxxxx",
       "sssssssssssssss, xxxxxxxxxxxxx",
       "xxxxxxxxxxxx,",
       "xxxxxx",
       "www"

I want to remove Himanshu when it appears between non-English characters (e.g. पत्ता स नं Himanshu अष्टविनायक).
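
To make the problem concrete, here is a minimal sketch of what the substitution approach does to one sample string. It assumes the character class from the code above followed by a `+` quantifier; the variable names `sample` and `cleaned` are only for illustration.

```python
import re

# Character class taken from the question; anything outside it is treated as non-English.
regex = re.compile("[^a-zA-Z0-9!@#$&()\\-`. ,/\"]+")

sample = "पत्ता स नं Himanshu अष्टविनायक Address: sr no94/1B/1/2/3"

# Substitution only blanks out the disallowed characters themselves, so an
# English word sitting between two Devanagari runs survives. The colon is
# dropped too, because ":" is not part of this character class.
cleaned = " ".join(regex.sub(" ", sample).split())
print(cleaned)  # Himanshu Address sr no94/1B/1/2/3
```

So replacing characters one run at a time cannot remove Himanshu; the string has to be cut after the last non-English run instead.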

CodePudding user response:

Try this code:

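import re

# Renamed from `list` to avoid shadowing the built-in type.
lines = ["पत्ता स नं Himanshu अष्टविनायक Address: sr no94/1B/1/2/3",
         "चाळ, जय foo boo, बस स्टोप जवळ, ashatvinayak chal, jay bhavani",
         "पिंपळे गुरव, पुणे, महाराष्ट्र, 411027 nagar, near bus stop, Pimple",
         "पिं Gurav, Pune, Maharashtra,",
         "411027",
         "www"]
result = []
# Match the last run of non-English characters (plus trailing commas/spaces):
# the negative lookahead rejects any run that has another such run after it.
pattern = r"[^a-zA-Z0-9!@\s:#$&()\-`. ,/\"]+[, ]*(?!.*[^a-zA-Z0-9!@\s:#$&()\-`. ,/\"]+[, ]*)"
for line in lines:
    match = re.search(pattern, line)
    if match:
        # Keep only the text after the matched run.
        result.append(line[match.end():])
    else:
        # No non-English characters: keep the string unchanged.
        result.append(line)
print(result)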

Output:
['Address: sr no94/1B/1/2/3', 'ashatvinayak chal, jay bhavani', '411027 nagar, near bus stop, Pimple', 'Gurav, Pune, Maharashtra,', '411027', 'www']
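
If the negative lookahead is hard to read, the same result can probably be reached without it. The sketch below is an alternative, not the code from the answer: it collects every non-English run with `re.finditer` and cuts after the last one. The helper name `keep_english_tail` and the constant `NON_ENGLISH` are hypothetical, and the character class is assumed to be the one used above.

```python
import re

# Same allowed characters as the answer's class; anything else counts as non-English.
NON_ENGLISH = re.compile(r"[^a-zA-Z0-9!@\s:#$&()\-`.,/\"]+")


def keep_english_tail(text):
    """Hypothetical helper: return what follows the last non-English run."""
    matches = list(NON_ENGLISH.finditer(text))
    if not matches:
        return text  # nothing to strip, keep the string unchanged
    # Cut after the final run, then drop the commas/spaces that followed it.
    return text[matches[-1].end():].lstrip(", ")


samples = ["पत्ता स नं Himanshu अष्टविनायक Address: sr no94/1B/1/2/3",
           "चाळ, जय foo boo, बस स्टोप जवळ, ashatvinayak chal, jay bhavani",
           "पिंपळे गुरव, पुणे, महाराष्ट्र, 411027 nagar, near bus stop, Pimple",
           "पिं Gurav, Pune, Maharashtra,",
           "411027",
           "www"]
tails = [keep_english_tail(s) for s in samples]
print(tails)
```

For these six sample strings it should print the same list as the output above. The trade-off is one extra pass over the matches in exchange for a simpler pattern.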
