Split string by all non-alphabetic character occurrences in Python

Time:11-14

I'm trying to write the following function: build a list of words from a string, where each word contains only alphabetical characters and every non-alphabetical character acts as a separator.

#if the input is the following string
mystring = "ashtray ol'god for,       shure!  i.have "

#the output should give a list like this:
mylist = ['ashtray','ol','god','for','shure','i','have']

No modules should be imported. I created the following function and it works, but I would be happy if someone could provide a better way to do it.

for ch in mystring:
    if not ch.isalpha():
        mystring = mystring.replace(ch, ' ')
mylist = mystring.split()

By alphabetical character I mean every alphabetic character in Unicode, so letters from Arabic, Hebrew, and other scripts should count as well.

CodePudding user response:

Try this code

mystring = "ashtray ol'god for,       shure!  i.have "

lst = []
mystr = ''
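for i in mystring:
    temp = ord(i)
    # ASCII letters only: 'A'-'Z' is 65-90, 'a'-'z' is 97-122
    if (65 <= temp <= 90) or (97 <= temp <= 122):
        mystr += i
    elif mystr:
        # a non-letter ends the current word
        lst.append(mystr)
        mystr = ''
if mystr:
    # keep the last word when the string ends with a letter
    lst.append(mystr)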

print(lst)

Or

mystring = "ashtray ol'god for,       shure!  i.have "

lst = []
mystr = ''
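for i in mystring:
    # ASCII letters only
    if i in 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz':
        mystr += i
    elif mystr:
        # a non-letter ends the current word
        lst.append(mystr)
        mystr = ''
if mystr:
    # keep the last word when the string ends with a letter
    lst.append(mystr)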

print(lst)

Or (including non-English characters)

mystring = "ashtray ol'god for,       shure!  i.have "

lst = []
mystr = ''
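for i in mystring:
    # str.isalpha() is True for letters of any script, not just ASCII
    if i.isalpha():
        mystr += i
    elif mystr:
        # a non-letter ends the current word
        lst.append(mystr)
        mystr = ''
if mystr:
    # keep the last word when the string ends with a letter
    lst.append(mystr)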

print(lst)

Output:

['ashtray', 'ol', 'god', 'for', 'shure', 'i', 'have']
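
If you prefer something shorter, here is a minimal sketch that still imports nothing. It assumes Python 3, where str.isalpha() is Unicode-aware. The function name split_alpha is just one I made up for illustration. It replaces every non-letter with a space in a single pass and then lets split() collapse the gaps, so it avoids calling replace() on the whole string once per character.

```python
def split_alpha(text):
    """Return the runs of alphabetic characters in text, in order."""
    # Swap every non-alphabetic character for a space in one pass,
    # then let str.split() collapse the runs of whitespace.
    cleaned = ''.join(ch if ch.isalpha() else ' ' for ch in text)
    return cleaned.split()


mystring = "ashtray ol'god for,       shure!  i.have "
print(split_alpha(mystring))
# ['ashtray', 'ol', 'god', 'for', 'shure', 'i', 'have']

# Letters from other scripts are kept too, e.g. Latin with accents and Cyrillic
print(split_alpha("héllo,мир!"))
```

Since it ends with split(), a string that finishes on a letter needs no special handling for the last word.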

Tell me if it's not working.
