I am learning Python and extracting some data from a CSV file. My goal is to put one column, for example 'names', into a list. I know how to do that.
Next, I would like to delete the incorrect names from the list. For example:
lists = [Anderson, Byrne, Clark, Cooper, Davies, Evans, Miller, Moore, Thom@s, W!lson]
and I want:
lists = [Anderson, Byrne, Clark, Cooper, Davies, Evans, Moore]
because some names contain a space, @, or !. I don't want to replace those characters; I want to delete those names entirely.
I tried a classic if/else, but it didn't work.
CodePudding user response:
You can use str.isalpha:
names = [
    'Anderson',
    'Byrne',
    'Clark',
    'Cooper',
    'Davies',
    'Evans',
    ' Miller',
    'Moore',
    'Thom@s',
    'W!lson'
]
only_names = [n for n in names if n.isalpha()]
# only_names = list(filter(str.isalpha, names))
print(only_names)
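One caveat worth knowing: str.isalpha is Unicode-aware and returns False for an empty string, so accented names pass while hyphenated names or names with an apostrophe are dropped. A small sketch of that behaviour, where the sample values are made up for illustration and are not from the question:

```python
# Hypothetical sample values, chosen only to show edge cases of str.isalpha.
# '\u00e9' is the single precomposed character "e with acute accent".
samples = ['Jos\u00e9', 'Anne-Marie', "O'Brien", '', ' Miller', 'Moore']

# str.isalpha() is True only if the string is non-empty and every
# character is a letter (Unicode letters included).
kept = [s for s in samples if s.isalpha()]
print(kept)
```

If names like Anne-Marie or O'Brien should count as valid in your data, a plain isalpha check is probably too strict and you would need your own rule.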
CodePudding user response:
Assuming all elements in the list are strings, you can get the result you want with:
names = [
    'Anderson',
    'Byrne',
    'Clark',
    'Cooper',
    'Davies',
    'Evans',
    ' Miller',
    'Moore',
    'Thom@s',
    'W!lson'
]
names_filtered = [
    name for name in names if ''.join(c for c in name if c.isalpha()) == name
]
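The same per-character check can probably be written a little more directly with all(), which avoids building a temporary string. A minimal sketch on a shortened sample list:

```python
names = ['Anderson', ' Miller', 'Thom@s', 'Moore']  # shortened sample

# Keep a name only if every character in it is a letter.
names_filtered = [name for name in names if all(c.isalpha() for c in name)]
print(names_filtered)  # ['Anderson', 'Moore']
```

Note that, like the join version, this keeps an empty string, because all() over no characters is True.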
CodePudding user response:
You can use a regular expression for this. Python's standard library includes a regular expression module called re.
For example, assume a valid name consists only of letters. You can use a list comprehension and re to keep only the valid names as follows:
import re
names = [
    'Anderson',
    'Byrne',
    'Clark',
    'Cooper',
    'Davies',
    'Evans',
    ' Miller',
    'Moore',
    'Thom@s',
    'W!lson'
]
only_names = [n for n in names if re.match("^[A-Za-z]*$", n)]
print(only_names)
# ['Anderson', 'Byrne', 'Clark', 'Cooper', 'Davies', 'Evans', 'Moore']
For more information, check the re module documentation.
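Two edge cases of the pattern above may matter for real data: * also matches an empty string, and $ matches just before a trailing newline. A stricter sketch uses re.fullmatch with +; the name ASCII_NAME and the sample values are made up for illustration:

```python
import re

# Compile once; [A-Za-z]+ requires at least one ASCII letter.
ASCII_NAME = re.compile(r"[A-Za-z]+")

samples = ['Moore', '', 'Evans\n', ' Miller', 'Thom@s']

# re.fullmatch succeeds only if the whole string matches the pattern.
strict = [s for s in samples if ASCII_NAME.fullmatch(s)]
# The pattern from the answer above, for comparison.
loose = [s for s in samples if re.match("^[A-Za-z]*$", s)]

print(strict)  # ['Moore']
print(loose)   # ['Moore', '', 'Evans\n']
```

Unlike str.isalpha, both patterns accept ASCII letters only, so accented names would be rejected.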
CodePudding user response:
For filtering valid names:
names = ["john", "mich@el", "suraj"]
valid_names = [name for name in names if name.isalpha()]
print(valid_names)
Output: ['john', 'suraj']
As asked by OP in the comments below, for filtering valid emails with valid usernames:
emails = ["[email protected]", "[email protected]", "[email protected]"]
valid_emails = []
allowed = "._"
for email in emails:
    username = email.split("@")[0]
    for ch in allowed:
        username = username.replace(ch, "")
    if username.isalnum():
        valid_emails.append(email)
print(valid_emails)
Output: ['[email protected]', '[email protected]']
Hope this helps!
CodePudding user response:
This code runs without creating an additional list. It modifies the list in place, looping over the indices in reverse so that deleting an item does not shift the positions still to be checked:
lst = ['Anderson', 'Byrne', 'Clark', 'Cooper', 'Davies', 'Evans', ' Miller', 'Moore', 'Thom@s', 'W!lson']
for ind in range(len(lst) - 1, -1, -1):
    if not lst[ind].isalpha():
        del lst[ind]
print(lst)
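If the goal is mainly to keep the same list object, rather than to avoid any temporary list, slice assignment is a shorter alternative. A sketch on a shortened sample list, where alias is a hypothetical second reference added only to show that the object is unchanged:

```python
lst = ['Anderson', ' Miller', 'Moore', 'Thom@s', 'W!lson']
alias = lst  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the filtered result.
lst[:] = [name for name in lst if name.isalpha()]

print(lst)           # ['Anderson', 'Moore']
print(alias is lst)  # True
```

This does build a temporary list on the right-hand side, so the reverse loop above is still the one that avoids extra allocation.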