Given a list of names
names = [
"bob john",
"billy james",
"bob joe",
"bob joe henry",
"bob",
"bob john martin",
"billy james phillip",
"billy james phillip mark"
]
How can I return a list of all the full names? i.e.
full_names = [
"bob joe henry",
"bob john martin",
"billy james phillip mark"
]
Would using a trie be appropriate in this scenario?
CodePudding user response:
I think this is what you are looking for. Based on the expected output, it looks like the goal is to remove every entry that is just the beginning of another name. If you look at the full_names list, it omits any name that is only the first portion of a longer name in the list.
This short script gives the same output as you requested:
names = [
"bob john",
"billy james",
"bob joe",
"bob joe henry",
"bob",
"bob john martin",
"billy james phillip",
"billy james phillip mark"
]
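# Walk the list backwards so that deleting an entry never shifts an index we still have to visit.
end = len(names) - 1
while end >= 0:
    name = names[end]
    for i, x in enumerate(names):
        # Drop `name` if it forms the leading words of another, longer entry.
        # (A plain `name in x` would also match substrings in the middle of x.)
        if i != end and x.startswith(name + " "):
            del names[end]
            break
    end -= 1
print(names)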
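On the trie question: a trie keyed on whole words should also work here, and it avoids comparing every pair of names. The following is only a minimal sketch, assuming names are space-separated and that a "full name" is one that no other entry extends. `full_names_via_trie` is a hypothetical helper name, not something from the question.

```python
def full_names_via_trie(names):
    """Return the names that are not a word-level prefix of another name."""
    root = {}
    # Build a trie whose edges are whole words, not single characters.
    for name in names:
        node = root
        for word in name.split():
            node = node.setdefault(word, {})

    # A name is "full" when the node it ends on has no children,
    # i.e. no other entry continues past it.
    result = []
    for name in names:
        node = root
        for word in name.split():
            node = node[word]
        if not node:
            result.append(name)
    return result


names = [
    "bob john",
    "billy james",
    "bob joe",
    "bob joe henry",
    "bob",
    "bob john martin",
    "billy james phillip",
    "billy james phillip mark",
]
print(full_names_via_trie(names))
```

Unlike the loop above, this leaves the original list untouched and keeps the surviving names in their input order.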
CodePudding user response:
If your definition of a full name is three words, you can simply split each name on spaces and check the length.
For example:
full_names = [n for n in names if len(n.split(" ")) == 3]
But if you really want to find full names across a big list, you need a large list of first names and last names to check your name list against.
Alternatively, you can train a spaCy model on a name list and use it to identify first and last names: https://spacy.io/
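The lookup idea (checking each entry against lists of known first and last names) could be sketched roughly like this. The two sets below are made up to fit the example data, and `looks_like_full_name` is a hypothetical helper. A real run would load much larger name lists from whatever dataset you have.

```python
# Hypothetical reference sets, chosen only to fit the sample data.
FIRST_NAMES = {"bob", "billy"}
LAST_NAMES = {"henry", "martin", "mark"}


def looks_like_full_name(name):
    """Treat a name as full if it starts with a known first name
    and ends with a known last name."""
    words = name.split()
    return (
        len(words) >= 2
        and words[0] in FIRST_NAMES
        and words[-1] in LAST_NAMES
    )


names = [
    "bob john",
    "billy james",
    "bob joe",
    "bob joe henry",
    "bob",
    "bob john martin",
    "billy james phillip",
    "billy james phillip mark",
]
full_names = [n for n in names if looks_like_full_name(n)]
print(full_names)
```

How well this works depends entirely on how complete the reference lists are: any surname missing from `LAST_NAMES` is silently treated as not being a last name.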