How to extract words starting with letters from A to L from a text in Python?

Time:12-20

Language: Python

I extracted named entities from a text without using external libraries. I then wanted to build two lists: one for the names starting with A-L and one for the names starting with M-Z. I wrote the code below, but it produces no output. Does anyone see what's wrong here? Note that I am a beginner at coding in general and might not be familiar with the terminology.

AL = []
MZ = []
for match in matches:
    if ((match[0] >= "A") and (match[0] <= "L")):
        AL.append(match)
        print("Words between A-L are:  ", AL)
    elif ((match[0] >= "M" and match[0] <= "Z")):
        MZ.append(match)
        print("Words between M-Z are: ", MZ)

CodePudding user response:

So look at the following example:

def filter_words(words):
    AL = []
    MZ = []
    unclassified = []
    for word in words:
        if ((word[0] >= "A") and (word[0] <= "L")):
            AL.append(word)
        elif ((word[0] >= "M" and word[0] <= "Z")):
            MZ.append(word)
        else:
            unclassified.append(word)
    return AL, MZ, unclassified
    

AL, MZ, unclassified = filter_words(["Al", "Bob", "Todd", "Zack", "todd", "zack"])

print(AL)
print(MZ)
print(unclassified)

OUTPUT

['Al', 'Bob']
['Todd', 'Zack']
['todd', 'zack']

Depending on your requirements, you may want to call word.upper() before the if statements: as the output shows, a name that starts with a lowercase letter ends up unclassified.
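As a minimal sketch of that suggestion, the variant below upper-cases only the first character for the comparison and stores the word unchanged. The function name filter_words_ignore_case and the sample list are made up for illustration; they are not from the question.

```python
def filter_words_ignore_case(words):
    """Split words into A-L, M-Z and unclassified, ignoring the case of the first letter."""
    al, mz, unclassified = [], [], []
    for word in words:
        # word[:1] is "" for an empty string, so this never raises IndexError
        first = word[:1].upper()
        if "A" <= first <= "L":
            al.append(word)  # keep the word in its original case
        elif "M" <= first <= "Z":
            mz.append(word)
        else:
            unclassified.append(word)  # digits, punctuation, empty strings, ...
    return al, mz, unclassified


# Hypothetical sample data
al, mz, unclassified = filter_words_ignore_case(
    ["Al", "Bob", "Todd", "Zack", "todd", "zack", "42"]
)
print(al)            # ['Al', 'Bob']
print(mz)            # ['Todd', 'Zack', 'todd', 'zack']
print(unclassified)  # ['42']
```

Comparing only the first character keeps the original spelling in the result lists, which may or may not be what you want.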

CodePudding user response:

I guess the main problem is that you are only checking for uppercase letters, so the comparison fails whenever match[0] is lowercase.

Also, you are printing on every iteration. Wait for the whole loop to finish and then print once, like this:

AL= []
MZ = []
for match in matches:
    if ((match[0].lower() >= "a") and (match[0].lower() <= "l")):
        AL.append(match)

    elif ((match[0].lower() >= "m" and match[0].lower() <= "z")):
        MZ.append(match)
        
print("Words between A-L are:  ", AL)
print("Words between M-Z are: ", MZ)

If this still doesn't work, please share the matches object too. Also, your code doesn't account for the situation where neither condition is true.
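Here is a rough sketch of how the "neither condition" case could be covered, along with a check for an empty matches list (one possible reason for seeing no output at all, since the loop body would never run). The sample data and the name other are assumptions for illustration only.

```python
# Hypothetical sample data; in the question, matches comes from the asker's own extraction step.
matches = ["alice", "Bob", "", "3M", "Zoe"]

AL, MZ, other = [], [], []
for match in matches:
    first = match[:1].lower()  # "" for an empty string, so no IndexError
    if "a" <= first <= "l":
        AL.append(match)
    elif "m" <= first <= "z":
        MZ.append(match)
    else:
        other.append(match)  # neither condition was true

if not matches:
    print("matches is empty, so the loop body never ran")
print("Words between A-L are:", AL)
print("Words between M-Z are:", MZ)
print("Words in neither range:", other)
```

Printing the third list makes it obvious when words are silently falling through both checks.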

CodePudding user response:

You have to handle letter case here. The example below upper-cases each word before comparing, so the stored words are upper-case too:

matches = ['Arbre', 'Amie', 'Maison', 'Ligne', 'Zebre', 'Maths']

AL = []
MZ = []
for match in matches:
    match = match.upper()
    if ((match[0] >= "A") and (match[0] <= "L")):
        AL.append(match)
    elif ((match[0] >= "M" and match[0] <= "Z")):
        MZ.append(match)
        
print("Words between A-L are:  ", AL)
print("Words between M-Z are: ", MZ)

output is:

Words between A-L are:   ['ARBRE', 'AMIE', 'LIGNE']
Words between M-Z are:  ['MAISON', 'ZEBRE', 'MATHS']