How can I check a string for two letters or more?

I am pulling data with Python from a table that changes often, and the method I am using is not ideal. What I would like is a way to keep every string that contains at most one letter and leave out anything that has two or more.

An example of data I might get:

115 19A6 HYS8 568

In this example, I would like to pull 115, 19A6, and 568.

Currently I am using the isdigit() method to check whether each string consists only of digits. This also filters out the values that contain a single letter, which works for some purposes but is less than ideal.
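A quick check on the sample data shows the limitation. This is only a minimal sketch, and the names sample and digits_only are made up for illustration:

```python
sample = ["115", "19A6", "HYS8", "568"]

# str.isdigit() is True only when every character is a digit,
# so a single letter is enough to reject the whole string.
digits_only = [s for s in sample if s.isdigit()]
print(digits_only)  # ['115', '568']
```

Here 19A6 is dropped along with HYS8, even though it has only one letter.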

CodePudding user response：

Try this:

string_list = ["115", "19A6", "HYS8", "568"]
output_list = []

for item in string_list:  # go through each string in the list
    letter_counter = 0
    for letter in item:  # go through the characters of one string
        if not letter.isdigit():  # count every character that is not a digit
            letter_counter += 1
    if letter_counter < 2:  # keep the string only if it has at most one non-digit
        output_list.append(item)

print(output_list)

Output:

['115', '19A6', '568']
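The same counting idea can be written more compactly. The sketch below is one possible variant, not the answer's own code: the helper name has_at_most_one_letter is hypothetical, and it assumes that only alphabetic characters should count, whereas the loop above counts every non-digit (so punctuation or spaces would also be counted there).

```python
def has_at_most_one_letter(text):
    """Return True if text contains zero or one alphabetic characters."""
    # Summing booleans counts how many characters are letters.
    return sum(ch.isalpha() for ch in text) <= 1


string_list = ["115", "19A6", "HYS8", "568"]
output_list = [item for item in string_list if has_at_most_one_letter(item)]
print(output_list)  # ['115', '19A6', '568']
```

With this version a value such as "1-2-3" would be kept, because a hyphen is not a letter. If such values should be rejected, counting with not ch.isdigit() instead restores the behavior of the loop.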

CodePudding user response：

Here is a one-liner with a regular expression:

import re

data = ["115", "19A6", "HYS8", "568"]
out = [string for string in data if len(re.sub(r"\d", "", string)) < 2]
print(out)

Output:

['115', '19A6', '568']

CodePudding user response：

This is an excellent case for regular expressions (regex), which are available through the re module in the standard library.

The code below follows this logic:

1. Define the dataset.
2. Compile a character pattern to be matched. In this case: zero or more digits, followed by zero or one uppercase letter, followed by zero or more digits up to the end of the string.
3. Use the filter function to detect matches in the data list and output the result as a list.

For example:

import re

data = ['115', '19A6', 'HYS8', '568']
rexp = re.compile(r'^\d*[A-Z]{0,1}\d*$')
result = list(filter(rexp.match, data))

print(result)

Output:

['115', '19A6', '568']
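Two edge cases may be worth handling, depending on the data: the pattern above also accepts an empty string, and it rejects lowercase letters. The sketch below is one way to cover both, under the assumptions that lowercase letters should count as letters and that empty values should be dropped. The extra sample values "7b2" and "" are added purely for illustration.

```python
import re

# Assumption: lowercase letters should be treated like uppercase ones.
pattern = re.compile(r"\d*[A-Za-z]?\d*")

data = ["115", "19A6", "HYS8", "568", "7b2", ""]

# fullmatch() requires the whole string to match, so ^ and $ are not needed.
# The "s and" check drops empty strings, which the pattern alone would accept.
result = [s for s in data if s and pattern.fullmatch(s)]
print(result)  # ['115', '19A6', '568', '7b2']
```

Using fullmatch() also avoids a subtlety of $, which can match just before a trailing newline, so a value like "115\n" would pass the original pattern but not this one.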