How to check if all words in a list are all caps in Python

I want to know whether every word in a list is in all caps, that is, whether all of its alphabetic characters are uppercase.

I tried this on a simple string to understand the behavior of isupper():

>>> 'FDSFS BBIBIFBIBWEF ASDASD 112,32:/'.isupper()    
True

Then I split the words of that sentence into a list:

>>> sent = ['FDSFS','BBIBIFBIBWEF','ASDASD', '112','32:/']
>>> all([word.isupper() for word in sent])
False

So I checked what the list passed to all() contained:

>>> [word.isupper() for word in sent]
[True, True, True, False, False]

Strangely, isupper() returns False for strings that contain no letters (that is, strings made up only of digits and special characters), but returns True as soon as such a string contains a single uppercase letter:

>>> '123'.isupper()
False
>>> '123A'.isupper()
True
>>> '123?A'.isupper()
True
>>> '123?'.isupper()
False
>>> ''.isupper()
False

Q1. Is there a design decision behind this behavior of isupper()?
Q2. How can I achieve what I want in the most Pythonic and minimal way? (Is there another function that only checks whether all the letters in the input string are uppercase and ignores special characters, digits, and empty strings? Or do I have to write one from scratch?)

CodePudding user response：

Q1:

As mentioned in the docs:

Return True if all cased characters in the string are uppercase and there is at least one cased character, False otherwise.

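As the quote says, only cased characters (letters) are considered; digits and punctuation are ignored. The string must also contain at least one cased character, which is why '123' and '' return False.

For ASCII input, the behavior is roughly equivalent to this:

import string
def upper(s):
    # Keep only the letters; digits and punctuation are ignored
    letters = [ch for ch in s if ch in string.ascii_letters]
    return bool(letters) and all(ch in string.ascii_uppercase for ch in letters)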

Q2:

Solution 1:

One way to solve this is to call str.upper() and check whether the result equals the original string:

>>> sent = ['FDSFS','BBIBIFBIBWEF','ASDASD', '112','32:/']
>>> all([word.upper() == word for word in sent])
True
>>>
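
As a minimal sketch, the comparison above can be wrapped in a small helper. The name all_words_upper is hypothetical, not something from the standard library. Words without letters (digits, punctuation, empty strings) count as passing, which is the behavior the question asks for.

```python
def all_words_upper(words):
    """Return True if no word changes when upper-cased.

    Words without letters (digits, punctuation, '') count as passing.
    all_words_upper is a hypothetical helper name used for illustration.
    """
    return all(word == word.upper() for word in words)


sent = ['FDSFS', 'BBIBIFBIBWEF', 'ASDASD', '112', '32:/']
print(all_words_upper(sent))              # True
print(all_words_upper(sent + ['Mixed']))  # False
print(all_words_upper(['']))              # True
```

Passing a generator expression to all() instead of a list lets it stop at the first word that fails.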

Solution 2:

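Or check that no word is lowercase. Be aware that str.islower() is True only when every cased character is lowercase, so this rejects fully lowercase words but still accepts a mixed-case word such as 'Hello':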

>>> sent = ['FDSFS','BBIBIFBIBWEF','ASDASD', '112','32:/']
>>> all([(not word.islower()) for word in sent])
True
>>>

Just realized @DaniMesejo posted this, credit to him.
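
A stricter variant tests each character instead of the whole word, so that a single lowercase letter is enough to fail. The sketch below uses a hypothetical helper, has_no_lowercase, to show how it differs from negating str.islower() on a mixed-case word.

```python
def has_no_lowercase(word):
    # True when no individual character is lowercase.
    # has_no_lowercase is a hypothetical name used for illustration.
    return not any(ch.islower() for ch in word)


print(not 'Hello'.islower())      # True: 'Hello' is not entirely lowercase
print(has_no_lowercase('Hello'))  # False: 'e', 'l', 'l', 'o' are lowercase

sent = ['FDSFS', 'BBIBIFBIBWEF', 'ASDASD', '112', '32:/']
print(all(has_no_lowercase(word) for word in sent))  # True
```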

Solution 3:

This could be done very elegantly with regex too:

>>> import re
>>> sent = ['FDSFS','BBIBIFBIBWEF','ASDASD', '112','32:/']
>>> expr = re.compile('^[^a-z]*$')
>>> all(map(expr.search, sent))
True
>>>

Using map with a compiled regular expression might be more efficient.
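
Note that the character class [^a-z] only knows about ASCII letters, so a non-ASCII lowercase letter passes the regex check. The following sketch, which assumes Unicode input matters for your use case, compares the regex with the str.upper() comparison. It uses fullmatch, which makes the ^ and $ anchors unnecessary; the name no_ascii_lower is made up for this example.

```python
import re

# Matches strings that contain no ASCII lowercase letters
no_ascii_lower = re.compile(r'[^a-z]*')

sent = ['FDSFS', 'BBIBIFBIBWEF', 'ASDASD', '112', '32:/']
print(all(no_ascii_lower.fullmatch(word) for word in sent))  # True

word = '\u00e9'  # 'é', a lowercase letter outside the ASCII range
print(bool(no_ascii_lower.fullmatch(word)))  # True: the regex does not notice it
print(word == word.upper())                  # False: str.upper() does
```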

CodePudding user response：

Use the negation of str.islower (this rejects words that are entirely lowercase; a mixed-case word such as 'Hello' still passes):

sent = ['FDSFS','BBIBIFBIBWEF','ASDASD', '112','32:/']
result = [(not word.islower()) for word in sent]
print(result)

Output

[True, True, True, True, True]

The behaviour of isupper is explained in the documentation:

Return True if all cased characters in the string are uppercase and there is at least one cased character, False otherwise.

(Emphasis mine)

A cased character is:

Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase).

See here for a nice explanation on how to identify cased characters.
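
To see which characters count as cased under the definition quoted above, one option is the standard unicodedata module. This is only an illustration of the quoted definition, not a copy of how CPython implements isupper(); the names CASED_CATEGORIES and cased_chars are made up for the example.

```python
import unicodedata

# General categories named in the quoted definition
CASED_CATEGORIES = {'Lu', 'Ll', 'Lt'}


def cased_chars(s):
    # Characters whose Unicode general category is Lu, Ll, or Lt
    return [ch for ch in s if unicodedata.category(ch) in CASED_CATEGORIES]


for text in ['123', '123?A', '32:/', 'FDSFS']:
    # Show the cased characters next to the result of isupper()
    print(repr(text), cased_chars(text), text.isupper())
```

Strings for which cased_chars returns an empty list, such as '123' and '32:/', are the ones where isupper() is False because of the "at least one cased character" rule.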

CodePudding user response：

If sent is a single string and you want to check that it contains only uppercase ASCII letters (and whitespace), you could use a regex:

import re
bool(re.fullmatch(r'[A-Z\s]+', sent))

Or, for a single word with no whitespace, add str.isalpha to the check:

sent.isupper() and sent.isalpha()
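
Because str.isalpha() is False for whitespace, digits, and punctuation, the combined check is best applied word by word. A rough sketch, assuming sent is a single string as in this answer (the helper name only_upper_letters is made up for illustration):

```python
def only_upper_letters(text):
    # Every whitespace-separated word must consist of letters only,
    # all of them uppercase. An empty or blank string returns False.
    words = text.split()
    return bool(words) and all(w.isupper() and w.isalpha() for w in words)


print(only_upper_letters('FDSFS BBIBIFBIBWEF ASDASD'))      # True
print(only_upper_letters('FDSFS BBIBIFBIBWEF ASDASD 112'))  # False
print(only_upper_letters('FDSFS bbib'))                     # False
```

Unlike the checks in the other answers, this variant rejects words made of digits or punctuation, so it only fits if that stricter behavior is what you want.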