Python - How can I count the values in a Pandas DataFrame column that don't match a string?

I have a Pandas DataFrame like this:

ID    ID_CHILD
--    -----------
0     'C40998602'
1     'A25590024'
2     '         '
3     '         '
4     'B65217893'
5     '         '
6     'A81247804'

The following code prints the number of records whose ID_CHILD is blank (nine space characters) and the total number of records in the DataFrame:

print("Number of records without child ID: ", dataFrame['ID_CHILD'].value_counts()['         '])
print("Total number of records           : ", dataFrame['ID'].count())

# Output:
# Number of records without child ID: 3
# Total number of records           : 7

I need to print one more line, the opposite of the blank count, showing the number of records whose ID_CHILD is not blank:

# Number of records with child ID: 4

Is there a similar method that returns the number of records that do not match the nine blank spaces?

CodePudding user response：

You can try this, though there is probably something better out there. It converts whitespace-only values to missing values and counts what is left:

dataFrame['ID_CHILD'].replace(r'^\s*$', pd.NA, regex=True).notna().sum()
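
A self-contained sketch of this idea, assuming the blank entries contain nothing but whitespace. The sample frame is rebuilt from the question (values stored without the quotes), and the names `blank_as_na`, `with_child` and `without_child` are only illustrative:

```python
import pandas as pd

# Sample data rebuilt from the question; blanks are nine space characters.
dataFrame = pd.DataFrame({
    "ID": range(7),
    "ID_CHILD": ["C40998602", "A25590024", " " * 9, " " * 9,
                 "B65217893", " " * 9, "A81247804"],
})

# Turn whitespace-only (or empty) strings into missing values.
blank_as_na = dataFrame["ID_CHILD"].replace(r"^\s*$", pd.NA, regex=True)

# Count the values that survived and the ones that became missing.
with_child = int(blank_as_na.notna().sum())
without_child = int(blank_as_na.isna().sum())

print("Number of records with child ID   : ", with_child)
print("Number of records without child ID: ", without_child)
```

Matching on `^\s*$` rather than on a single space keeps codes that merely contain a space from being treated as blank.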

CodePudding user response：

dataFrame[dataFrame['ID_CHILD'] != '         '].shape[0]

This keeps only the rows where the ID_CHILD column is not nine blank spaces, then takes the row count (the first element of shape) of the filtered DataFrame.
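
The comparison itself produces a boolean Series, so it can also be summed directly without building the filtered frame. A minimal sketch, assuming the blank placeholder is always exactly nine spaces; the sample data is rebuilt from the question and `BLANK` and `has_child` are illustrative names:

```python
import pandas as pd

# Sample data rebuilt from the question; blanks are nine space characters.
dataFrame = pd.DataFrame({
    "ID": range(7),
    "ID_CHILD": ["C40998602", "A25590024", " " * 9, " " * 9,
                 "B65217893", " " * 9, "A81247804"],
})

BLANK = " " * 9  # the nine-space placeholder from the question

# Boolean Series: True where the row has a child ID.
has_child = dataFrame["ID_CHILD"] != BLANK

# True counts as 1 when summed, so the sum is the number of matching rows.
with_child = int(has_child.sum())
without_child = int((~has_child).sum())

print("Number of records with child ID   : ", with_child)
print("Number of records without child ID: ", without_child)
```

If the number of spaces can vary, comparing `dataFrame["ID_CHILD"].str.strip()` against an empty string is a looser alternative.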

CodePudding user response：

Try:

print("Number of records without child ID: ", dataFrame['ID_CHILD'].str.count(r'\s+').sum())
print("Number of records with child ID   : ", dataFrame['ID_CHILD'].str.count(r'\S+').sum())
print("Total number of records           : ", dataFrame['ID'].count())

Result:

Number of records without child ID:  3
Number of records with child ID   :  4
Total number of records           :  7
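
One caveat with this approach: `str.count` returns the number of regex matches in each value, not a yes/no flag per row, so the sums equal row counts only when every value is a single unbroken run of characters, as in the question's data. The sketch below illustrates the difference; the value `'AB 123'` is hypothetical and not part of the question:

```python
import pandas as pd

# Two values shaped like the question's data, plus one hypothetical value
# that contains an embedded space.
ids = pd.Series(["C40998602", " " * 9, "AB 123"])

# Number of non-whitespace runs in each value.
runs = ids.str.count(r"\S+")

# Per-row flag: does the value contain any non-whitespace character?
rows = ids.str.contains(r"\S")

print("Sum of match counts :", int(runs.sum()))
print("Rows with a child ID:", int(rows.sum()))
```

Here the first sum is 3 because `'AB 123'` contributes two matches, while only 2 rows actually hold an ID.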