How to find maximum number of occurrences for a character in a cell in a column in a CSV table in Py

Time:02-09

I'm trying to find the maximum number of occurrences of '/' (a slash) in any single cell of a column in a CSV file. A sample of the table is below; the real file has hundreds of rows.

Person Full Name    Person CRD Number    ID Number
Jack Johnson        32 / 54 / 57 / 87    5686
John Johnsen        11 / 22              6589
Luke Peterson       34                   6978
Kyle Garcia         63 / 24 / 83         8957

Here's my code:

import pandas as pd

data = '/Users/myname/Downloads/tabularShell.csv'

df = pd.read_csv(data, index_col=0)

df1 = pd.DataFrame(df)['Person CRD Number']

df2 = df1.value_counts('/')

print(df2)

The output should be 3 because the maximum number of occurrences of '/' is 3 in a cell in the "Person CRD Number" column in the table shown above.

Thank you!

CodePudding user response:

You can use .str.count for this. For each cell in the column, it returns how many times the specified character occurs in that cell. .max() then selects the largest of those counts.

>>> df['Person CRD Number'].str.count('/').max()
3
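
For anyone who wants to try this without the original file, here is a minimal self-contained sketch. It assumes the CSV is comma-separated with the three headers shown in the question; the inline csv_text is a hypothetical stand-in for tabularShell.csv, and dtype=str is my own addition so that every cell is read as text. Note that .str.count treats its argument as a regular expression, which makes no difference for '/' but would matter for characters such as '.' or '|'.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for tabularShell.csv (assumed comma-separated)
csv_text = """Person Full Name,Person CRD Number,ID Number
Jack Johnson,32 / 54 / 57 / 87,5686
John Johnsen,11 / 22,6589
Luke Peterson,34,6978
Kyle Garcia,63 / 24 / 83,8957
"""

# dtype=str keeps every cell as text, so the .str accessor is always available
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Count the slashes in each cell, then take the largest count
slash_counts = df['Person CRD Number'].str.count('/')
max_slashes = int(slash_counts.max())
print(max_slashes)
```

If the real column contains empty cells, .str.count returns NaN for them and .max() skips those values by default.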

CodePudding user response:

print(max(df['Person CRD Number'].str.count('/')))

output:

>>> 3

CodePudding user response:

This approach gives you more control if you want to do further operations on the counts:

import pandas as pd

# Ignore these lines; they just build the DataFrame
data = [['Jack Johnson', '32 / 54 / 57 / 87', '5686'],
        ['John Johnsen', '11 / 22', '6589'],
        ['Luke Peterson', '34', '6978']]
df = pd.DataFrame(data)
df.columns = ['Person Full Name', 'Person CRD Number', 'ID Number']

# Define a small function to count the char in a string
def count_char(string, char=r'/'):
    return string.count(char)

# Apply the function to the CRD number and store in a new column
df['count'] = df['Person CRD Number'].apply(count_char)

# Get the maximum from the count
print(df['count'].max())
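
As one example of a further operation on the stored counts, the sketch below looks up which row has the most slashes using idxmax. The sample data is copied from the question, and the variable name top_row is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Person Full Name': ['Jack Johnson', 'John Johnsen', 'Luke Peterson', 'Kyle Garcia'],
    'Person CRD Number': ['32 / 54 / 57 / 87', '11 / 22', '34', '63 / 24 / 83'],
    'ID Number': ['5686', '6589', '6978', '8957'],
})

# Store the per-cell slash count in a new column
df['count'] = df['Person CRD Number'].str.count('/')

# idxmax returns the index label of the first row holding the maximum count
top_row = df.loc[df['count'].idxmax()]
print(top_row['Person Full Name'], top_row['count'])
```

Keeping the counts in a column also makes it easy to filter or sort on them later, for instance with df[df['count'] > 0].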