Home > Software engineering >  Pandas: How to count values of mixed character cases in different columns in Python?
Pandas: How to count values of mixed character cases in different columns in Python?

Time:10-10

I have a dataframe with different columns

COUNTRY COUNTRY1 COUNTRY2 hp
Italy uk france 50
uk france italy 150

I would like to count the countries in different columns by row(without using melt) to get and output like this:

Italy 2

France 2

Uk 2

I tried different options as loc and groupby but it doesn't work.

Thank you

CodePudding user response:

the suggestion of @Henry Yik is way more efficient, and in combination with a groupby can solve your problem. also, try this: def count_entries(df, col_name):

    country_count = {}
    
    # Extract column from DataFrame: col
    col = df[col_name]
    
    # Iterate over country column in DataFrame
    for entry in col:
        # If the country is in langs_count, add 1
        if entry in country_count.keys():
            country_count[entry] = country_count[entry]   1 
        # Else add the language to langs_count, set the value to 1
        else:
            country_count[entry] = 1

    return country_count

# Call count_entries(): result
for col_name in df.columns():
    result = count_entries(df, co_name)

CodePudding user response:

You can filter the COUNTRY* columns by .filter(), then .stack() to a single column with the country names.

As your country names have mixed character letter cases, you can change country names to Title case (first letter capital letter) by str.title() before we count the occurences with .value_counts(), as follows:

df.filter(like='COUNTRY').stack().str.title().value_counts()

Result:

Italy     2
Uk        2
France    2
dtype: int64
  • Related