Count the number of specific values within columns for each row in pandas

Time:12-04

Hello, I have a dataframe such as:

Species COL1 COL2 COL3 COL4 COL5
SP1     0    0    0    1-2  0-1-2
SP2     1-2  2    0    1    0
SP3     0-1  1    2    0    1-2 

and I would like to add new columns that count, for each row, the number of occurrences of specific unique values, such as:

Species COL1 COL2 COL3 COL4 COL5  count_0 count_1-2 count_0-1-2 count_1 count_2 
SP1     0    0    0    1-2  0-1-2 3       1         1           0       0
SP2     1-2  2    0    1    0     2       1         0           1       1
SP3     0-1  1    2    0    1-2   1       1         0           2       1

Does someone have an idea, please?

CodePudding user response:

You can use the value_counts() method in pandas to count the number of occurrences of each unique value in each row of your dataframe. Because entries such as '1-2' are strings, the counts have to be looked up by string keys ('0', '1', '2'), and the Species column should be left out of the count.

import pandas as pd

# Columns that hold the values to count (everything except 'Species')
value_cols = ['COL1', 'COL2', 'COL3', 'COL4', 'COL5']

# Loop through each row of the value columns
for index, row in df[value_cols].iterrows():
    # Count the number of occurrences of each unique value in the row,
    # comparing everything as strings
    counts = row.astype(str).value_counts()

    # Add the count values to the current row of the dataframe;
    # .get() returns 0 when the value does not occur in this row
    for value in ['0', '1-2', '0-1-2', '1', '2']:
        df.loc[index, f'count_{value}'] = counts.get(value, 0)

# Columns created this way are floats; convert them to integers
count_cols = [c for c in df.columns if c.startswith('count_')]
df[count_cols] = df[count_cols].astype(int)
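
If only a fixed list of values needs counting, the row loop can probably be avoided. The following is a minimal sketch, assuming all values are stored as strings and that the columns match the question's example; the names value_cols and targets are illustrative and not part of the original answer. It compares the value columns against each target and sums the matches per row.

```python
import pandas as pd

# Sample data copied from the question; all values are kept as strings.
df = pd.DataFrame({
    'Species': ['SP1', 'SP2', 'SP3'],
    'COL1': ['0', '1-2', '0-1'],
    'COL2': ['0', '2', '1'],
    'COL3': ['0', '0', '2'],
    'COL4': ['1-2', '1', '0'],
    'COL5': ['0-1-2', '0', '1-2'],
})

# Illustrative names: the columns to scan and the values to count.
value_cols = ['COL1', 'COL2', 'COL3', 'COL4', 'COL5']
targets = ['0', '1-2', '0-1-2', '1', '2']

for target in targets:
    # eq() yields a boolean frame; summing along axis=1 counts matches per row.
    df[f'count_{target}'] = df[value_cols].eq(target).sum(axis=1)

print(df)
```

Matching here is exact, so for SP3 the entry '0-1' is not counted as '1' and count_1 comes out as 1.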

CodePudding user response:

Example

import pandas as pd

data = {'Species': {0: 'SP1', 1: 'SP2', 2: 'SP3'},
        'COL1': {0: '0', 1: '1-2', 2: '0-1'},
        'COL2': {0: '0', 1: '2', 2: '1'},
        'COL3': {0: '0', 1: '0', 2: '2'},
        'COL4': {0: '1-2', 1: '1', 2: '0'},
        'COL5': {0: '0-1-2', 1: '0', 2: '1-2'}}
df = pd.DataFrame(data)

Code

df1 = (df.set_index('Species').apply(lambda x: x.value_counts(), axis=1)
       .add_prefix('count_').fillna(0).astype('int'))

df1

         count_0  count_0-1  count_0-1-2  count_1  count_1-2  count_2
Species
SP1            3          0            1        0          1        0
SP2            2          0            0        1          1        1
SP3            1          1            0        1          1        1

To get the desired output, concatenate df and df1:

pd.concat([df.set_index('Species'), df1], axis=1)
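
The concatenated result contains a column for every value that occurs, including count_0-1, which the question's desired output does not list. One possible way to keep only the requested columns, in the requested order, is to reindex the counts before concatenating. This is a sketch under the assumption that the sample data above is used; the wanted list and the names counts and result are illustrative and not part of the original answer.

```python
import pandas as pd

data = {'Species': {0: 'SP1', 1: 'SP2', 2: 'SP3'},
        'COL1': {0: '0', 1: '1-2', 2: '0-1'},
        'COL2': {0: '0', 1: '2', 2: '1'},
        'COL3': {0: '0', 1: '0', 2: '2'},
        'COL4': {0: '1-2', 1: '1', 2: '0'},
        'COL5': {0: '0-1-2', 1: '0', 2: '1-2'}}
df = pd.DataFrame(data)

# Per-row value counts, as in the answer above.
counts = (df.set_index('Species')
            .apply(lambda row: row.value_counts(), axis=1)
            .fillna(0).astype(int)
            .add_prefix('count_'))

# Illustrative list of the columns asked for in the question.
wanted = ['count_0', 'count_1-2', 'count_0-1-2', 'count_1', 'count_2']

# reindex keeps only the wanted columns; fill_value=0 covers any value
# that never occurs in the data.
counts = counts.reindex(columns=wanted, fill_value=0)

# Join the counts to the original columns and restore 'Species' as a column.
result = pd.concat([df.set_index('Species'), counts], axis=1).reset_index()
print(result)
```

Passing fill_value=0 means a requested value that is absent from the data still gets a column of zeros instead of NaN.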