Hello, I have a dataframe such as:
Species  COL1  COL2  COL3  COL4  COL5
SP1      0     0     0     1-2   0-1-2
SP2      1-2   2     0     1     0
SP3      0-1   1     2     0     1-2
and I would like to add new columns that count, for each row, the number of occurrences of specific values, such as:
Species  COL1  COL2  COL3  COL4  COL5   count_0  count_1-2  count_0-1-2  count_1  count_2
SP1      0     0     0     1-2   0-1-2  3        1          1            0        0
SP2      1-2   2     0     1     0      2        1          0            1        1
SP3      0-1   1     2     0     1-2    1        1          0            2        1
Does someone have an idea, please?
CodePudding user response:
You can use the value_counts() method in the pandas library to count the number of occurrences of each unique value in each row of your dataframe.
# Loop through each row of the dataframe
for index, row in df.iterrows():
    # Leave out the Species label so it is not counted as a value, and
    # cast the cells to str so that 0 and '0' are counted together
    values = row.drop('Species').astype(str)
    # Count the number of occurrences of each unique value in the row
    counts = values.value_counts()
    # Add the count values to the current row of the dataframe;
    # get() returns 0 when the value does not occur in the row
    df.loc[index, 'count_0'] = counts.get('0', 0)
    df.loc[index, 'count_1-2'] = counts.get('1-2', 0)
    df.loc[index, 'count_0-1-2'] = counts.get('0-1-2', 0)
    df.loc[index, 'count_1'] = counts.get('1', 0)
    df.loc[index, 'count_2'] = counts.get('2', 0)
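The same counts can probably be obtained without an explicit row loop. The sketch below is one possible variant, not the method from the answer above: it compares the data columns against each value of interest and sums the matches per row. The names value_cols and targets are made up for illustration, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question; every cell is a string.
df = pd.DataFrame({
    'Species': ['SP1', 'SP2', 'SP3'],
    'COL1': ['0', '1-2', '0-1'],
    'COL2': ['0', '2', '1'],
    'COL3': ['0', '0', '2'],
    'COL4': ['1-2', '1', '0'],
    'COL5': ['0-1-2', '0', '1-2'],
})

# Hypothetical helper names: the columns to scan and the values to count.
value_cols = ['COL1', 'COL2', 'COL3', 'COL4', 'COL5']
targets = ['0', '1-2', '0-1-2', '1', '2']

# Cast once to str so that numeric and string cells compare alike.
cells = df[value_cols].astype(str)

for value in targets:
    # eq() marks the matching cells; summing the booleans across
    # the columns gives the number of matches in each row.
    df[f'count_{value}'] = cells.eq(value).sum(axis=1)

print(df)
```

Because the comparison is an exact match, '0-1' is not counted as '0' or '1'; the new columns come out as integers.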
CodePudding user response:
Example
import pandas as pd

data = {'Species': {0: 'SP1', 1: 'SP2', 2: 'SP3'},
        'COL1': {0: '0', 1: '1-2', 2: '0-1'},
        'COL2': {0: '0', 1: '2', 2: '1'},
        'COL3': {0: '0', 1: '0', 2: '2'},
        'COL4': {0: '1-2', 1: '1', 2: '0'},
        'COL5': {0: '0-1-2', 1: '0', 2: '1-2'}}
df = pd.DataFrame(data)
Code
df1 = (df.set_index('Species').apply(lambda x: x.value_counts(), axis=1)
       .add_prefix('count_').fillna(0).astype('int'))
df1
         count_0  count_0-1  count_0-1-2  count_1  count_1-2  count_2
Species
SP1            3          0            1        0          1        0
SP2            2          0            0        1          1        1
SP3            1          1            0        1          1        1
To make the desired output, concatenate df and df1:
pd.concat([df.set_index('Species'), df1], axis=1)
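The concatenated result contains a column for every value that occurs (including count_0-1) in whatever order value_counts produced. If only the columns listed in the question are wanted, in that order, one option is to reindex the count columns before concatenating. This is a sketch under that assumption; the names wanted, indexed, counts and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question; every cell is a string.
df = pd.DataFrame({
    'Species': ['SP1', 'SP2', 'SP3'],
    'COL1': ['0', '1-2', '0-1'],
    'COL2': ['0', '2', '1'],
    'COL3': ['0', '0', '2'],
    'COL4': ['1-2', '1', '0'],
    'COL5': ['0-1-2', '0', '1-2'],
})

# Count columns requested in the question, in the requested order.
wanted = ['count_0', 'count_1-2', 'count_0-1-2', 'count_1', 'count_2']

indexed = df.set_index('Species')
counts = (indexed.apply(lambda row: row.value_counts(), axis=1)
          .add_prefix('count_')
          # Keep only the requested columns; a requested value that
          # never occurs would appear as NaN and is filled with 0 below.
          .reindex(columns=wanted)
          .fillna(0)
          .astype(int))

# Put Species back as a regular column next to the original data.
result = pd.concat([indexed, counts], axis=1).reset_index()
print(result)
```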