Pandas - Compare 2 columns in a dataframe and return count

Time:03-08

DataFrame:

[image of the DataFrame not preserved]

I want to compare these 2 columns and get the count of matched and unmatched rows.

The result should look like:

Matched = 3
Un matched = 2

CodePudding user response:

First compare the values with Series.eq, count the results with Series.value_counts, then rename the True and False index labels:

s = (df.input_number.eq(df.org_number)
                    .value_counts()
                    .rename({True: 'match', False: 'no match'}))

If you need a DataFrame:

df1 = (df.input_number.eq(df.org_number)
                      .value_counts()
                      .rename({True: 'match', False: 'no match'})
                      .rename_axis('state')
                      .reset_index(name='count'))
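
To see both forms run end to end, here is a minimal sketch. The sample data is an assumption, since the asker's frame was only posted as an image. The reindex step is also an addition to the answer above: value_counts omits a label that never occurs, so reindexing with fill_value=0 keeps both rows even when every row matches (or none does).

```python
import pandas as pd

# Hypothetical sample data; the original frame was only shown as an image.
df = pd.DataFrame({'input_number': [123, 253, 458, 479, 1564],
                   'org_number': [1234, 253, 458, 478, 1564]})

s = (df['input_number'].eq(df['org_number'])
       .value_counts()
       # Keep both labels even if one of them never occurs.
       .reindex([True, False], fill_value=0)
       .rename({True: 'match', False: 'no match'}))

# Same counts as a two-column DataFrame.
df1 = s.rename_axis('state').reset_index(name='count')
print(df1)
```

With this sample data the counts come out as 3 matches and 2 non-matches, which agrees with the result asked for in the question.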

CodePudding user response:

Try this:

import pandas as pd

df = pd.DataFrame({'input_number': [123, 253, 458, 479, 1564],
                   'org_number': [1234, 253, 458, 478, 1564]})
# Boolean mask: True where the two columns are equal
mask = df['input_number'] == df['org_number']
matched, un_matched = mask.sum(), (~mask).sum()
print("Matched = {}\nUn matched = {}".format(matched, un_matched))

CodePudding user response:

Another way: use a boolean comparison to assign each row a match status with np.where, then apply pd.get_dummies and sum down the columns. Code below:

pd.get_dummies(np.where(df['code_207']==df['code_207a'],'matched','unmatched')).sum(0)
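
The one-liner above assumes numpy and pandas are already imported as np and pd. A self-contained sketch follows. The sample data is made up for illustration; only the column names code_207 and code_207a come from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data using the column names from the answer above.
df = pd.DataFrame({'code_207': [123, 253, 458, 479, 1564],
                   'code_207a': [1234, 253, 458, 478, 1564]})

# Label each row, one-hot encode the labels, then sum each indicator column.
status = np.where(df['code_207'] == df['code_207a'], 'matched', 'unmatched')
counts = pd.get_dummies(status).sum(axis=0)
print(counts)
```

Splitting the expression into status and counts makes it easier to inspect the per-row labels before they are summed.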