DataFrame:
I want to compare these two columns and get the counts of matched and unmatched rows.
The result should look like this:
Matched = 3
Un matched = 2
CodePudding user response:
First compare the values with Series.eq, count them with Series.value_counts, then rename the True/False index labels:
s = (df.input_number.eq(df.org_number)
       .value_counts()
       .rename({True: 'match', False: 'no match'}))
If you need a DataFrame:
sdf1 = (df.input_number.eq(df.org_number)
          .value_counts()
          .rename({True: 'match', False: 'no match'})
          .rename_axis('state')
          .reset_index(name='count'))
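For a self-contained run of the approach above, here is a minimal sketch. The sample values are borrowed from another answer in this thread and are only an assumption about what the original data looks like; the names `is_match`, `counts` and `counts_df` are illustrative.

```python
import pandas as pd

# Sample data (assumed; the question's actual frame is not shown)
df = pd.DataFrame({
    'input_number': [123, 253, 458, 479, 1564],
    'org_number': [1234, 253, 458, 478, 1564],
})

# Element-wise comparison yields a boolean Series
is_match = df['input_number'].eq(df['org_number'])

# Count True/False occurrences, then relabel the index
counts = is_match.value_counts().rename({True: 'match', False: 'no match'})

# Same information as a two-column DataFrame
counts_df = counts.rename_axis('state').reset_index(name='count')
print(counts_df)
```

Note that value_counts omits a label that never occurs, so if every row matches there will be no 'no match' row at all.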
CodePudding user response:
Try this:
import pandas as pd
df = pd.DataFrame(data={'input_number':[123,253,458,479,1564],'org_number':[1234,253,458,478,1564]})
matched = df[df['input_number'] == df['org_number']].shape[0]
un_matched = df[df['input_number'] != df['org_number']].shape[0]
print("Matched = {}\nUn matched = {}".format(matched,un_matched))
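A slightly shorter variant of the same idea, sketched here as an alternative rather than a correction: since True counts as 1, summing the boolean mask gives the number of matches directly, and summing its negation gives the rest. The variable name `mask` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'input_number': [123, 253, 458, 479, 1564],
    'org_number': [1234, 253, 458, 478, 1564],
})

# Boolean mask: True where the two columns agree
mask = df['input_number'] == df['org_number']

# True sums as 1, False as 0
matched = int(mask.sum())
un_matched = int((~mask).sum())

print(f"Matched = {matched}\nUn matched = {un_matched}")
```

This avoids building two filtered copies of the frame just to read their row counts.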
CodePudding user response:
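Another way: compare the columns, label each row as matched or unmatched with np.where, then apply get_dummies and sum down the columns. The column names below (code_207, code_207a) are the ones used in this answer; substitute your own. Code below:
import numpy as np
pd.get_dummies(np.where(df['code_207']==df['code_207a'],'matched','unmatched')).sum(0)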
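Adapted to the column names used in the other answers, the get_dummies approach might look like the sketch below. The sample data is assumed, and `labels` and `summary` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'input_number': [123, 253, 458, 479, 1564],
    'org_number': [1234, 253, 458, 478, 1564],
})

# Label each row according to whether the two columns agree
labels = np.where(df['input_number'] == df['org_number'], 'matched', 'unmatched')

# One indicator column per label; summing each column counts the rows
summary = pd.get_dummies(labels).sum(axis=0)
print(summary)
```

As with value_counts, a label that never occurs produces no column, so it will be absent from the summary.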