Home > front end >  Pandas find an exact number of value among several
Pandas find an exact number of value among several

Time:09-21

I am trying write a code that can read from a dataframe that has thousands of rows like:

name      value 
abc123     wrd
abc123     wrd
abc123     wrd
abc987     wrd
abc987     wrd
abc987     bbs
..         ..

I want the code to show me the name that has only the value wrd, for example in this case after the code works it should only show the result as abc123, since abc987 has both wrd and bss as value.

What I did so far is:

import pandas as pd

result = df[df['value'].isin(['wrd', 'bbs'])]

this results in both abc123 and abc987

result = df[df['value'].isin(['wrd'])]

this results in again both abc123 and abc987

What should I use instead to get name of the ones that only has the value as wrd?

CodePudding user response:

Fitler out group if not equal wrd with test not equal scalar by Series.ne and not matched names by Series.isin:

result = df[~df['name'].isin(df.loc[df['value'].ne('wrd'), 'name'])]
print (result)
     name value
0  abc123   wrd
1  abc123   wrd
2  abc123   wrd

Details:

print (df.loc[df['value'].ne('wrd'), 'name'])
5    abc987
Name: name, dtype: object()

print (~df['name'].isin(df.loc[df['value'].ne('wrd'), 'name']))
0     True
1     True
2     True
3    False
4    False
5    False
Name: name, dtype: bool()

Or use GroupBy.transform with GroupBy.all for test if match all values:

result = df[df['value'].eq('wrd').groupby(df['name']).transform('all')]
print (result)
     name value
0  abc123   wrd
1  abc123   wrd
2  abc123   wrd

print (df['value'].eq('wrd').groupby(df['name']).transform('all'))
0     True
1     True
2     True
3    False
4    False
5    False
Name: name, dtype: bool

CodePudding user response:

try:

df
    name    value
0   abc123  wrd
1   abc123  wrd
2   abc123  wrd
3   abc987  wrd
4   abc987  wrd
5   abc987  bbs

df1 = df.groupby('name')['value'].apply(lambda x: list(set(x))).reset_index()
df1['value'] = df1['value'].map(lambda x: 1 if x == ['wrd'] else 0)

df1
    name    value
0   abc123  1
1   abc987  0

df1.loc[df1['value'].eq(1)]
    name    value
0   abc123  1
  • Related