Selecting rows of data based on a different column using Python

Time:03-30

I am trying to filter data from a .csv file that has several columns, as shown below:

[image: input data]

In the desired result, shown below, the Annotation column should contain "human" or "homosapians":

[image: desired result]

CodePudding user response:

Use Series.isin to find the extId groups that contain at least one matching Annotation value, then filter the original DataFrame by those extId values:

df = df[df['extId'].isin(df.loc[df['Annotation'].isin(['human', 'homosapians']), 'extId'])]
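
To make the two steps easier to follow, here is a minimal sketch of the same idea split into named intermediate variables. The sample rows are hypothetical (the original data was only posted as an image); only the column names extId and Annotation come from the answer above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df = pd.DataFrame({
    "extId": [1, 1, 2, 2, 3],
    "Annotation": ["human", "mouse", "rat", "mouse", "homosapians"],
})

# Step 1: collect the extId values of rows with a matching annotation.
matching_ids = df.loc[df["Annotation"].isin(["human", "homosapians"]), "extId"]

# Step 2: keep every row whose extId is among those values,
# including rows in the same group with a different annotation.
result = df[df["extId"].isin(matching_ids)]
print(result)
```

With this sample, the extId 2 group is dropped because none of its rows match, while the "mouse" row of extId 1 is kept because it shares a group with a "human" row.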

Or use GroupBy.transform with GroupBy.any to test whether at least one value in each extId group matches:

df = df[df['Annotation'].isin(['human', 'homosapians']).groupby(df['extId']).transform('any')]
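
The one-liner above can be unpacked as follows. This is only a sketch on made-up data (the string extId values are an assumption for illustration), meant to show what the intermediate boolean masks look like.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df = pd.DataFrame({
    "extId": ["a", "a", "b", "c", "c"],
    "Annotation": ["mouse", "human", "rat", "homosapians", "yeast"],
})

# Row-level mask: True where the annotation itself matches.
is_match = df["Annotation"].isin(["human", "homosapians"])

# Group-level mask: transform('any') broadcasts "does any row in this
# extId group match?" back to every row of that group.
keep = is_match.groupby(df["extId"]).transform("any")

result = df[keep]
print(result)
```

Because transform returns a Series aligned with the original index, the mask can be used directly for boolean indexing without a second isin lookup.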

CodePudding user response:

df[(df['Annotation'] == 'human') | (df['Annotation'] == 'homosapians')]
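
Note that this expression keeps only the rows whose own Annotation matches, rather than whole extId groups. A short sketch on hypothetical data, also showing that the two equality tests can be written with Series.isin:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answers.
df = pd.DataFrame({
    "extId": [1, 1, 2, 3],
    "Annotation": ["human", "mouse", "rat", "homosapians"],
})

# Row-wise filter using two equality tests combined with | (element-wise OR).
by_equality = df[(df["Annotation"] == "human") | (df["Annotation"] == "homosapians")]

# The same filter expressed with isin.
by_isin = df[df["Annotation"].isin(["human", "homosapians"])]

print(by_equality)
```

Here the "mouse" row of extId 1 is not returned, so this form fits when only the matching rows themselves are wanted.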