I am working with pandas in Python and trying to select rows based on a condition. From the dataset below, I want every System group that contains at least one row with Type 1. System groups that don't contain Type 1 can be ignored.
System | Type |
---|---|
A | 1 |
A | 2 |
A | 2 |
A | 3 |
B | 1 |
B | 2 |
C | 2 |
D | 3 |
Required Output
System | Type |
---|---|
A | 1 |
A | 2 |
A | 2 |
A | 3 |
B | 1 |
B | 2 |
Systems A and B appear in the required output because each contains a Type 1 value. Groups C and D are ignored because they have no Type 1 rows. I am trying to do this with groupby, but I cannot work out how to extend it to check for the presence of Type 1 within each group. Please help.
Code to generate dataframe
import pandas as pd
data = [['A', 1], ['A', 2], ['A', 2],['A',3],['B',1],['B',2],['C',2],['C',3],['D',3]]
df = pd.DataFrame(data, columns=['System', 'Type'])
CodePudding user response:
df[df['System'].isin(df[df['Type'] == 1]['System'])]
System Type
0 A 1
1 A 2
2 A 2
3 A 3
4 B 1
5 B 2
First, you filter the df to only the rows with Type == 1 and select the System
column. Then you filter the df to include only the rows whose System
is in that selection.
It might be faster to put the values into a set and search in that, though it is worth timing on your own data:
df[df['System'].isin(set(df[df['Type'] == 1]['System'].values))]
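Since the question asks about groupby, here is a minimal sketch of the same selection done per group. It reuses the df from the question; the names has_type_1, result_transform and result_filter are made up for illustration. It is one possible approach, not necessarily the fastest.

```python
import pandas as pd

data = [['A', 1], ['A', 2], ['A', 2], ['A', 3], ['B', 1], ['B', 2],
        ['C', 2], ['C', 3], ['D', 3]]
df = pd.DataFrame(data, columns=['System', 'Type'])

# Flag rows where Type == 1, then broadcast "does this System have any
# flagged row?" back onto every row of that System.
has_type_1 = df['Type'].eq(1).groupby(df['System']).transform('any')
result_transform = df[has_type_1]

# Alternative: keep whole groups that pass a per-group test.
# The lambda runs once per group, so this can be slow with many groups.
result_filter = df.groupby('System').filter(lambda g: g['Type'].eq(1).any())

print(result_transform)
```

Both variables should hold the rows for systems A and B with their original index labels.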
CodePudding user response:
Let's say the table variable is df. Then the code will be:
systems = list(df[df['Type'] == 1]['System'].values)  # named systems so it cannot clash with the standard-library sys module
ans = df[df['System'].isin(systems)]
ans is the table you want. I am sure there are better ways, but hopefully this works.
CodePudding user response:
You can use the where method.
It may be faster than conditional selection, but time both on your own data before relying on that:
selectedvariables=df.where(df['Type']==1)["System"].dropna()
print(df[df["System"].isin(selectedvariables)])
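If speed matters, a rough way to compare the where version with plain boolean-mask selection is to time both on a larger frame. The sketch below is only that: the synthetic frame big, its size, and the helper names with_mask and with_where are assumptions made up for the comparison, and the timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 rows spread over about 5,000 systems.
rng = np.random.default_rng(0)
n = 100_000
big = pd.DataFrame({
    'System': rng.integers(0, 5_000, size=n).astype(str),
    'Type': rng.integers(1, 50, size=n),
})


def with_mask(df):
    # Boolean-mask selection, as in the first answer.
    return df[df['System'].isin(df.loc[df['Type'] == 1, 'System'])]


def with_where(df):
    # where() blanks non-matching rows to NaN; dropna() then removes them.
    selected = df.where(df['Type'] == 1)['System'].dropna()
    return df[df['System'].isin(selected)]


# Both approaches should select exactly the same rows.
assert with_mask(big).equals(with_where(big))

for fn in (with_mask, with_where):
    seconds = timeit.timeit(lambda: fn(big), number=10)
    print(f"{fn.__name__}: {seconds / 10 * 1000:.2f} ms per call")
```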