Remove a group of rows based on a condition on its rows

I have a dataframe with two columns, 'Group' and 'Sample_Number'. Within each group, sample number 11 is unique: each group has exactly one sample number 11, followed by sample numbers in the range 21 to 29 (for example, 21, 22, 23, 24, 25, 26, 27, 28, 29) and then by sample numbers in the range 31 to 39 (for example, 31, 32, 33, 34, 35, 36, 37, 38, 39). Hence each group should have exactly one sample number 11, at least one sample number in the range 21 to 29, and at least one sample number in the range 31 to 39.

I want my code to go through each group and:

Check whether the group contains sample number 11.
Check whether the group has at least one sample number in the range 21 to 29.
Check whether the group has at least one sample number in the range 31 to 39.

If any of these three conditions is not met, the code should remove the entire group from the dataframe.
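To make the three checks concrete, here is a minimal sketch in plain Python that tests the sample numbers of a single group. The function name `group_is_valid` is hypothetical, and the sketch assumes that "has sample number 11" means exactly one occurrence, as described above.

```python
def group_is_valid(sample_numbers):
    """Return True if one group's sample numbers pass all three checks."""
    numbers = list(sample_numbers)
    # Assumption: the group must contain sample number 11 exactly once.
    has_single_11 = numbers.count(11) == 1
    # At least one sample number in the inclusive range 21..29.
    has_20s = any(21 <= n <= 29 for n in numbers)
    # At least one sample number in the inclusive range 31..39.
    has_30s = any(31 <= n <= 39 for n in numbers)
    return has_single_11 and has_20s and has_30s


# Sample numbers taken from groups Z007 and Z008 in the table below.
print(group_is_valid([11, 21, 22, 23, 31, 32]))
print(group_is_valid([11, 31, 32, 33]))
```

A group for which this returns False is one that should be dropped as a whole.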

Below is the dataframe in table format:

Group	Sample_Number
Z007	11
Z007	21
Z007	22
Z007	23
Z007	31
Z007	32
Z008	11
Z008	31
Z008	32
Z008	33
Z009	11
Z009	21
Z009	22
Z009	23
Z010	21
Z010	22
Z010	23
Z010	24
Z010	31
Z010	32
Z010	33
Z010	34

import pandas as pd

# Group labels are strings, so they must be quoted.
df = pd.DataFrame([
    ['Z007', 11], ['Z007', 21], ['Z007', 22], ['Z007', 23], ['Z007', 31], ['Z007', 32],
    ['Z008', 11], ['Z008', 31], ['Z008', 32], ['Z008', 33],
    ['Z009', 11], ['Z009', 21], ['Z009', 22], ['Z009', 23],
    ['Z010', 21], ['Z010', 22], ['Z010', 23], ['Z010', 24],
    ['Z010', 31], ['Z010', 32], ['Z010', 33], ['Z010', 34],
], columns=['Group', 'Sample_Number'])

The code should remove group 'Z008' because it has no sample number in the range 21 to 29. It should remove group 'Z009' because it has no sample number in the range 31 to 39. It should also remove group 'Z010' because it does not have sample number 11.

Expected answer is below:

Group	Sample_Number
Z007	11
Z007	21
Z007	22
Z007	23
Z007	31
Z007	32

I could do this only for sample number 11, and I am struggling to do the same for the sample numbers in the ranges 21 to 29 and 31 to 39. Below is my code for sample number 11:

invalid_group_no = [i for i in df['Group'].unique() if
               df[df['Group']== i]["Sample_Number"].to_list().count(11)!=1]

Can anyone please help me with the other sample numbers? Please feel free to implement your own ways. Any help is appreciated.

CodePudding user response:

Try this:

groups = set(df['Group'][df['Sample_Number'] == 11]) & set(df['Group'][df['Sample_Number'].isin(range(21,30))]) & set(df['Group'][df['Sample_Number'].isin(range(31,40))])
df = df[df['Group'].isin(groups)]


   Group    Sample_Number
0   Z007               11
1   Z007               21
2   Z007               22
3   Z007               23
4   Z007               31
5   Z007               32
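
As an alternative to the set intersection above, the same filtering can be sketched with `groupby(...).filter(...)`. This is not the answer's method, only a hedged variant. The helper name `has_required_samples` is hypothetical, and unlike the answer's code it assumes sample number 11 must appear exactly once per group, as the question describes.

```python
import pandas as pd

# Same data as in the question, rebuilt here so the sketch is self-contained.
df = pd.DataFrame({
    "Group": ["Z007"] * 6 + ["Z008"] * 4 + ["Z009"] * 4 + ["Z010"] * 8,
    "Sample_Number": [11, 21, 22, 23, 31, 32,
                      11, 31, 32, 33,
                      11, 21, 22, 23,
                      21, 22, 23, 24, 31, 32, 33, 34],
})


def has_required_samples(samples):
    """Hypothetical helper: True if a group's samples pass all three checks."""
    return bool(
        (samples == 11).sum() == 1           # assumption: exactly one 11
        and samples.between(21, 29).any()    # at least one in 21..29 (inclusive)
        and samples.between(31, 39).any()    # at least one in 31..39 (inclusive)
    )


# Keep only the groups for which the helper returns True.
result = df.groupby("Group").filter(
    lambda g: has_required_samples(g["Sample_Number"])
)
print(result)
```

On the sample data this keeps only the rows of group Z007. The set-based version is likely faster on large frames, since `filter` calls a Python function once per group, but the `filter` form keeps all three conditions in one readable place.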