Pandas: filter on grouped and aggregated dataframe

Time:11-12

I have a pandas DataFrame that is based on a read-in Excel list. The data has multiple columns and rows, with one identifier column (the experiment name). I want to plot the data through a PyQt interface based on a user selection (checkboxes), but I cannot select the row for a single experiment for plotting.

The data looks like this:

| Experiment | Data 1 | Data 2   |
| --------   | ------ | -------- |
| Exp1       |    0   |    1     |
| Exp2       |    0   |    2     |
| Exp3       |    1   |    2     |
| Exp1       |    1   |    2     |
| Exp3       |    2   |    2     |

After

df.groupby('Experiment').agg(list) 

I get this:

| Experiment | Data 1 | Data 2   |
| ---------- | ------ | -------- |
| Exp1       | [0, 1] | [1, 2]   |
| Exp2       | [0]    | [2]      |
| Exp3       | [1, 2] | [2, 2]   |
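For reference, the two tables above can be reproduced with a minimal sketch like the one below. The DataFrame is built by hand here instead of being read from Excel, and the variable name `aggregated` is only illustrative; both are assumptions made to keep the example self-contained.

```python
import pandas as pd

# Sample data copied from the first table (built by hand, not read from Excel)
df = pd.DataFrame({
    "Experiment": ["Exp1", "Exp2", "Exp3", "Exp1", "Exp3"],
    "Data 1": [0, 0, 1, 1, 2],
    "Data 2": [1, 2, 2, 2, 2],
})

# Collect each column's values into one list per experiment.
# "Experiment" becomes the index of the aggregated frame.
aggregated = df.groupby("Experiment").agg(list)
print(aggregated)
```

Note that after the aggregation, `Experiment` is no longer a regular column but the index, which matters for how a row can be selected afterwards.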

I can use this for plotting. However, after the user makes a selection, only the selected experiment (e.g. Exp3) is supposed to be plotted.

I tried filtering on the aggregated list with

.filter(lambda x: x['Culture ID']=='Exp3') 

but it says that 'function' object is not iterable and I have a feeling this is the wrong approach.

Is there a way to get, for example, the index of the experiment name (e.g. Exp3) so that I can access the row that way? Or can someone explain how I could filter or access one of the rows based on the experiment key (a string)?

CodePudding user response:

df.groupby('Experiment').agg(list).query('index == "Exp3"')

output:

            Data 1  Data 2
Experiment
Exp3        [1, 2]  [2, 2]
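Since `Experiment` is the index after the groupby, label-based selection with `.loc` is another option. The sketch below is not from the original answer: the sample data is rebuilt by hand, and `selected` is a hypothetical stand-in for whatever the checkbox selection returns. It also hints at why the attempted `.filter(lambda ...)` likely failed: `DataFrame.filter` selects labels by name (`items`, `like`, `regex`) and does not take a condition function.

```python
import pandas as pd

# Sample data from the question, built by hand for a self-contained example
df = pd.DataFrame({
    "Experiment": ["Exp1", "Exp2", "Exp3", "Exp1", "Exp3"],
    "Data 1": [0, 0, 1, 1, 2],
    "Data 2": [1, 2, 2, 2, 2],
})
aggregated = df.groupby("Experiment").agg(list)

# Hypothetical: in the real application this would come from the checkboxes
selected = "Exp3"

# A scalar label returns a Series holding one list per data column
row = aggregated.loc[selected]
data_1 = row["Data 1"]
data_2 = row["Data 2"]

# A list of labels keeps the one-row DataFrame shape, like query() does
subset = aggregated.loc[[selected]]

# DataFrame.filter works on labels, so it needs names rather than a lambda
via_filter = aggregated.filter(items=[selected], axis=0)

# Alternatively, restrict the raw rows first and aggregate afterwards
filtered_first = df[df["Experiment"] == selected].groupby("Experiment").agg(list)
```

With several checkboxes ticked, the same pattern should work by passing the list of selected names to `.loc`, e.g. `aggregated.loc[["Exp1", "Exp3"]]`.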