I have a dataframe that is read in from an Excel sheet. The data has multiple columns and rows, and each row carries an experiment identifier. I want to plot the data through a PyQt interface based on a user selection (checkboxes), but I cannot select the row for a single experiment for plotting.
The data looks like this:
| Experiment | Data 1 | Data 2 |
| -------- | ------ | -------- |
| Exp1 | 0 | 1 |
| Exp2 | 0 | 2 |
| Exp3 | 1 | 2 |
| Exp1 | 1 | 2 |
| Exp3 | 2 | 2 |
After
df.groupby('Experiment').agg(list)
I get this:
| Experiment | Data 1 | Data 2 |
| ---------- | ------ | -------- |
| Exp1 | [0, 1] | [1, 2] |
| Exp2 | [0] | [2] |
| Exp3 | [1, 2] | [2, 2] |
I can use this for plotting e.g. with pyqtgraph. However, after the user makes a selection, only that specific experiment is supposed to be plotted (e.g. Exp3).
I tried filtering on the aggregated list with
.filter(lambda x: x['Experiment'] == 'Exp3')
but it fails with "TypeError: 'function' object is not iterable", and I have a feeling this is the wrong approach.
Is there a way for me to get for example the index of the Experiment name (e.g. Exp3) so that I can access it this way or can someone explain how I could filter or access one of the rows based on the string/experiment key?
CodePudding user response:
df.groupby('Experiment').agg(list).query('index == "Exp3"')
output:
            Data 1  Data 2
Experiment
Exp3        [1, 2]  [2, 2]
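As far as I can tell, the original error appears because the result of `.agg(list)` is an ordinary DataFrame, and `DataFrame.filter` expects a list of labels rather than a function. Since `Experiment` becomes the index after the groupby, a label lookup with `.loc` may be all you need. Below is a minimal sketch that rebuilds the sample data from the question; the `selected` list is a hypothetical stand-in for whatever your checkboxes return.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Experiment": ["Exp1", "Exp2", "Exp3", "Exp1", "Exp3"],
        "Data 1": [0, 0, 1, 1, 2],
        "Data 2": [1, 2, 2, 2, 2],
    }
)

grouped = df.groupby("Experiment").agg(list)

# "Experiment" is now the index, so one row can be fetched by its label.
# The result is a Series holding one list per data column.
row = grouped.loc["Exp3"]
x, y = row["Data 1"], row["Data 2"]

# Several experiments at once. "selected" is an assumed name for the
# list of experiment keys collected from the ticked checkboxes.
selected = ["Exp1", "Exp3"]
subset = grouped.loc[grouped.index.isin(selected)]

# Alternative: filter the raw rows first and skip the aggregation.
raw = df[df["Experiment"] == "Exp3"]
```

Here `x` and `y` are plain Python lists, so they can be passed straight to a plotting call. Using `isin` instead of `grouped.loc[selected]` avoids a `KeyError` if a selected key is missing from the data.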