How to use .loc over a column with lists?

Time:12-05

This is my df:

    feature_name    combo                           p_val   *
0   VC9             [rest_closed, immediate_recall] 0.0053  **
1   VC9             [rest_music, immediate_recall]  0.0345  *
2   VC9             [rest_wonder, rest_closed]      0.0010  ***
3   VC9             [rest_wonder, rest_music]       0.0043  **
4   VC9             [rest_wonder, rest_open]        0.0075  **
5   Theta           [rest_closed, immediate_recall] 0.0098  **
6   Theta           [rest_wonder, rest_closed]      0.0038  **
7   Theta           [statements, rest_closed]       0.0187  *
8   Gamma           [rest_closed, clock]            0.0230  *
9   Gamma           [rest_closed, d1]               0.0111  *
10  Gamma           [rest_closed, immediate_recall] 0.0155  *
11  Gamma           [rest_closed, nb1]              0.0396  *
12  Gamma           [rest_wonder, rest_closed]      0.0065  **
13  Gamma           [statements, rest_closed]       0.0175  *

I am trying to look up the p_val by feature_name and combo. That is, I want to pass in, for example, 'VC9' and [rest_closed, immediate_recall] and get the matching p_val. Everything I tried failed. This is what I have right now:

for feature in features:
    
    for comb in combinations:
        pval = df_sigs.loc[(df_sigs['feature_name'].isin([feature])) & (df_sigs['combo'].isin([comb])), df_sigs['p_val']]

And this is the error I get:

TypeError: unhashable type: 'list'

(I want to print the p_val over a plot so I don't need it for more than the one loop)

When I tried other things, I also got this error many times:

ValueError: Lengths must match to compare

(for example when I used np.where)

I have seriously tried everything I can think of: creating another column of strings that combines the two elements of each list, unlisting, turning the lists into tuples. I feel like I am missing something very basic.

CodePudding user response:

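Instead of using .loc, you can use plain boolean indexing (pandas' built-in __getitem__). Note that comparing the column directly with a list (df_sigs['combo'] == comb) makes pandas compare element by element and raises "ValueError: Lengths must match to compare", so compare each cell to the list with apply instead:

for feature in features:
    for comb in combinations:
        # Compare each cell's list to comb as a whole, one row at a time.
        combo_mask = df_sigs['combo'].apply(lambda c: list(c) == list(comb))
        tmp = df_sigs[(df_sigs['feature_name'] == feature) & combo_mask]
        pval = tmp['p_val']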
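
If the lookup runs many times inside the nested loop, one alternative (not part of the answer above, only a sketch) is to build a dictionary once, keyed by (feature_name, tuple(combo)). Lists are unhashable, which is what the TypeError in the question is about, so each combo is converted to a tuple for the key. The four-row sample data and the names pval_lookup and found are assumptions made for illustration.

```python
import pandas as pd

# Small sample with the same columns as the question's df_sigs (assumed data).
df_sigs = pd.DataFrame({
    "feature_name": ["VC9", "VC9", "Theta", "Gamma"],
    "combo": [
        ["rest_closed", "immediate_recall"],
        ["rest_wonder", "rest_closed"],
        ["statements", "rest_closed"],
        ["rest_closed", "clock"],
    ],
    "p_val": [0.0053, 0.0010, 0.0187, 0.0230],
})

# Build the lookup once; tuple(combo) makes the key hashable.
pval_lookup = {
    (feature, tuple(combo)): p
    for feature, combo, p in zip(
        df_sigs["feature_name"], df_sigs["combo"], df_sigs["p_val"]
    )
}

features = ["VC9", "Theta"]
combinations = [
    ["rest_closed", "immediate_recall"],
    ["statements", "rest_closed"],
]

found = {}
for feature in features:
    for comb in combinations:
        # .get returns None when the (feature, combo) pair is not in the table.
        pval = pval_lookup.get((feature, tuple(comb)))
        if pval is not None:
            found[(feature, tuple(comb))] = pval

print(found)
```

Pairs that do not appear in the table are simply skipped, so the plotting code only sees combinations that have a p_val.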

CodePudding user response:

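Here is an example of how you can do this with loc. The combo cells are lists, so match them one cell at a time with apply rather than with ==, then select the 'p_val' column by name.

# Get the p_val for a given feature_name and combo
feature_name = 'VC9'
combo = ['rest_closed', 'immediate_recall']
# Compare each cell's list to combo as a whole.
combo_mask = df['combo'].apply(lambda c: list(c) == combo)
p_val = df.loc[(df['feature_name'] == feature_name) & combo_mask, 'p_val'].iloc[0]
print(p_val)  # Output: 0.0053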
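
For a self-contained check of the .loc approach, here is a minimal sketch on a four-row sample of the question's data. Comparing the column directly to a Python list makes pandas compare element by element and raises "ValueError: Lengths must match to compare" when the lengths differ, so the sketch compares each cell with apply. The helper name get_p_val and the sample frame are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Assumed sample: a few rows copied from the question's table.
df = pd.DataFrame({
    "feature_name": ["VC9", "VC9", "Theta", "Gamma"],
    "combo": [
        ["rest_closed", "immediate_recall"],
        ["rest_wonder", "rest_closed"],
        ["statements", "rest_closed"],
        ["rest_closed", "clock"],
    ],
    "p_val": [0.0053, 0.0010, 0.0187, 0.0230],
})


def get_p_val(frame, feature_name, combo):
    """Return the p_val for one (feature_name, combo) pair, or None if absent."""
    # Row-wise comparison: each cell's list is compared to combo as a whole.
    combo_mask = frame["combo"].apply(lambda c: list(c) == list(combo))
    matches = frame.loc[(frame["feature_name"] == feature_name) & combo_mask, "p_val"]
    # Guard against an empty selection, where .iloc[0] would raise IndexError.
    return matches.iloc[0] if not matches.empty else None


hit = get_p_val(df, "VC9", ["rest_closed", "immediate_recall"])
miss = get_p_val(df, "Theta", ["rest_closed", "clock"])
print(hit, miss)
```

Returning None for a missing pair lets the plotting loop skip combinations that have no row in the table.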