How to select all available indices in a dataframe column in Python

Time:05-18

I have a dataframe df of multiple variables, with a number of irregularly indexed entries.

Let's say like this:

      X   Y
0  60.0   0
1  63.0  10
4  80.0  10
5  46.0   1
9  73.0  10

[5 rows x 2 columns]

Now, say that I have an externally provided list of index labels, indices, like:

indices = [0, 2, 4, 6, 9, 11]

How can I select all entries of df['X'] whose index label appears in indices, skipping the labels that df does not contain? My desired output is

      X
0  60.0
4  80.0
9  73.0

If I merely try to call df['X'][indices], Python rightfully complains:

KeyError: '[2, 6, 11] not in index'

Is there a simple way to do this?

CodePudding user response:

Use Index.isin():

print(df.loc[df.index.isin(indices), 'X'])
0    60.0
4    80.0
9    73.0
Name: X, dtype: float64
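
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt from the values shown in the question; the names mask and selected are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Rebuild the example frame from the question, keeping its irregular index.
df = pd.DataFrame(
    {"X": [60.0, 63.0, 80.0, 46.0, 73.0], "Y": [0, 10, 10, 1, 10]},
    index=[0, 1, 4, 5, 9],
)
indices = [0, 2, 4, 6, 9, 11]

# Boolean array, one entry per row: True where the row label is in `indices`.
mask = df.index.isin(indices)

# Labels missing from df (2, 6, 11) simply never match, so no KeyError is raised.
selected = df.loc[mask, "X"]
print(selected)
```

Because the mask is built from df.index rather than from indices, the result keeps the row order of df, not the order of the list.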

CodePudding user response:

You could try copying the df's index into a new column and filtering on that column with .isin():

# convert original indexes to new column
df["original_index"] = df.index
    
# filter based on list values
indices = [0, 2, 4, 6, 9, 11]
df[df["original_index"].isin(indices)]

Output:

      X   Y  original_index
0  60.0   0               0
4  80.0  10               4
9  73.0  10               9

Alternatively, you could use .reset_index(), which moves the original index into a new column named index and replaces it with a default 0, 1, 2, ... index:

# convert original indexes to new column and re-index
df.reset_index(inplace=True)


# filter based on list values
indices = [0, 2, 4, 6, 9, 11]
df[df["index"].isin(indices)]

Output:

   index     X   Y
0      0  60.0   0
2      4  80.0  10
4      9  73.0  10
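
Note that both variants in this answer modify df itself (one adds a column, the other resets the index in place). If that is not wanted, one possible alternative, not taken from the answers above, is to intersect the index with the list first and then select by label. The sketch below rebuilds the question's frame; common and selected are illustrative names.

```python
import pandas as pd

# Rebuild the example frame from the question, keeping its irregular index.
df = pd.DataFrame(
    {"X": [60.0, 63.0, 80.0, 46.0, 73.0], "Y": [0, 10, 10, 1, 10]},
    index=[0, 1, 4, 5, 9],
)
indices = [0, 2, 4, 6, 9, 11]

# Labels that exist both in df's index and in the requested list.
common = df.index.intersection(indices)

# Every label in `common` is guaranteed to exist, so .loc cannot raise KeyError.
selected = df.loc[common, "X"]
print(selected)
```

This leaves df unchanged: no helper column is added and the original index labels are preserved.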