Although it seems very detrimental, I am having a hard time getting the index of a dataframe for a certain string or value at a random position in the dataframe.
I made an example dataframe:
fruits = {
'column1':["Apples","Pears","Bananas","Oranges","Strawberries"],
'column2':[1,2,3,4,5],
'column3':["Kiwis","Mangos","Pineapples","Grapes","Melons"]
}
df = pd.DataFrame(fruits)
column1 column2 column3
0 Apples 1 Kiwis
1 Pears 2 Mangos
2 Bananas 3 Pineapples
3 Oranges 4 Grapes
4 Strawberries 5 Melons
Now I want to get the position index of Mangos, without having the knowledge in which column or row it exists. So far I succeeded in getting the row index:
print(df.loc[df.isin(["Mangos"]).any(axis=1)].index)
Which results in:
Int64Index([1], dtype='int64')
But now I would also like to retrieve the column index or column name.
This thread is a very simplified version of Get column name where value is something in pandas dataframe, but I could not figure out the code using the other thread.
CodePudding user response:
You can simply do:
df.columns[df.isin(['Mangos']).any()])
Index(['column3'], dtype='object')
Or to just get the column name:
df.columns[df.isin(['Mangos']).any()][0]
# column3
To get the index of the column, try:
df.columns.get_indexer(df.columns[df.isin(['Mangos']).any()])
# [2]
CodePudding user response:
Stack
the dataframe to reshape into multiindex series, then use boolean indexing to get the index
s = df.stack()
s[s == 'Mangos'].index
MultiIndex([(1, 'column3')])
CodePudding user response:
You can try a simple search yourself.
coordinates = []
indices = df.index
columns = df.columns
for index in indices: # df is the DataFrame
for col in columns:
if df[col][index] == 'Mangos':
coordinates.append((index, col))
coordinates
Output:
[(1, 'column3')]
CodePudding user response:
You can use np.where
as well.
Code:
import numpy as np
[(df.index[i], df.columns[c]) for i, c in zip(*np.where(df.isin(['Mangos'])))]
Output:
[(1, 'column3')]