Find specific value knowing row pandas

Time:05-13

I have a dataframe with this structure:

A  indexer  attr1_rank  attr2_rank  attr3_rank  attr4_rank  ...  attrn_rank
P  1        2           1           3           4           ...  n
S  2        1           2           4           3           ...  n

How can I add a column, return_value, that holds the name of the column whose rank equals the value in indexer? The indexer value should be compared against attr1_rank and attr2_rank, and the header of the matching column returned (without the _rank suffix):

return_value
attr2
attr2
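
For anyone who wants to try the answers below, the two sample rows can be rebuilt as a small DataFrame. This is only a sketch: the elided columns (attr5_rank through attrn_rank) are left out, and the name expected is a hypothetical helper that is not part of the original question.

```python
import pandas as pd

# The two sample rows from the question; the elided "..." columns are omitted.
df = pd.DataFrame({
    "A": ["P", "S"],
    "indexer": [1, 2],
    "attr1_rank": [2, 1],
    "attr2_rank": [1, 2],
    "attr3_rank": [3, 4],
    "attr4_rank": [4, 3],
})

# The wanted result, written out by hand for comparison (hypothetical name).
expected = pd.Series(["attr2", "attr2"], name="return_value")
```

In both rows the value in indexer equals the value in attr2_rank, which is why attr2 is expected twice.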

I have this code, but it raises an IndexError. The variable tmp holds the indexer value.

index_value = self._data.iloc[row, [2, 4]] == int(tmp)
col_name = self._data.columns[index_value]
col_name = col_name.removesuffix('_rank')
self._data.iloc[row, column + 3] = col_name # assuming that 'return_value' column is 3 positions to the right
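
A likely cause of the IndexError, as far as can be told from the snippet: index_value is a boolean mask with only two entries (one per selected column), but it is used to index self._data.columns, which has one entry per column of the whole frame. The sketch below builds the mask and the labels from the same slice so their lengths agree. The plain df, the slice 2:4 and the sample value of tmp are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["P", "S"],
    "indexer": [1, 2],
    "attr1_rank": [2, 1],
    "attr2_rank": [1, 2],
})

row = 0
tmp = "1"  # stands in for the edited value, as in the question

# Take the candidate cells first, then build the mask from that same slice,
# so the mask and the labels always have the same length.
candidates = df.iloc[row, 2:4]                 # attr1_rank and attr2_rank
mask = candidates == int(tmp)
matches = candidates.index[mask.to_numpy()]    # Index of matching column labels

# matches is an Index, not a string, so pick one label before trimming.
col_name = matches[0].removesuffix("_rank") if len(matches) else None
```

Note that str.removesuffix needs Python 3.9 or later, and that it has to be applied to a single label rather than to the Index object.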

Update after @JAV's solution: this is the code as I adapted it.

    def setData(self, index, value, role=Qt.EditRole):
        based_columns = [6, 8, 10, 12]
        if role == Qt.EditRole:
            row = index.row()
            column = index.column()
            tmp = str(value)
            if column in based_columns:
                if column == 6 and tmp in self._data.columns.values.tolist():
                    index_no = self._data.columns.get_loc(tmp)
                    self._data.iloc[row, column + 1] = self._data.iloc[row, index_no]
                    self._data.iloc[row, column] = tmp
                elif column in [8, 10, 12]:
                    self._data.iloc[row, column + 1] = self._data.apply(self.index_match(row), axis=1)
                    self._data.iloc[row, column] = tmp
                self.dataChanged.emit(index, index)

    def index_match(self, row):
        for col in row[97:].index:
            if row[col] == row['indexer']:
                return col[:-5]

This raises the following error:

Traceback (most recent call last):
  File "helper_classes.py", line 171, in setData
    self._data.iloc[row, column + 1] = self._data.apply(self.index_match(row), axis=1)
  File "helper_classes.py", line 176, in index_match
    for col in row[97:].index:
TypeError: 'int' object is not subscriptable
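
The traceback points at the call site: self.index_match(row) runs the function immediately with the integer row, so row[97:] is attempted on an int. DataFrame.apply expects the function object itself, and a single-cell update needs the row as a Series, for example via iloc. Below is a minimal sketch with a plain DataFrame standing in for self._data; the sample data and the names row_number, single and whole are assumptions for illustration.

```python
import pandas as pd

def index_match(row):
    """Return the first *_rank column name (suffix removed) equal to row['indexer']."""
    for col in row.index:
        if col.endswith("_rank") and row[col] == row["indexer"]:
            return col.removesuffix("_rank")  # Python 3.9+
    return None  # nothing matched

df = pd.DataFrame({
    "A": ["P", "S"],
    "indexer": [1, 2],
    "attr1_rank": [2, 1],
    "attr2_rank": [1, 2],
})

row_number = 1

# One cell: pass the row itself (a Series), not its integer position.
single = index_match(df.iloc[row_number])

# Whole column: pass the function object and let apply() call it per row.
whole = df.apply(index_match, axis=1)
```

Inside setData only one cell changes, so the single-row form is probably the closer fit; assigning the result of apply to one cell would try to store a whole Series there.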

CodePudding user response:

You can use this if you have an undefined number of columns to loop through:

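def index_match(row):
    # Compare only the *_rank columns; a positional slice such as row[1:]
    # can include 'indexer' itself, which always matches.
    for col in row.index:
        if col.endswith('_rank') and row[col] == row['indexer']:
            return col[:-5] # trimming off _rank
    return None # no rank column matches

df['return_value'] = df.apply(index_match, axis=1)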

CodePudding user response:

Here's a one-liner that does it for you:

df['return_value'] = df.apply(lambda x: 'attr2' if x['indexer'] == x['attr2_rank'] else ('attr1' if x['indexer'] == x['attr1_rank'] else None), axis=1)

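Solution if the number of columns is large:

def get_return_val(x):
    # Labels of the cells equal to this row's indexer, in column order,
    # leaving out 'indexer' itself.
    vals = [col for col in x.loc[x.indexer == x].index if col != 'indexer']
    if vals:
        # rstrip('_rank') strips any of those characters, not the suffix
        return vals[0].removesuffix('_rank')  # Python 3.9+
    return None

df['return_value'] = df.apply(get_return_val, axis=1)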
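
With many rows, a column-wise comparison can avoid the per-row Python function altogether. This is a different technique from the apply-based answers above, sketched under the assumption that every candidate column ends in _rank; the third row (A = "T") is invented here to show what happens when nothing matches.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["P", "S", "T"],
    "indexer": [1, 2, 9],          # 9 matches no rank column (made-up row)
    "attr1_rank": [2, 1, 1],
    "attr2_rank": [1, 2, 2],
    "attr3_rank": [3, 4, 3],
    "attr4_rank": [4, 3, 4],
})

ranks = df.filter(regex=r"_rank$")            # only the *_rank columns
hits = ranks.eq(df["indexer"], axis=0)        # True where a rank equals indexer

# idxmax returns the first True column per row; rows with no True at all
# would wrongly get the first column, so they are masked out afterwards.
first = hits.idxmax(axis=1).str.replace(r"_rank$", "", regex=True)
df["return_value"] = first.where(hits.any(axis=1))
```

Rows without a match end up as NaN rather than None, which may or may not matter for how the column is displayed later.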

CodePudding user response:

df.apply(lambda x: list((dictionary:=x[[index for index in x.index if '_rank' in index]].to_dict()).keys())[list(dictionary.values()).index(x['indexer'])].replace("_rank",""), axis=1)

or

df.apply(lambda x: list((dictionary:=x[[index for index in x.index if '_rank' in index]].to_dict()).keys())[list(dictionary.values()).index(x['indexer'])][:-5], axis=1)

or

df.apply(lambda x: list((dictionary:=x.drop(['A', 'indexer']).to_dict()).keys())[list(dictionary.values()).index(x['indexer'])][:-5], axis=1)

CodePudding user response:

It turned out to be easier than I thought for my use case:

    index_value = None
    for x in range(initial_column, end_column):
        if self._data.iloc[row, x] == int(tmp):
            index_value = x
            break

    if index_value is not None:  # leave the cell untouched when nothing matches
        col_name = self._data.columns[index_value]
        col_name = col_name.removesuffix('_rank')
        self._data.iloc[row, column + 1] = col_name