I'm trying to loop through a report and blank out (or hide/replace) cell values that repeat the value in the row above. This applies only to certain columns, not the entire row, since each row contains at least one piece of data that is unique to it. I know I'm close, but I'm missing the mark and looking for a nudge in the right direction. The goal is to eliminate redundant information and make the final report more legible. Essentially, what I am trying to do is:
for cell in row:
    if column["column_name"] == (line above):
        cell.value = " "
Because each row has a unique piece of data, drop_duplicates() does not work. Once I can clear the intended column in each row where applicable, I will expand the process to loop through the other columns and clear them wherever the first column was blanked out. I should be able to work that out once the first domino falls. Any advice is appreciated.
I've tried
np.where(cell) = [iloc-1]
and
masking based on the same parameter.
I get errors that 'row' and 'iloc' are undefined, or "None of [Index (all content)] are in the [index]".
CodePudding user response:
You can use shift() to compare each row with the row above it. If I understand your issue correctly, the example below shows one approach (it replaces a value that repeats the previous row's value with 0):
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 2, 4, 5],
    'B': ['a', 'b', 'c', 'd', 'e']
})
# shift() moves column A down one row, so each value is compared with the one directly above it
df['A'] = df['A'].where(df['A'] != df['A'].shift(), 0)
print(df)
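The follow-up step described in the question (blanking a second column only where the first one was blanked) could be sketched along the same lines. This is a minimal sketch, not a definitive implementation: the column names `region`, `manager`, and `order_id` and the sample data are made up for illustration, and it assumes an empty string is an acceptable "blank" in the final report.

```python
import pandas as pd

# Hypothetical report data; the column names are illustrative only.
df = pd.DataFrame({
    "region": ["East", "East", "East", "West", "West"],
    "manager": ["Ann", "Ann", "Bob", "Bob", "Bob"],
    "order_id": [101, 102, 103, 104, 105],  # unique per row, never blanked
})

# True where 'region' equals the value in the row directly above.
same_region = df["region"].eq(df["region"].shift())

# Treat 'manager' as a repeat only when 'region' repeated as well.
same_manager = same_region & df["manager"].eq(df["manager"].shift())

# Build both masks from the original data first, then blank a copy,
# so that clearing one column cannot affect the comparison for the next.
report = df.copy()
report["region"] = report["region"].mask(same_region, "")
report["manager"] = report["manager"].mask(same_manager, "")
print(report)
```

Note that writing an empty string into a numeric column turns it into an object column, so it may be best to do this only as the last step before exporting the report.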