I have a Panda's dataframe like so:
colLabelA colLabelB ... colLabelZ
rowLabelA 10 0 0
specialRow 0 10 0
rowLabelB 20 0 10
...
rowLabelZ 0 0 20
Essentially I only know the row called specialRow
. What I need is to find a way to iterate through the entire dataframe and check all the columns for 0 (zero).
If a column has all zeroes except specialRow
, then that column by row cell needs to be made into a zero as well. Otherwise move to the next column and check that one.
So in the above example, only colLabelB has all zeroes except the specialRow
so that needs to be updated like so:
colLabelA colLabelB ... colLabelZ
rowLabelA 10 0 0
specialRow 0 0 0
rowLabelB 20 0 10
...
rowLabelZ 0 0 20
Is there a quick and fast way to do this?
The dataframes aren't huge but I don't want it to be super slow either.
CodePudding user response:
For each column, exclude the particular index, then check if all other values for that column is zero, if yes, then just assign 0
to such columns:
for col in df:
if df[df.index!='specialRow'][col].eq(0).all():
df[col] = 0
OUTPUT:
colLabelA colLabelB colLabelZ
rowLabelA 10 0 0
specialRow 0 0 0
rowLabelB 20 0 10
rowLabelZ 0 0 20
In fact df.index!='specialRow'
remains the same for all the columns, so you can just assign it to a variable and use it for each of the columns.
CodePudding user response:
Use drop
to drop the named row, then check for 0
with eq(0).all()
. Then you can update with loc
:
df.loc['specialRow', df.drop('specialRow').eq(0).all()] = 0
This works with more than one special rows too:
specialRows = ['specialRow']
df.loc[specialRows, df.drop(specialRows).eq(0).all()] = 0
Output:
colLabelA colLabelB colLabelZ
rowLabelA 10 0 0
specialRow 0 0 0
rowLabelB 20 0 10
rowLabelZ 0 0 20