Home > Blockchain >  filter data and replace values in pandas
filter data and replace values in pandas

Time:05-17

I am new to pandas and I am trying to replace some values on specific columns of a dataframe. My dataframe looks like this:

    c1  c2  st  mt  ast sr  c7  c8
0   a   a   2   1   4   2   a   a
1   b   b   3   3   3   3   b   b
2   c   c   1   1   2   4   c   c
3   d   d   3   3   1   2   d   d
4   e   e   2   3   2   1   e   e
5   f   f   5   5   5   5   f   f

in row 1, from columns 2 to 5 (st, mt, ast, sr) I have the number 3 in all columns and row 5 I have the number 5 in all columns. When I subset using the code below and try to replace 3 by 0 and 5 by 0 in these columns, the code return all the values in the row replaced by 0, not only the values in those specific columns. I tried:

 df[(df.st == 5) & (df.mt == 5) & (df.ast == 5) & (df.sr == 5)]=0

but I got:

c1  c2  st  mt  ast sr  c7  c8
0   a   a   2   1   4   2   a   a
1   0   0   0   0   0   0   0   0
2   c   c   1   1   2   4   c   c
3   d   d   3   3   1   2   d   d
4   e   e   2   3   2   1   e   e
5   0   0   0   0   0   0   0   0

I also tried:

df[(df.st == 5) & (df.mt == 5) & (df.ast == 5) & (df.sr == 5)].replace(5,0)

My desired output is:

 c1 c2  st  mt  ast sr  c7  c8
0   a   a   2   1   4   2   a   a
1   b   b   0   0   0   0   b   b
2   c   c   1   1   2   4   c   c
3   d   d   3   3   1   2   d   d
4   e   e   2   3   2   1   e   e
5   f   f   0   0   0   0   f   f

How can I achieve this goal?

CodePudding user response:

Use DataFrame.loc with columnsnames in list:

df.loc[(df.st == 5) & (df.mt == 5) & (df.ast == 5) & (df.sr == 5), ['st','mt','ast','sr']]=0

Another idea is working with only selected lists with comapare by 5 by DataFrame.eq and test if all values Trues by DataFrame.all:

cols = ['st','mt','ast','sr']
df.loc[df[cols].eq(5).all(axis=1) | df[cols].eq(3).all(axis=1), cols]=0

print (df)
  c1 c2  st  mt  ast  sr c7 c8
0  a  a   2   1    4   2  a  a
1  b  b   0   0    0   0  b  b
2  c  c   1   1    2   4  c  c
3  d  d   3   3    1   2  d  d
4  e  e   2   3    2   1  e  e
5  f  f   0   0    0   0  f  f

CodePudding user response:

If you want to ensure that you have all the same values in ['st', 'mt', 'ast', 'sr'] and that this value is in a given list, you can use 2 masks:

m1 = df[cols[1:]].eq(df[cols[0]], axis=0).all(1)
m2 = df[cols[0]].isin([3,5])
df.loc[m1 & m2, cols] = 0

output:

  c1 c2  st  mt  ast  sr c7 c8
0  a  a   2   1    4   2  a  a
1  b  b   0   0    0   0  b  b
2  c  c   1   1    2   4  c  c
3  d  d   3   3    1   2  d  d
4  e  e   2   3    2   1  e  e
5  f  f   0   0    0   0  f  f
  • Related