Replace conditional values of a dataframe in multiple columns

Time:07-06

I have a dataframe with many columns, for example [5745 rows x 112 columns]. I would like to replace certain values in some of the columns. There are many questions about very similar problems, but I did not find a solution that worked for me.

Background: I plot my dataframe in Bokeh using pcolormesh. Cells with a "0" value are drawn in white, which makes the plot harder to interpret visually. Therefore I would like to replace these zeros with a very small value, let's say 1e-15. Pcolormesh then draws these fields using the first color of the map.

[Figure: pcolormesh plot with "0" in the dataset]
[Figure: pcolormesh plot with "0" replaced with a very small value like 1e-15]

The following is a very small example dataframe for testing and understanding purposes. With the real, huge dataframe I do not want to list all the column names, so I tried it with 'iloc':

import pandas as pd

df = pd.DataFrame({'a':[1, 0, 2, 3],
                   'b':[3, 1, 1, 1],
                   'c':[1, 2, 1, 0],
                   'd':[2, 1, 0, 0],
                   'e':[1, 0, 0, 0],
                   'f':[1, 1, 0, 1],
                   'g':[1, 1, 0, 0],
                   'h':[0, 0, 0, 0]})

df.iloc[:,-4:-1][df.iloc[:,-4:-1]< 1e-15] = 1e-15
df

causing a warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

but as I understand it, 'loc' will not work unless I write out the specific column names (which I don't want to do, as there are too many in the real dataframe). I also assume the warning appears because 'iloc' is not able to modify the values of a dataframe in place.

That is why I tried to actually replace (that is, modify) the values of my dataframe with 'replace', which according to the pandas documentation should work for dataframes:

df[:,-4:-1] = df[:,-4:-1].replace(< 1e-15, =1e-15, inplace=True)

which causes a syntax error:

    df[:,-4:-1] = df[:,-4:-1].replace(< 1e-15, =1e-15, inplace=True)
                                      ^
SyntaxError: invalid syntax

or

df.replace({-4:-1}(to_replace[:,-4:-1]< 1e-15), 1e-15)

which leads to a name error:

    df.replace({-4:-1}(to_replace[:,-4:-1]< 1e-15), 1e-15)

NameError: name 'to_replace' is not defined

I am sure there is just a misspelling, but I cannot find it. Do you see it?

Thanks!

CodePudding user response:

Use:

df.iloc[:,-4:-1] = df.iloc[:,-4:-1].clip(lower=1e-15)

Or:

df.iloc[:,-4:-1] = df.iloc[:,-4:-1].mask(df.iloc[:,-4:-1]< 1e-15, 1e-15)
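
For reference, here is a minimal runnable sketch of both variants on the example dataframe from the question. It makes two assumptions that are not in the answer above: the columns are picked by label via `df.columns[-4:-1]` instead of through `iloc` on the left-hand side, and they are cast to float first, because the example columns hold integers and cannot store 1e-15 otherwise. The names `cols`, `clipped`, `masked` and `block` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 2, 3],
                   'b': [3, 1, 1, 1],
                   'c': [1, 2, 1, 0],
                   'd': [2, 1, 0, 0],
                   'e': [1, 0, 0, 0],
                   'f': [1, 1, 0, 1],
                   'g': [1, 1, 0, 0],
                   'h': [0, 0, 0, 0]})

# Positional slice turned into labels, so no column name is typed by hand.
# Note: -4:-1 stops before the last column, so this selects 'e', 'f', 'g'
# and leaves 'h' untouched (use -4: to include it).
cols = df.columns[-4:-1]

# Variant 1: clip raises every value below 1e-15 up to 1e-15.
clipped = df.copy()
clipped[cols] = clipped[cols].astype(float).clip(lower=1e-15)

# Variant 2: mask replaces values where the condition is True.
masked = df.copy()
block = masked[cols].astype(float)
masked[cols] = block.mask(block < 1e-15, 1e-15)

print(clipped)
```

Assigning the whole result back to the selected columns, rather than indexing twice as in `df.iloc[...][...] = value`, is what avoids the "set on a copy of a slice" warning from the question.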