Pandas lambda function works only on single column not multiple

Time:12-10

I'm trying to apply a simple function (removing spaces) across multiple columns of a pandas DataFrame. The .apply() method works as expected on a single column, but it has no effect when I call it on multiple columns. Example:

# Weird pandas behavior
import copy
import pandas as pd

# Input
df = pd.DataFrame({'a': ["7  7", "5 3"],
                   'b': ['f o', 'b  r'],
                   'c': ["77", "53"]})
print(df)

      a     b   c
0  7  7   f o  77
1   5 3  b  r  53

df[["a","b"]]=df[["a","b"]].apply(lambda x: x.replace(" ",""))
print(df)

      a     b   c
0  7  7   f o  77
1   5 3  b  r  53


df2=copy.deepcopy(df)
print(df2)

      a     b   c
0  7  7   f o  77
1   5 3  b  r  53

df2["a"]=df2["a"].apply(lambda x: x.replace(" ",""))
print(df2)

    a     b   c
0  77   f o  77
1  53  b  r  53

As you can see, df doesn't change at all when I try to apply the "replace" operation to two columns, but the same dataset (or rather a copy of it) does change when I run the same operation on a single column. How can I remove spaces from two or more columns at once using the .apply() syntax?

I tried passing in the arguments '[a]' (nothing happens) and 'list(a)' (nothing happens) to df[].

CodePudding user response:

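When you call .apply() on a DataFrame with multiple columns, the lambda receives each column as a pandas Series (x), not the individual cell values. Series.replace(" ", "") only replaces cells whose entire value is " ", so nothing changes. On a single column, Series.apply() passes each string to the lambda, where str.replace() removes the spaces as expected. To get the same substring replacement on whole columns, use .str.replace():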

df[["a","b"]]=df[["a","b"]].apply(lambda x: x.str.replace(" ",""))
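To make the difference concrete, here is a minimal, self-contained sketch that rebuilds the DataFrame from the question and runs both variants side by side. The variable names `unchanged` and `cleaned` are only illustrative, and `regex=False` is passed explicitly as an assumption so the result does not depend on the default of the installed pandas version.

```python
import pandas as pd

df = pd.DataFrame({"a": ["7  7", "5 3"],
                   "b": ["f o", "b  r"],
                   "c": ["77", "53"]})

# DataFrame.apply hands each column to the lambda as a Series.
# Series.replace matches whole cell values, and no cell is exactly " ",
# so the result is identical to the input.
unchanged = df[["a", "b"]].apply(lambda col: col.replace(" ", ""))

# The .str accessor applies the replacement inside every string,
# so the spaces within each cell are removed.
cleaned = df[["a", "b"]].apply(lambda col: col.str.replace(" ", "", regex=False))

print(unchanged.equals(df[["a", "b"]]))
print(cleaned)
```

Assigning `cleaned` back with `df[["a", "b"]] = cleaned` updates the original columns in place, as in the one-liner above.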

CodePudding user response:

You can use the dataframe
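The answer above is cut off, so what it went on to suggest is unknown. One DataFrame-level option that avoids .apply() altogether, offered here only as a possible reading and not as that answer's actual content, is DataFrame.replace with regex=True, which treats " " as a pattern to match inside each string. The `cols` list is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"a": ["7  7", "5 3"],
                   "b": ["f o", "b  r"],
                   "c": ["77", "53"]})

cols = ["a", "b"]  # columns to clean; column "c" is left untouched

# With regex=True the pattern " " is matched anywhere inside each string,
# rather than against the whole cell value.
df[cols] = df[cols].replace(" ", "", regex=True)

print(df)
```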
