How to use Pandas apply() on a dataframe by using lambda functions?

Time:07-20

Consider the following piece of code:

import pandas as pd
import numpy as np

N = 10
idx = np.linspace(0, 1, N)

labels = ["a", "b", "c"]
values = np.stack((np.random.rand(N), np.random.rand(N), np.random.rand(N))).transpose()

df = pd.DataFrame(index=idx, columns=labels, data=values)

Assume now that I want to subtract the value 3 from column "a" and the value 5 from column "b".

I can easily achieve that through the following:

cols = labels[0:2]
df.loc[:, cols] = df.loc[:, cols].subtract([3, 5])

However, if I use the following:

df.loc[:, cols].apply(lambda x: x.subtract([3, 5]))

I get a ValueError: Lengths must be equal.

If I use

df.loc[:, cols].apply(lambda x: x.subtract(3))

It works but it subtracts 3 from both the columns specified in cols.

Given the generality of apply(), I would like to understand how to use it on a dataframe when the lambda function needs to do different things to different columns.

I am looking for the least verbose way; I am aware that I can do it with a for loop over cols.

CodePudding user response:

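By default, apply() passes each column to the lambda as a Series of length N, so subtracting a two-element list raises the length error. Use axis=1 to process the dataframe row by row instead; note that this is slower than the vectorized first solution: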

df.loc[:, cols] = df.loc[:, cols].apply(lambda x: x.subtract([3, 5]), axis=1)
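If you would rather keep the default column-wise apply(), one possible approach is to look up a per-column value through x.name, which holds the column label when apply() runs with axis=0. The following is a minimal sketch, not the only way to do it; the offsets dictionary and the fixed random seed are assumptions made for illustration and do not appear in the question.

```python
import numpy as np
import pandas as pd

N = 10
rng = np.random.default_rng(0)  # fixed seed, only so the example is repeatable
df = pd.DataFrame(
    rng.random((N, 3)),
    index=np.linspace(0, 1, N),
    columns=["a", "b", "c"],
)
cols = ["a", "b"]

# Hypothetical mapping from column label to the value to subtract.
offsets = {"a": 3, "b": 5}

# Default axis=0: x is one whole column and x.name is its label.
by_column = df.loc[:, cols].apply(lambda x: x.subtract(offsets[x.name]))

# Row-wise variant: x is one row holding the two selected values.
by_row = df.loc[:, cols].apply(lambda x: x.subtract([3, 5]), axis=1)

# Vectorized version from the question, used here as the reference result.
vectorized = df.loc[:, cols].subtract([3, 5])
```

All three results should agree; the column-wise version calls the lambda once per column rather than once per row, so it avoids most of the row-wise overhead while still using apply().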

You can check how it works with print:

# with axis=1, each Series is one row holding the 2 selected values
df.loc[:, cols].apply(lambda x: print(x), axis=1)

# with the default axis=0, each Series is one whole column
df.loc[:, cols].apply(lambda x: print(x))
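The same check can be done without printing, by returning the length of whatever apply() passes to the lambda. This is a small sketch on a made-up two-column dataframe (the data and the names col_lengths and row_lengths are assumptions for illustration); it shows why a two-element list matches a row but not a column.

```python
import numpy as np
import pandas as pd

# Illustrative 10 x 2 dataframe; the values themselves do not matter here.
df = pd.DataFrame(np.arange(20.0).reshape(10, 2), columns=["a", "b"])

# Default axis=0: the lambda is called once per column, with all 10 values.
col_lengths = df.apply(lambda x: len(x))

# axis=1: the lambda is called once per row, with the 2 column values.
row_lengths = df.apply(lambda x: len(x), axis=1)
```

Here col_lengths has one entry per column and row_lengths has one entry per row, so subtracting [3, 5] only lines up with the Series seen under axis=1.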