Apply a function to multiple rows in pandas

Time:04-30

Suppose I have a DataFrame like this:

               0         5         10        15        20        25      ...
action_0_Q0  0.299098  0.093973  0.761735  0.058112  0.013463  0.164322  ... 
action_0_Q1  0.463095  0.468425  0.202679  0.742424  0.865005  0.479546  ... 
action_0_Q2  0.237807  0.437602  0.035587  0.199465  0.121532  0.356132  ... 
action_1_Q0  0.263191  0.176407  0.471295  0.082457  0.029566  0.426428  ... 
action_1_Q1  0.508573  0.490355  0.431732  0.249432  0.189732  0.396947  ... 
action_1_Q2  0.228236  0.333238  0.096973  0.668111  0.780702  0.176625  ... 
action_2_Q0  0.256632  0.122589  0.495720  0.059918  0.824424  0.384998  ... 
action_2_Q1  0.485362  0.462969  0.420790  0.211578  0.155771  0.186493  ... 
action_2_Q2  0.258006  0.414442  0.083490  0.728504  0.019805  0.428509  ...

This DataFrame may be very large (many rows and about 3000 columns). I need to apply a function to each column, and that function returns a distance matrix. However, the function must take the rows three at a time, so that each group of three consecutive rows (Q0, Q1, Q2 of one action) forms one input vector. For example, for the first column:

a = distance_function([[0.299098, 0.463095, 0.237807], [0.263191, 0.508573, 0.228236], [0.256632, 0.485362, 0.258006]])

print(a.shape)  # (3, 3)

This is not hard to do with a for loop, but it would take a very long time. Is there a faster alternative?

CodePudding user response:

If I understand correctly, use:

df = df.apply(lambda x: distance_function(x.to_numpy().reshape(-1,3)))
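To make the reshape step concrete, here is a minimal sketch using the first two columns from the question. The question never defines distance_function, so the version below (pairwise Euclidean distance between rows) is only a stand-in assumption. Because each call returns a 2-D matrix, the sketch collects the results in a dict keyed by column label, which is one simple way to keep one matrix per column.

```python
import numpy as np
import pandas as pd


def distance_function(points):
    """Stand-in (assumption): pairwise Euclidean distances between rows."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# First two columns of the question's frame (labels 0 and 5).
index = [f"action_{a}_Q{q}" for a in range(3) for q in range(3)]
df = pd.DataFrame(
    {
        0: [0.299098, 0.463095, 0.237807,
            0.263191, 0.508573, 0.228236,
            0.256632, 0.485362, 0.258006],
        5: [0.093973, 0.468425, 0.437602,
            0.176407, 0.490355, 0.333238,
            0.122589, 0.462969, 0.414442],
    },
    index=index,
)

# reshape(-1, 3) turns each column into one row per action: [Q0, Q1, Q2].
matrices = {
    col: distance_function(df[col].to_numpy().reshape(-1, 3))
    for col in df.columns
}

print(matrices[0].shape)  # (3, 3)
```

This relies on the rows being ordered so that the three Q values of each action are consecutive, as they are in the question.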

If you need the values flattened:

from itertools import chain

df = df.apply(lambda x: list(chain.from_iterable(distance_function(x.to_numpy().reshape(-1,3)))))
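Since apply still calls the function once per column in Python, a fully vectorized variant may be faster when the distance can be written with NumPy broadcasting. The sketch below assumes the distance is Euclidean, which the question does not state, and uses random data with made-up dimensions purely for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data only: 3 actions x 3 Q values, 4 columns.
rng = np.random.default_rng(0)
n_groups, n_cols = 3, 4
index = [f"action_{a}_Q{q}" for a in range(n_groups) for q in range(3)]
df = pd.DataFrame(
    rng.random((n_groups * 3, n_cols)),
    index=index,
    columns=[0, 5, 10, 15],
)

values = df.to_numpy()  # shape (n_groups * 3, n_cols)

# For every column, one 3-vector per action: shape (n_cols, n_groups, 3).
points = values.T.reshape(n_cols, -1, 3)

# Pairwise differences between actions, for all columns at once.
diff = points[:, :, None, :] - points[:, None, :, :]

# Euclidean distance matrices (assumed metric): (n_cols, n_groups, n_groups).
dist = np.sqrt((diff ** 2).sum(axis=-1))

print(dist.shape)  # (4, 3, 3)
```

Here dist[i] is the distance matrix for the i-th column. The intermediate diff array grows with n_cols * n_groups**2, so with many rows it may be necessary to process the columns in chunks.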