Just out of curiosity, what is the difference between df**x and df.pow(x)?
Given a dataframe df with a column named values, you can do either df["values"] ** 2 or df["values"].pow(2) to raise the entire column to the power of 2. I understand that you can change the axis when using DataFrame.pow. But is there a difference in performance? Will changing the exponent influence the performance?
import pandas as pd

df = pd.DataFrame([1., 2.])
df ** 2      # operator form
df.pow(2)    # method form
I have read the discussion of the difference between x**y and math.pow(x, y) from the math module here
CodePudding user response:
From the pandas docs:
Equivalent to dataframe ** other, but with support to substitute a fill_value for missing data in one of the inputs. With reverse version, rpow.
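To make the fill_value part of that quote concrete, here is a minimal sketch. The frames base and exponent and their contents are made up for illustration; only DataFrame.pow and its fill_value parameter come from the docs.

```python
import numpy as np
import pandas as pd

# Made-up frames with a missing value in a different row of each.
base = pd.DataFrame({"a": [2.0, np.nan, 4.0]})
exponent = pd.DataFrame({"a": [3.0, 2.0, np.nan]})

# The operator has no way to fill gaps, so NaN propagates.
plain = base ** exponent

# pow() substitutes fill_value where exactly one of the two inputs is missing.
filled = base.pow(exponent, fill_value=1)

print(plain["a"].tolist())   # [8.0, nan, nan]
print(filled["a"].tolist())  # [8.0, 1.0, 4.0]
```

If a value were missing in both frames at the same position, the result there would still be missing.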
It seems that, performance-wise, they are nearly equivalent, but pow allows you to change the axis and add a fill_value, which replaces missing values. I'd imagine that there is an extremely slight performance cost to using pow, but if a performance difference that granular matters, maybe Python and pandas are the wrong tools for your project.
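If you want to check the performance claim on your own machine rather than take my guess for it, a rough sketch with timeit could look like this. The frame size, the seed and the number of runs are arbitrary assumptions, and the timings will vary with hardware and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark frame; the shape is an arbitrary choice.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10_000, 10)))

# Sanity check: both spellings should give the same values.
same = (df ** 2).equals(df.pow(2))

# Time each spelling over the same number of runs.
t_operator = timeit.timeit(lambda: df ** 2, number=50)
t_method = timeit.timeit(lambda: df.pow(2), number=50)

print(f"identical results: {same}")
print(f"df ** 2   : {t_operator:.4f} s for 50 runs")
print(f"df.pow(2) : {t_method:.4f} s for 50 runs")
```

Swapping the 2 for another exponent in both lambdas is a simple way to see whether the exponent itself changes the picture.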