df[df['Some Column'] == 5]
If we look at the conditional in this statement, it is an equality comparison between df['Some Column'] and 5. Normally I would expect this to return False, since a Series object is not equal to 5. But with Pandas the comparison returns a boolean Series, which is then used to filter the dataframe. How is this magic done behind the scenes? How does Pandas or Python ensure that False is not returned?
CodePudding user response:
Via inheritance, pandas.Series.__eq__ is:
    class OpsMixin:
        ...
        @unpack_zerodim_and_defer("__eq__")
        def __eq__(self, other):
            return self._cmp_method(other, operator.eq)

    class Series(base.IndexOpsMixin, generic.NDFrame):
        ...
        def _cmp_method(self, other, op):
            res_name = ops.get_op_result_name(self, other)

            if isinstance(other, Series) and not self._indexed_same(other):
                raise ValueError("Can only compare identically-labeled Series objects")

            lvalues = self._values
            rvalues = extract_array(other, extract_numpy=True, extract_range=True)

            with np.errstate(all="ignore"):
                res_values = ops.comparison_op(lvalues, rvalues, op)

            return self._construct_result(res_values, name=res_name)
So when you do:

    pd.Series([1, 2, 3, 4]) == 2

what is really computed under the hood is:

    import operator
    import numpy as np
    operator.eq(np.array([1, 2, 3, 4]), 2)

and the resulting boolean array is wrapped back into a Series. So maybe your question should really be: how do numpy arrays check for equality?
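The short version is that Python lets any class define what == returns: the expression a == b calls a.__eq__(b), and nothing forces the result to be a plain bool. Here is a minimal sketch of that mechanism in pure Python. ToyVector is a hypothetical stand-in made up for illustration, not a pandas or NumPy class, and the real libraries do the elementwise work in compiled code rather than a Python loop.

```python
import operator


class ToyVector:
    """Hypothetical stand-in for a Series/ndarray; not part of pandas or NumPy."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Python calls this method for `self == other`; whatever it
        # returns becomes the value of the expression, bool or not.
        return ToyVector(operator.eq(v, other) for v in self.values)

    def __repr__(self):
        return f"ToyVector({self.values})"


mask = ToyVector([1, 2, 3, 2]) == 2
print(mask)  # ToyVector([False, True, False, True])
```

Because __eq__ returns a new ToyVector of booleans instead of a single bool, the comparison never collapses to False. Series and numpy arrays rely on the same hook.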