Python Pandas throws error when taking in variable but not value

Time:06-14

import pandas as pd

df = pd.read_csv("stocks.csv")
df = df.where(pd.notnull(df), None)
df["date"] = pd.to_datetime(df["date"])
m = df.loc[df['market'] == "NASDAQ"].max(numeric_only=True)
m

This returns the following:

price    99614.04
dtype: float64

Now, when I try to use the variable 'm' in a filter, I get the following error:

import pandas as pd

df = pd.read_csv("stocks.csv")
df = df.where(pd.notnull(df), None)
df["date"] = pd.to_datetime(df["date"])
m = df.loc[df['market'] == "NASDAQ"].max(numeric_only=True)
df.loc[(df['market'] == "NASDAQ") & (df["price"] == m)]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_62032/1757287628.py in <module>
      5 df["date"] = pd.to_datetime(df["date"])
      6 m = df.loc[df['market'] == "NASDAQ"].max(numeric_only=True)
----> 7 df.loc[(df['market'] == "NASDAQ") & (df["price"] == m)]

~\Anaconda3\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     67         other = item_from_zerodim(other)
     68 
---> 69         return method(self, other)
     70 
     71     return new_method

~\Anaconda3\lib\site-packages\pandas\core\arraylike.py in __eq__(self, other)
     30     @unpack_zerodim_and_defer("__eq__")
     31     def __eq__(self, other):
---> 32         return self._cmp_method(other, operator.eq)
     33 
     34     @unpack_zerodim_and_defer("__ne__")

~\Anaconda3\lib\site-packages\pandas\core\series.py in _cmp_method(self, other, op)
   5494 
   5495         if isinstance(other, Series) and not self._indexed_same(other):
-> 5496             raise ValueError("Can only compare identically-labeled Series objects")
   5497 
   5498         lvalues = self._values

ValueError: Can only compare identically-labeled Series objects

But when I use the actual value of 'm', it works:

import pandas as pd

df = pd.read_csv("stocks.csv")
df = df.where(pd.notnull(df), None)
df["date"] = pd.to_datetime(df["date"])
m = df.loc[df['market'] == "NASDAQ"].max(numeric_only=True)
df.loc[(df['market'] == "NASDAQ") & (df["price"] == 99614.04)]
id  name    price   symbol  industry    market  currency    date
25  1abf2ffc-3396-4ed9-954d-956be97668c0    Brocade Communications Systems, Inc.    99614.04    BRCD    Computer Communications Equipment   NASDAQ  PLN 2020-09-12

Could someone please explain why the variable fails while the literal value works?

CodePudding user response:

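DataFrame.max() returns a Series (one maximum per numeric column), not a scalar. Comparing df["price"] with that Series fails because the two have different index labels. To get a scalar, you can use

m = df.loc[df['market'] == "NASDAQ", 'price'].max()
# or, if price is the only numeric column (item() requires exactly one element)
m = df.loc[df['market'] == "NASDAQ"].max(numeric_only=True).item()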
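
To make the difference concrete, here is a minimal sketch. It assumes a small made-up DataFrame in place of stocks.csv (the rows and the names nasdaq, m_series, m_scalar and error_message are illustrative only, not from the question). It shows that the max taken over the whole frame is a Series labeled by column name, and that comparing it with the price column raises the same ValueError.

```python
import pandas as pd

# Hypothetical stand-in for stocks.csv; the values are made up for illustration.
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "market": ["NASDAQ", "NYSE", "NASDAQ"],
    "price": [10.0, 20.0, 30.0],
})

nasdaq = df.loc[df["market"] == "NASDAQ"]

# DataFrame.max reduces each numeric column, so the result is a Series
# whose index holds column names (here just "price").
m_series = nasdaq.max(numeric_only=True)

# Series.max reduces a single column to one scalar value.
m_scalar = nasdaq["price"].max()

print(type(m_series).__name__, list(m_series.index))
print(m_scalar)

# df["price"] is labeled 0, 1, 2 while m_series is labeled "price",
# so pandas refuses to compare the two Series.
error_message = None
try:
    df["price"] == m_series
except ValueError as exc:
    error_message = str(exc)

print("ValueError:", error_message)
```

With the scalar there are no labels to align, so the comparison is simply applied to every row of the column.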

CodePudding user response:

Use the following instead. Currently you get a Series back because you don't specify which column to take the max of.

m = df[df['market'].eq("NASDAQ")]['price'].max()
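
As a rough end-to-end sketch of this fix, assuming invented data in place of stocks.csv (the symbols, prices and the names is_nasdaq and top are hypothetical), the scalar max can be fed straight back into the filter:

```python
import pandas as pd

# Hypothetical data standing in for stocks.csv (values invented for this sketch).
df = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD"],
    "market": ["NASDAQ", "NYSE", "NASDAQ", "NYSE"],
    "price": [10.0, 50.0, 30.0, 30.0],
})

is_nasdaq = df["market"].eq("NASDAQ")

# Select the column first so that max() reduces a Series to a single scalar.
m = df.loc[is_nasdaq, "price"].max()

# Comparing the price column with a scalar is element-wise, so it can be
# combined with the market mask to pick out the matching row(s).
top = df.loc[is_nasdaq & df["price"].eq(m)]
print(top)
```

The market mask is kept in the final filter because a row from another market could share the same price, as the DDD row does in this made-up data.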