When we use the index to select a specific row, we can access an element with .loc, .iloc, etc.:
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["col1", "col2"], index=["aa", "bb"])
x = df.loc["aa", "col2"]
print(x, type(x)) # 2 <class 'numpy.int64'>
But when our id column is not the index, such as:
id col1 col2
0 aa 1 2
1 bb 3 4
what is the natural Pandas way to access the element in column col2 for the row whose id equals aa?
This doesn't work:
df = pd.DataFrame([["aa", 1, 2], ["bb", 3, 4]], columns=["id", "col1", "col2"])
x = df[df["id"] == "aa"]["col2"]
print(x, type(x))
# 0 2
# Name: col2, dtype: int64 <class 'pandas.core.series.Series'>
because it returns a Series and not a number as expected. Is there a more standard way than adding an extra [0]?
x = df[df["id"] == "aa"]["col2"][0] # 2, as expected
TL;DR: Why is df[df["id"] == "aa"]["col2"] a Series and not an element?
CodePudding user response:
If you need to return a scalar after applying a boolean mask, you can use pandas.Series.values:
x = df.loc[df['id'] == 'aa', 'col2'].values[0]
print(x, type(x))
# 2 <class 'numpy.int64'>
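A minimal sketch of two alternatives, assuming the same df as in the question and that exactly one row matches. One caveat it illustrates: a plain [0] on the filtered Series is a label lookup, so it only works when the matching row happens to carry index label 0, whereas .iloc[0] and .item() do not depend on the label. The names matches, first and only are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame([["aa", 1, 2], ["bb", 3, 4]], columns=["id", "col1", "col2"])

# Boolean mask plus column label in a single .loc call; the result is still a Series
matches = df.loc[df["id"] == "bb", "col2"]

# Positional access works whatever the row label is (here the label is 1, not 0)
first = matches.iloc[0]

# .item() returns the single value and raises ValueError unless exactly one row matched
only = matches.item()

print(first, only)
```

With this mask, matches[0] would raise a KeyError because the only label in the result is 1.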
CodePudding user response:
what is the natural Pandas way to access the element of column col2 and of row of id equal to aa?
Unfortunately, there isn't one. To avoid getting a one-element Series, you first need to convert id to the index; otherwise you have to select the value by position with [0].
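Converting id to the index could look like the sketch below. It assumes the id values are unique; by_id is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame([["aa", 1, 2], ["bb", 3, 4]], columns=["id", "col1", "col2"])

# Use the id column as the index so rows can be selected by label
by_id = df.set_index("id")

# With a unique index, a row label plus a column label yields a scalar, not a Series
x = by_id.loc["aa", "col2"]

# .at is the accessor intended for reading a single value
y = by_id.at["aa", "col2"]

print(x, type(x))
```

If id contained duplicate values, by_id.loc["aa", "col2"] would return a Series again, so this only helps when id really identifies a single row.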