I'm trying to get a single value from column Z, which contains null values or integers:
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4],
                   'Y': [2, 10, 13, 18],
                   'Z': [3, None, 5, None]})
a = df[(df.X == 1) & (df.Y == 2)].Z.item()
print(a)
# output: 3.0 (Z is stored as float64 because it contains None)
b = df[(df.X == 7) & (df.Y == 18)].Z.item()
print(b)
# output: ValueError
It throws a ValueError: "can only convert an array of size 1 to a Python scalar", because the data frame that results from filtering on the X and Y columns is empty. I want b to be None if the filtered data frame is empty.
I tried the following, and it works:
# check the length of the filtered dataframe before calling .item()
b = df[(df.X == 7) & (df.Y == 18)].Z.item() if len(df[(df.X == 7) & (df.Y == 18)]) == 1 else None
print(b)
# output: None
Is there a better way to do it?
CodePudding user response:
One alternative is to use next(..., None), which returns None if the iterator is empty:
b = next(iter(df[(df.X == 7) & (df.Y == 18)].Z), None)
print(b)
# None
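If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name first_or_none is made up for illustration and is not a pandas function. It also shows one thing to keep in mind: when a row matches but its Z is missing, you get NaN back, not None.

```python
import math
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4],
                   "Y": [2, 10, 13, 18],
                   "Z": [3, None, 5, None]})

def first_or_none(series):
    """Hypothetical helper: first element of `series`, or None if it is empty."""
    return next(iter(series), None)

hit = first_or_none(df[(df.X == 1) & (df.Y == 2)].Z)     # one matching row
miss = first_or_none(df[(df.X == 7) & (df.Y == 18)].Z)   # no matching row
null = first_or_none(df[(df.X == 4) & (df.Y == 18)].Z)   # row matches, Z is missing

print(hit, miss, null)
# 3.0 None nan
```

If you want NaN treated like "no value" as well, you could check the result with math.isnan (or pd.isna) and map it to None yourself.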
CodePudding user response:
Two things. First, your filtered result is empty, hence the error: .item() raises a ValueError on an empty pd.Series.
Second, a more canonical way of achieving what you're after would be:
>>> b = df.loc[(df.X == 7) & (df.Y == 18), "Z"].values
>>> if len(b) == 0: b = None
...
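This also avoids the error that .item() raises. Note that when a row does match, b is a NumPy array rather than a scalar, so take b[0] if you need the value itself.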
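To end up with a scalar or None in one step, the .loc approach can be put in a function. This is a minimal sketch, not a pandas API: the name scalar_or_none is hypothetical, and it assumes you want None whenever the filter does not match exactly one row (the same condition as the len(...) == 1 check in the question).

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4],
                   "Y": [2, 10, 13, 18],
                   "Z": [3, None, 5, None]})

def scalar_or_none(frame, mask, column):
    """Hypothetical helper: the single matching value, or None otherwise."""
    values = frame.loc[mask, column].to_numpy()
    # Assumption: zero matches or several matches both yield None.
    return values[0] if values.size == 1 else None

a = scalar_or_none(df, (df.X == 1) & (df.Y == 2), "Z")
b = scalar_or_none(df, (df.X == 7) & (df.Y == 18), "Z")
print(a, b)
# 3.0 None
```

The mask is built once and the frame is filtered once, so there is no need to repeat the filter expression for the length check.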