How to return True/False instead of 1/0 in Python

I am trying to evaluate two floating point values in a loop and for some reason the evaluation returns 1/0 instead of True/False.

def new_row(item1, item2):
    new_row = {
    'lister': item1,
    'metric': item2
    }
    return new_row

final_df = pd.DataFrame()
lister = ['a', 'b', 'c']
position = [1.1, 2.3, 4.5]
evaluation_metric = [0, 0.5, 0.2]
    for b1 in lister:
        print(abs(position) > evaluation_metric)
        metric = (abs(position) > evaluation_metric)
        nr = new_row(lister, metric)
        final_df = final_df.append(nr, ignore_index=True)

For some reason when I print I get True but when I append it to the final df I get 1.0. Any thoughts on how to get True in the final_df instead of 1.0?

CodePudding user response:

You created a dataframe without columns so pandas had to guess what to do when a row was appended. In a similar experiment, it chose float64:

>>> import pandas as pd
>>> df = pd.DataFrame()
>>> final = df.append({"lister":"a", "metric":False}, ignore_index=True)
>>> final
  lister  metric
0      a     0.0
>>> final.dtypes
lister     object
metric    float64
dtype: object

You could fix the dtype after you've done the appends:

>>> final["metric"] = final["metric"].astype(bool)
>>> final
  lister  metric
0      a   False
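
The cast back is safe here because a value that started out as a boolean is only ever stored as 1.0 or 0.0. A small pure-Python sketch of that round trip (no pandas involved; the variable names are made up for illustration):

```python
# bool is a subclass of int, so True/False convert to 1.0/0.0 without loss.
flags = [True, False, True]
as_float = [float(f) for f in flags]    # roughly what an upcast to float stores
restored = [bool(x) for x in as_float]  # roughly what a cast back to bool recovers

print(as_float)   # [1.0, 0.0, 1.0]
print(restored)   # [True, False, True]

# Caveat: NaN is truthy, so a missing value would come back as True.
print(bool(float("nan")))  # True
```

If the column could contain missing values, it is probably worth checking for them before casting.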

But you likely shouldn't be appending in the first place. pandas lets you perform operations on entire columns. Create columns from your lists first, then do the comparison in a single step:

import pandas as pd

lister = ['a', 'b', 'c']
position = [1.1, 2.3, 4.5]
evaluation_metric = [0, 0.5, 0.2]

df = pd.DataFrame({"lister":lister, "position":position, 
    "evaluation_metric":evaluation_metric})

# .abs() mirrors the abs(position) comparison from the question
df["metric"] = df["position"].abs() > df["evaluation_metric"]

print(df)

Output

  lister  position  evaluation_metric  metric
0      a       1.1                0.0    True
1      b       2.3                0.5    True
2      c       4.5                0.2    True

If you don't need those other columns any more, you can drop them:

df.drop(["position", "evaluation_metric"], axis=1, inplace=True)
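
If the helper columns are never needed at all, one possible variation is to compute the comparison before building the frame, so there is nothing to drop afterwards. This is only a sketch using the lists from the question; the name `result` is made up for illustration:

```python
import pandas as pd

lister = ['a', 'b', 'c']
position = [1.1, 2.3, 4.5]
evaluation_metric = [0, 0.5, 0.2]

# Compare element by element in plain Python; each result is a real bool.
metric = [abs(p) > e for p, e in zip(position, evaluation_metric)]

# Build the frame once, with only the columns that are wanted.
result = pd.DataFrame({"lister": lister, "metric": metric})

print(result)
print(result.dtypes)
```

Because the column is created from a list of Python booleans, pandas should infer a bool dtype rather than float64.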

CodePudding user response:

Although the code you posted won't run (you should fix that so your question isn't closed), the issue is that appending rows containing bool values to an empty DataFrame converts them to float64 (according to this answer).

For example:

for l,p,e in zip(lister,position,evaluation_metric):
    metric = (abs(p) > e)
    nr = new_row(l, metric)
    final_df = final_df.append(nr, ignore_index=True)

>>> final_df.dtypes
lister     object
metric    float64

You can fix this by modifying your new_row function to return a DataFrame, then concatenating this to your final_df in each loop iteration:

def new_row(item1, item2):
    new_row = {
    'lister': [item1],
    'metric': [item2]
    }
    return pd.DataFrame(new_row)

final_df = pd.DataFrame()
lister = ['a', 'b', 'c']
position = [1.1, 2.3, 4.5]
evaluation_metric = [0, 0.5, 0.2]

for l,p,e in zip(lister,position,evaluation_metric):
    metric = (abs(p) > e)
    nr = new_row(l, metric)
    final_df = pd.concat([final_df,nr])

Output:

>>> final_df
  lister  metric
0      a    True
0      b    True
0      c    True
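
Every row in the output above is labelled 0 because each one-row DataFrame brings its own index along. A possible refinement, sketched here rather than taken from the answer, is to collect the one-row frames in a list and call pd.concat once with ignore_index=True. It also avoids DataFrame.append, which was deprecated in pandas 1.4 and removed in 2.0. The name `frames` is made up for illustration:

```python
import pandas as pd

def new_row(item1, item2):
    # One-row DataFrame, as in the answer above.
    return pd.DataFrame({'lister': [item1], 'metric': [item2]})

lister = ['a', 'b', 'c']
position = [1.1, 2.3, 4.5]
evaluation_metric = [0, 0.5, 0.2]

# Collect the pieces first, then concatenate a single time.
frames = [new_row(l, abs(p) > e)
          for l, p, e in zip(lister, position, evaluation_metric)]
final_df = pd.concat(frames, ignore_index=True)  # index becomes 0, 1, 2

print(final_df)
print(final_df.dtypes)
```

Concatenating once is also usually cheaper than growing the frame inside the loop, since each pd.concat call copies the data accumulated so far.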