How can I make this Python code more efficient?

I realize this is an incredibly inefficient way to code this, so I'm hoping someone will have suggestions on a more efficient method.

Essentially I'm trying to create a column ("freq") with values of 0 for NA and "Nothing" objects and 1 otherwise. Sample df:

i   obj        freq
0   Nothing    0
1   Something  1
2   NaN        0
3   Something  1


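import pandas as pd

for i in range(len(df)):
  value = df["obj"].iloc[i]
  # pd.isna() detects missing values; a bare NaN name is undefined in Python
  if pd.isna(value) or value == "Nothing":
    df.loc[df.index[i], "freq"] = 0
  else:
    df.loc[df.index[i], "freq"] = 1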

CodePudding user response:

You can use np.where(), which applies the condition to the whole column at once instead of looping over rows:

import pandas as pd 
import numpy as np

df = pd.DataFrame({'obj': {0: 'Nothing', 1: 'Something', 2: np.nan, 3: 'Something'}})

df['freq'] = np.where((df['obj'] == 'Nothing') | (df['obj'].isnull()), 0, 1)
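If you would rather not call NumPy for this, the same column can probably be built by casting a boolean mask to integers. This is only a sketch on the sample data from the question; the name `has_value` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'obj': ['Nothing', 'Something', np.nan, 'Something']})

# True where the value is present and is not the string 'Nothing'
has_value = df['obj'].notna() & (df['obj'] != 'Nothing')

# Casting the boolean mask to int gives 1 for True and 0 for False
df['freq'] = has_value.astype(int)

print(df['freq'].tolist())  # [0, 1, 0, 1]
```

Both versions work on the whole column in one pass, so which one to use is mostly a matter of taste.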

CodePudding user response:

Without a sample DataFrame it is hard to check whether this works, but it should:

# Use isna() for missing values: str(np.nan) is 'nan', so comparing the string form to 'NaN' never matches
indexer = (df['obj'] == 'Nothing') | df['obj'].isna()
df.loc[indexer, 'freq'] = 0
df.loc[~indexer, 'freq'] = 1
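A note on finding missing values by converting the column to strings: as far as I can tell, a missing value never comes out as the text 'NaN', so that comparison matches no rows, while isna() flags the missing row directly. A small sketch on the sample data from the question (the names `by_string` and `by_isna` are just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'obj': ['Nothing', 'Something', np.nan, 'Something']})

# Compare the string form of each value against the literal 'NaN'
by_string = (df['obj'].astype(str) == 'NaN').tolist()

# Ask pandas directly which values are missing
by_isna = df['obj'].isna().tolist()

print(by_string)  # no row is flagged
print(by_isna)    # only the third row is flagged
```

Because of this, a mask built on the 'NaN' string check would leave the missing row at freq = 1 instead of 0.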