I realize this is an incredibly inefficient way to code this, so I'm hoping someone will have suggestions on a more efficient method.
Essentially I'm trying to create a column ("freq") with values of 0 for NA and "Nothing" objects and 1 otherwise. Sample df:
i   obj        freq
0   Nothing    0
1   Something  1
2   NaN        0
3   Something  1
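for i in range(len(df)):
    # str() turns a missing value into the string "nan"
    if str(df["obj"].iloc[i]) in ("Nothing", "nan"):
        # df.loc avoids chained assignment, which may not write back to df
        df.loc[df.index[i], "freq"] = 0
    else:
        df.loc[df.index[i], "freq"] = 1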
CodePudding user response:
You can use np.where(), which returns 0 where the condition holds and 1 everywhere else:
import pandas as pd
import numpy as np
df = pd.DataFrame({'obj': {0: 'Nothing', 1: 'Something', 2: np.nan, 3: 'Something'}})
df['freq'] = np.where((df['obj'] == 'Nothing') | (df['obj'].isnull()), 0, 1)
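A variant of the same idea, offered only as a sketch: build the boolean mask once, negate it, and cast it to integers, so no explicit 0/1 values are needed. The name `is_empty` is made up for illustration, and the frame is just the sample from the question.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({"obj": ["Nothing", "Something", np.nan, "Something"]})

# True where the value is missing or is the literal string "Nothing"
# (is_empty is a hypothetical name used only in this sketch)
is_empty = df["obj"].isna() | df["obj"].eq("Nothing")

# Negate the mask and cast booleans to integers: False -> 0, True -> 1
df["freq"] = (~is_empty).astype(int)

print(df)
```

Both versions are vectorized, so they avoid the Python-level loop over rows.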
CodePudding user response:
Without a sample DataFrame it's hard to check whether this works, but it should:
# isna() flags missing values; astype(str) yields 'nan' (lowercase), so comparing with 'NaN' never matches
indexer = (df['obj'] == 'Nothing') | df['obj'].isna()
df.loc[indexer, 'freq'] = 0
df.loc[~indexer, 'freq'] = 1
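To try this on the sample data from the question, here is a minimal sketch. It assumes the missing entry is a real NaN (not the string "NaN") and tests for it with isna(). The final cast to int is my addition, since assigning into a column that does not exist yet leaves it as floats.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (assumes a real NaN, not the text "NaN")
df = pd.DataFrame({"obj": ["Nothing", "Something", np.nan, "Something"]})

# Rows that should get 0: the string "Nothing" or a missing value
indexer = (df["obj"] == "Nothing") | df["obj"].isna()

df.loc[indexer, "freq"] = 0   # creates "freq"; unmatched rows are NaN for now
df.loc[~indexer, "freq"] = 1  # fills in the remaining rows

# The column came out as float because of the intermediate NaNs; cast to int
df["freq"] = df["freq"].astype(int)

print(df)
```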