I have a data frame with around 10k observations and a few features. In Excel I can create a new column from the existing data with a formula such as:
= IF(A = 0, 1, B)
where A and B are column names.
The output of this formula goes into a new column, C.
I tried the Python code below:
C = []
for x in df['A']:
    if x == 0:
        C.append(1)
    else:
        C.append(df['B'])
But the code above gives me an "unhashable type: 'Series'" error.
Is there any way to correct it?
CodePudding user response:
Don't use a loop; use np.where:
import numpy as np
# as new column
df['C'] = np.where(df['A'].eq(0), 1, df['B'])
# as list
C = np.where(df['A'].eq(0), 1, df['B']).tolist()
Or use mask:
# as new column
df['C'] = df['B'].mask(df['A'].eq(0), 1)
# as list
C = df['B'].mask(df['A'].eq(0), 1).to_list()
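To check that both approaches match the Excel formula, here is a minimal sketch on a small made-up data frame. The sample values are an assumption for illustration; only the column names A, B and C come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame has around 10k rows.
df = pd.DataFrame({"A": [0, 2, 0, 5], "B": [10, 20, 30, 40]})

# np.where(condition, value_if_true, value_if_false), evaluated row by row
via_where = np.where(df["A"].eq(0), 1, df["B"])

# Series.mask replaces the values of B with 1 wherever the condition is True
via_mask = df["B"].mask(df["A"].eq(0), 1)

df["C"] = via_where
print(df)
```

Both calls work on whole columns at once, so no explicit Python loop is needed.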
CodePudding user response:
You are appending the whole column df['B'] on every iteration; you need to append only the value from the matching row. Try:
c = []
for a, b in zip(df['A'], df['B']):
    c.append(1 if a == 0 else b)
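The same row-wise logic can also be written as a list comprehension and assigned back as a column. This is a sketch on made-up sample data (the values are an assumption; the column names follow the question):

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"A": [0, 2, 0, 5], "B": [10, 20, 30, 40]})

# zip pairs each value of A with the value of B from the same row
c = [1 if a == 0 else b for a, b in zip(df["A"], df["B"])]

# The list has one entry per row, so it can be stored as the new column
df["C"] = c
print(df)
```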