I want to create a new column that holds the value from column2 if that value is not NaN, and the value from column1 otherwise.
Initial table:
column1 | column2 |
---|---|
name | otherName1 |
name1 | NaN |
name2 | otherName2 |
name3 | otherName3 |
name3 | NaN |
New table:
column1 | column2 | NEW column |
---|---|---|
name | otherName1 | otherName1 |
name1 | NaN | name1 |
name2 | otherName2 | otherName2 |
name3 | otherName3 | otherName3 |
name3 | NaN | name3 |
My code is too slow, and I believe there is a faster solution:
for index,row in df.iterrows():
    if pd.notnull(row['column2']):
        row['NEW column'] = row['column2']
    else:
        row['NEW column'] = row['column1']
CodePudding user response:
You can use fillna to fill in the NaN values in column2 and assign the result to a new column:
df['NEW column'] = df['column2'].fillna(df['column1'])
Output:
column1 column2 NEW column
0 name otherName1 otherName1
1 name1 NaN name1
2 name2 otherName2 otherName2
3 name3 otherName3 otherName3
4 name3 NaN name3
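For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample table from the question (the DataFrame construction is my assumption about how the data is loaded) and also shows `Series.combine_first`, which should give the same result for this case:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's initial table.
df = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})

# Take column2, and wherever it is NaN use the value from column1
# at the same index label.
df["NEW column"] = df["column2"].fillna(df["column1"])

# Alternative spelling of the same idea: prefer column2, fall back to column1.
alt = df["column2"].combine_first(df["column1"])

print(df)
```

Both calls align on the index, so they rely on column1 and column2 coming from the same DataFrame (or at least sharing index labels).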
CodePudding user response:
You can use apply with a lambda function:
df['new column'] = df.apply(lambda row: row['column2'] if pd.notnull(row['column2']) else row['column1'], axis=1)
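Note that apply with axis=1 still calls a Python function once per row, so it is unlikely to be much faster than the original loop on large data. The sketch below is one rough way to check this on your own machine; the 10,000-row frame is a made-up benchmark built by repeating the sample rows, and the helper names `with_apply` and `with_fillna` are hypothetical:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: the question's five rows repeated 2000 times.
base = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})
df = pd.concat([base] * 2000, ignore_index=True)


def with_apply(frame):
    # Row-wise: one Python-level call of the lambda per row.
    return frame.apply(
        lambda row: row["column2"] if pd.notnull(row["column2"]) else row["column1"],
        axis=1,
    )


def with_fillna(frame):
    # Vectorised: a single call that operates on the whole column.
    return frame["column2"].fillna(frame["column1"])


# Timings vary by machine and pandas version, so no numbers are quoted here.
print("apply :", timeit.timeit(lambda: with_apply(df), number=3))
print("fillna:", timeit.timeit(lambda: with_fillna(df), number=3))
```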
CodePudding user response:
df['NEW column'] = df.column2
mask = df.column2.isna()
df.loc[mask, 'NEW column'] = df.column1[mask]
CodePudding user response:
Remember, always try to avoid iterating through a DataFrame if possible! In your case you can use something like df['new_column'] = np.where(df['column2'].isna(), df['column1'], df['column2']). np.where functions like an "if-else" statement, as you can see in the docs.
Edit: just saw enke's answer, that's a better approach!
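For completeness, a runnable sketch of the np.where variant on the question's sample data. The condition has to test column2, since that is the column that may be NaN; the DataFrame construction is an assumption about how the data is loaded:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's initial table.
df = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})

# np.where(condition, value_if_true, value_if_false), evaluated element-wise:
# where column2 is NaN take column1, otherwise keep column2.
df["NEW column"] = np.where(df["column2"].isna(), df["column1"], df["column2"])

print(df)
```

Unlike fillna, np.where works positionally on the underlying arrays and returns a NumPy array, so it ignores index alignment; that makes no difference when both columns come from the same DataFrame.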