Python Pandas: add a value to a new column if the other is NaN

I want to create a new column and fill it with the value from column2 if that value is not NaN, or otherwise with the value from column1.

Initial table:

column1	column2
name	otherName1
name1	NaN
name2	otherName2
name3	otherName3
name3	NaN

New table:

column1	column2	NEW column
name	otherName1	otherName1
name1	NaN	name1
name2	otherName2	otherName2
name3	otherName3	otherName3
name3	NaN	name3

My code is too slow, and I believe there is a faster solution:

for index, row in df.iterrows():
    # write through df.at: assigning to row only changes a copy
    if pd.notnull(row['column2']):
        df.at[index, 'NEW column'] = row['column2']
    else:
        df.at[index, 'NEW column'] = row['column1']

CodePudding user response：

You can use fillna to fill in the NaN values in column2 and assign it to a new column:

df['NEW column'] = df['column2'].fillna(df['column1'])

Output:

  column1     column2  NEW column
0    name  otherName1  otherName1
1   name1         NaN       name1
2   name2  otherName2  otherName2
3   name3  otherName3  otherName3
4   name3         NaN       name3
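
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample table in the question, and the variable name alt is only illustrative. combine_first is not part of this answer; it is shown as an alternative that should give the same result here.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})

# Take column2 and, where it is NaN, fall back to column1 (aligned on the index)
df["NEW column"] = df["column2"].fillna(df["column1"])

# Alternative: combine_first keeps the caller's non-null values
# and fills the gaps from the other Series
alt = df["column2"].combine_first(df["column1"])

print(df)
```

Both calls are vectorized, so they avoid the per-row Python loop from the question.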

CodePudding user response：

You can use apply with a lambda function:

df['NEW column'] = df.apply(lambda row: row['column2'] if pd.notnull(row['column2']) else row['column1'], axis=1)
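
A runnable sketch of this row-wise approach, using the sample data from the question. The helper name pick is hypothetical and just replaces the lambda for readability. Keep in mind that apply with axis=1 still calls a Python function once per row, so on large frames it is usually much slower than a vectorized call.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})

def pick(row):
    # Hypothetical helper: prefer column2, fall back to column1 when it is NaN
    return row["column2"] if pd.notnull(row["column2"]) else row["column1"]

# axis=1 passes each row to pick as a Series
df["NEW column"] = df.apply(pick, axis=1)

print(df)
```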

CodePudding user response：

df['NEW column'] = df.column2
mask = df.column2.isna()
df.loc[mask, 'NEW column'] = df.column1[mask]
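
Since this answer comes without explanation, here is a commented sketch of the same three steps on the question's sample data. It uses bracket access instead of attribute access, which is only a style choice.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})

# Step 1: start the new column as a copy of column2
df["NEW column"] = df["column2"]

# Step 2: boolean mask marking the rows where column2 is NaN
mask = df["column2"].isna()

# Step 3: overwrite only the masked rows with the values from column1
df.loc[mask, "NEW column"] = df.loc[mask, "column1"]

print(df)
```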

CodePudding user response：

Remember, always try to avoid iterating through a dataframe if possible! In your case you can use something like df['NEW column'] = np.where(df['column2'].isna(), df['column1'], df['column2']). np.where works like an "if-else" statement, as you can see in the docs.

Edit: just saw enke's answer, that's a better approach!
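
A minimal sketch of the np.where approach on the question's sample data. The condition tests column2 for NaN, because that is the column the question wants to fall back from: where the condition is True the value comes from column1, otherwise from column2.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "column1": ["name", "name1", "name2", "name3", "name3"],
    "column2": ["otherName1", np.nan, "otherName2", "otherName3", np.nan],
})

# np.where(condition, value_if_true, value_if_false), evaluated element-wise
df["NEW column"] = np.where(df["column2"].isna(), df["column1"], df["column2"])

print(df)
```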