My data looks like this:
First_Name Middle_Name Last_Name Full_Name (Header)
John R Rovin John R Rovin
Marano Justine Marano nan Justine
David Rose David nan Rose
nan nan nan
Robert Robert nan nan
I am trying to trim nan from Full_Name column to just get whatever name it possibly contains as is. When I am trying to use Fillna(' ') , its not helping me to trim but its completely removing the column content. My final DF should look something like this:
First_Name Middle_Name Last_Name Full_Name (Header)
John R Rovin John R Rovin
Marano Justine Marano Justine
David Rose David Rose
Robert Robert
I am heavily dependent for most of the operations on Pandas. So is there any way using pandas I can solve this problem?
CodePudding user response:
You can use:
cols = ['First_Name', 'Middle_Name', 'Last_Name']
df['Full_Name'] = df[cols].apply(lambda x: ' '.join(i for i in x if pd.notna(i)), axis=1)
print(df)
# Output
First_Name Middle_Name Last_Name Full_Name
0 John R Rovin John R Rovin
1 Marano NaN Justine Marano Justine
2 David NaN Rose David Rose
3 NaN NaN NaN
4 Robert NaN NaN Robert
Setup a MRE
import pandas as pd
import numpy as np
data = {'First_Name': ['John', 'Marano', 'David', np.nan, 'Robert'],
'Middle_Name': ['R', np.nan, np.nan, np.nan, np.nan],
'Last_Name': ['Rovin', 'Justine', 'Rose', np.nan, np.nan]}
df = pd.DataFrame(data)
CodePudding user response:
if nan
always appear the same in Full_Name
column you can use this:
df['Full_Name (Header)'] = df['Full_Name (Header)'].str.replace('nan', ' ')
CodePudding user response:
Use Series.dropna
for remove misisng values before join
:
cols = ['First_Name', 'Middle_Name', 'Last_Name']
df['Full_Name'] = df[cols].apply(lambda x: ' '.join(x.dropna()), axis=1)
print(df)
First_Name Middle_Name Last_Name Full_Name
0 John R Rovin John R Rovin
1 Marano NaN Justine Marano Justine
2 David NaN Rose David Rose
3 NaN NaN NaN
4 Robert NaN NaN Robert