I have a dataframe with multiple columns, two of these columns have the same name ('mobilephone'), some values are empty, and some aren't but there will always be one of the two columns populated with a phone number:
mobilephone mobilephone
0 999000111 999000111
1 999000222
2 999000333
3 999000444 999000444
How would I keep only one of these columns but populate the empty values in the first column with the values in the second column?
CodePudding user response:
First, find the index of the second duplicate column. Then change the name of this column and use .loc[]
to fill in the gaps:
col_name = 'mobilephone'
index_second_column = np.where(df.columns.duplicated())[0][0]
df.columns.values[index_second_column] = '_1'
df.loc[df[col_name].isnull(), colname] = df[col_name '_1']
df.drop(columns = col_name '_1', inplace=True)
CodePudding user response:
Your Problem can easily solved using below code:
# renaming the columns
df.columns = ['mobilephone1', 'mobilephone2']
# filling blank cell with second column data
df['mobilephone1']=df['mobilephone1'].fillna(df['mobilephone2'])
# if you want you canrenam the columns again
df.columns = ['mobilephone', 'mobilephone']
Let me know if this helps you