I'm trying to concatunate 4 string variables in a pandas dataframe. The dataframe is something like this:
Morada 1 Morada 2 Localidade Postal Code
RUA DOS QUATRO CAMINHOS VEREDA 2 N 14 4DTO SUL CANIDELO 4400-501
RUA DOS QUATRO CAMINHOS VEREDA 2 N 14 4DTO SUL CANIDELO 4400-501
Rua chieiras n543 2.9B Vila Nova de Gaia 4400-415
RUA BOMBEIROS VOLUNTÁRIOS DECOIM BRÕES 202 CASA3 Vila Nova de Gaia 4400-057
R DA BELGICA 2387 Vila Nova de Gaia 4400-053
I'm trying to get to something like this, for instance, for the first line of varibles
RUA DOS QUATRO CAMINHOS VEREDA 2 N 14 4DTO SUL CANIDELO 4400-501
On the other hand, for line 3, where the variable 'Morada 2' is missing, the goal is to get this:
Rua chieiras n543 2.9B Vila Nova de Gaia 4400-415
Within the data frame, each one of this 4 variables could be missing, e.g., could be a NA value. My goal is to concatenate the strings by ignoring the ones that are NA.
I was wondering if there is a way of doing this that doesn't involve going through all the possible if conditions of missing values.
Thanks!
CodePudding user response:
You could combine the DataFrame
using a lambda like my example below:
df['combined'] = df.apply(lambda row: ' '.join(row.dropna().astype(str)), axis=1)
This will create a new column for you with combined strings joined, ignoring missing values.