Home > Mobile >  Concatenate strings only if they aren't NA values
Concatenate strings only if they aren't NA values

Time:04-06

I'm trying to concatunate 4 string variables in a pandas dataframe. The dataframe is something like this:

Morada 1                            Morada 2        Localidade               Postal Code
RUA DOS QUATRO CAMINHOS VEREDA 2    N 14 4DTO SUL   CANIDELO                 4400-501
RUA DOS QUATRO CAMINHOS VEREDA 2    N 14 4DTO SUL   CANIDELO                 4400-501
Rua chieiras n543 2.9B                              Vila Nova de Gaia        4400-415
RUA BOMBEIROS VOLUNTÁRIOS DECOIM    BRÕES 202 CASA3 Vila Nova de Gaia        4400-057
R DA BELGICA 2387                                   Vila Nova de Gaia        4400-053

I'm trying to get to something like this, for instance, for the first line of varibles

RUA DOS QUATRO CAMINHOS VEREDA 2 N 14 4DTO SUL CANIDELO 4400-501

On the other hand, for line 3, where the variable 'Morada 2' is missing, the goal is to get this:

Rua chieiras n543 2.9B Vila Nova de Gaia 4400-415

Within the data frame, each one of this 4 variables could be missing, e.g., could be a NA value. My goal is to concatenate the strings by ignoring the ones that are NA.

I was wondering if there is a way of doing this that doesn't involve going through all the possible if conditions of missing values.

Thanks!

CodePudding user response:

You could combine the DataFrame using a lambda like my example below:

df['combined'] = df.apply(lambda row: ' '.join(row.dropna().astype(str)), axis=1)

This will create a new column for you with combined strings joined, ignoring missing values.

  • Related