Python insert space between numbers and characters in a column

I have a dataframe with multiple columns, and in one column I want to separate the letters from the numbers with a space. In this example I want to add the space in the third column. Do you know how to do this?

import pandas as pd

data = {'first_column':  ['first_value', 'second_value', 'third_value'],
        'second_column': ['first_value', 'second_value', 'third_value'],
        'third_column':  ['AA6589', 'GG6589', 'BXV6589'],
        'fourth_column': ['first_value', 'second_value', 'third_value'],
        }

df = pd.DataFrame(data)

print(df)

CodePudding user response：

Use str.replace with a short regex:

df['third_column'] = df['third_column'].str.replace(r'(\D+)(\d+)',
                                                    r'\1 \2', regex=True)

regex:

(\D+)   # capture one or more non-digits
(\d+)   # capture one or more digits

replace with \1 \2 (first captured group, then space, then second captured group).
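
To see the capture groups at work outside pandas, here is a minimal sketch using Python's standard `re` module on the sample values from the question. It assumes that `str.replace(..., regex=True)` follows `re.sub` semantics for these patterns; the variable names are illustrative only.

```python
import re

# Sample values copied from the question's third column
values = ['AA6589', 'GG6589', 'BXV6589']

# (\D+) captures the run of non-digits, (\d+) the run of digits right after it
pattern = re.compile(r'(\D+)(\d+)')

# \1 and \2 refer back to the two captured groups; a space is placed between them
spaced = [pattern.sub(r'\1 \2', v) for v in values]
print(spaced)  # ['AA 6589', 'GG 6589', 'BXV 6589']
```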

Alternative with lookarounds:

df['third_column'] = df['third_column'].str.replace(r'(?<=\D)(?=\d)',
                                                    ' ', regex=True)

meaning: insert a space at any position in-between a non-digit and a digit.
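
Note that this lookaround only fires where a non-digit is followed by a digit. If a column might also hold values where the digits come first (an assumption, the question's data has none), a two-sided variant could be sketched like this with the standard `re` module; the sample strings other than 'BXV6589' are hypothetical.

```python
import re

# Letters -> digits only, as in the answer above
one_way = re.compile(r'(?<=\D)(?=\d)')
# Hypothetical extension: also match the digits -> letters boundary
both_ways = re.compile(r'(?<=\D)(?=\d)|(?<=\d)(?=\D)')

samples = ['BXV6589', '6589BXV', 'AB12CD']

one_way_result = [one_way.sub(' ', s) for s in samples]
both_ways_result = [both_ways.sub(' ', s) for s in samples]

print(one_way_result)    # ['BXV 6589', '6589BXV', 'AB 12CD']
print(both_ways_result)  # ['BXV 6589', '6589 BXV', 'AB 12 CD']
```

The same pattern string should be usable in `df['third_column'].str.replace(..., ' ', regex=True)` if both directions are needed.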

CodePudding user response：

Similarly, you could extract the non-digit and the digit characters from 'third_column' and join them with a space in between:

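# expand=False makes str.extract return a Series for a single capture group
df.assign(
    third_column=df["third_column"].str.extract(r'(\D+)', expand=False)
                 + " "
                 + df["third_column"].str.extract(r'(\d+)', expand=False)
    )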

   first_column second_column third_column fourth_column
0   first_value   first_value      AA 6589   first_value
1  second_value  second_value      GG 6589  second_value
2   third_value   third_value     BXV 6589   third_value
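
One difference worth keeping in mind: `str.extract` yields NaN for rows where the pattern does not match, so the extract-and-join approach turns such rows into NaN, while `str.replace` leaves them unchanged. A small sketch of that behaviour follows; the value 'NODIGITS' is a made-up example that does not appear in the question.

```python
import pandas as pd

# 'NODIGITS' is a hypothetical value with no digits in it
s = pd.Series(['AA6589', 'GG6589', 'NODIGITS'])

# Extract-and-join: the digit extraction finds nothing in the last row
letters = s.str.extract(r'(\D+)', expand=False)
digits = s.str.extract(r'(\d+)', expand=False)
combined = letters + " " + digits

# Replace: rows without a letters-then-digits match are left as they are
replaced = s.str.replace(r'(\D+)(\d+)', r'\1 \2', regex=True)

print(combined.isna().tolist())  # [False, False, True]
print(replaced.tolist())         # ['AA 6589', 'GG 6589', 'NODIGITS']
```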