Manage pairs of string slices in Pandas, reversing the order

I have to manage some strange strings: my aim is to take each pair of slices and append the "mirrored" version of that pair to the string (note that this is not a reversal of the slice itself, e.g. IB stays IB, not BI).

I was hoping for something better than splitting into substrings, such as a regex or slicing. Would that be possible?


import pandas as pd

data_raw = ['AM - IB - XY - ZW','CD - TT - WS - QA - CZ - MN']

desired_raw = ['AM - IB - IB - AM - XY - ZW - ZW - XY',
               'CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ']

data_raw = pd.DataFrame(data_raw, columns = ['name'])

                          name
0            AM - IB - XY - ZW
1  CD - TT - WS - QA - CZ - MN


desired_raw = pd.DataFrame(desired_raw, columns = ['name'])

                                                        name
0                      AM - IB - IB - AM - XY - ZW - ZW - XY
1  CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ

CodePudding user response：

IIUC, you can use a regex:

data_raw['name'].str.replace(r'([^-\s]+) - ([^-\s]+)',
                             r'\1 - \2 - \2 - \1', regex=True)

variant:

data_raw['name'].str.replace(r'(([^-\s]+) - ([^-\s]+))',
                             r'\1 - \3 - \2', regex=True)

output:

0                        AM - IB - IB - AM - XY - ZW - ZW - XY
1    CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ
Name: name, dtype: object

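To see what the pattern does outside of pandas, here is a minimal sketch using only the standard `re` module on the sample strings from the question. The helper name `mirror_pairs_regex` is made up for illustration; `Series.str.replace(..., regex=True)` applies the same kind of substitution to each element.

```python
import re

# Two runs of characters that are neither "-" nor whitespace,
# separated by " - ". Each run is captured in its own group.
PAIR = re.compile(r"([^-\s]+) - ([^-\s]+)")


def mirror_pairs_regex(text: str) -> str:
    # Hypothetical helper: rewrite every matched pair "a - b"
    # as "a - b - b - a". Matches are found left to right and
    # do not overlap, so items are consumed two at a time.
    return PAIR.sub(r"\1 - \2 - \2 - \1", text)


samples = ["AM - IB - XY - ZW", "CD - TT - WS - QA - CZ - MN"]
for s in samples:
    print(mirror_pairs_regex(s))
```

Because the pattern needs two items to match, a trailing unpaired item (as in "AM - IB - XY") is left in place unchanged.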

CodePudding user response：

Another solution, not using regex:

data_raw["name"] = (
    data_raw["name"]
    .str.split(" - ")
    .apply(
        lambda x: " - ".join(
            f"{a} - {b} - {b} - {a}" for a, b in zip(x[::2], x[1::2])
        )
    )
)
print(data_raw)

Prints:

                                                        name
0                      AM - IB - IB - AM - XY - ZW - ZW - XY
1  CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ
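One thing to be aware of with the `zip(x[::2], x[1::2])` approach: `zip` stops at the shorter input, so a string with an odd number of items silently loses its last item. The question only shows even-length inputs, so this may never matter, but if it does, a sketch like the following keeps the unpaired item as-is. The function name `mirror_pairs` and the `sep` parameter are assumptions made for this example.

```python
def mirror_pairs(text: str, sep: str = " - ") -> str:
    # Hypothetical helper: split on the separator, walk the items
    # two at a time, and emit "a, b, b, a" for each full pair.
    parts = text.split(sep)
    out = []
    for i in range(0, len(parts), 2):
        chunk = parts[i:i + 2]
        if len(chunk) == 2:
            a, b = chunk
            out.extend([a, b, b, a])
        else:
            # Trailing item without a partner: keep it unchanged.
            out.extend(chunk)
    return sep.join(out)


print(mirror_pairs("AM - IB - XY - ZW"))
print(mirror_pairs("AM - IB - XY"))
```

It can be applied to the column with `data_raw["name"].apply(mirror_pairs)`.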