I have to manage some unusual strings: for each pair of items, I want to add the "mirrored" version of that pair right after it (please note that the items themselves are not reversed, e.g. IB stays IB, not BI).
Is there something better than splitting into substrings, such as a regex or slicing?
data_raw = ['AM - IB - XY - ZW','CD - TT - WS - QA - CZ - MN']
desired_raw = ['AM - IB - IB - AM - XY - ZW - ZW - XY',
'CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ']
data_raw = pd.DataFrame(data_raw, columns = ['name'])
name
0 AM - IB - XY - ZW
1 CD - TT - WS - QA - CZ - MN
desired_raw = pd.DataFrame(desired_raw, columns = ['name'])
name
0 AM - IB - IB - AM - XY - ZW - ZW - XY
1 CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ
CodePudding user response:
IIUC, you can use a regex:
data_raw['name'].str.replace(r'([^-\s]+) - ([^-\s]+)',
                             r'\1 - \2 - \2 - \1', regex=True)
variant:
data_raw['name'].str.replace(r'(([^-\s]+) - ([^-\s]+))',
                             r'\1 - \3 - \2', regex=True)
output:
0 AM - IB - IB - AM - XY - ZW - ZW - XY
1 CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ
Name: name, dtype: object
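To check the pattern outside pandas, the same substitution can be run with the standard library's re.sub. This is only a minimal sketch: mirror_pairs and PAIR are hypothetical names used for illustration, and it assumes items never contain a hyphen or whitespace and are always separated by " - ".

```python
import re

# Two items (no hyphen, no whitespace) separated by " - ".
PAIR = re.compile(r'([^-\s]+) - ([^-\s]+)')

def mirror_pairs(text):
    # Hypothetical helper: append the swapped pair right after each pair.
    return PAIR.sub(r'\1 - \2 - \2 - \1', text)

rows = ['AM - IB - XY - ZW', 'CD - TT - WS - QA - CZ - MN']
result = [mirror_pairs(row) for row in rows]
for line in result:
    print(line)
```

Because matches do not overlap, each item is consumed by exactly one pair, and a trailing item without a partner is left unchanged.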
CodePudding user response:
Another solution, not using regex:
data_raw["name"] = (
    data_raw["name"]
    .str.split(" - ")
    .apply(
        lambda x: " - ".join(
            f"{a} - {b} - {b} - {a}" for a, b in zip(x[::2], x[1::2])
        )
    )
)
print(data_raw)
Prints:
name
0 AM - IB - IB - AM - XY - ZW - ZW - XY
1 CD - TT - TT - CD - WS - QA - QA - WS - CZ - MN - MN - CZ
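One thing to be aware of with the zip approach: if a row has an odd number of items, zip silently drops the last one. If that could happen in your data, a plain-Python variant that keeps the unpaired item might look like the sketch below. mirror_pairs_split is a hypothetical helper name, and it assumes the separator is always exactly " - ".

```python
def mirror_pairs_split(text, sep=" - "):
    # Hypothetical helper: split, mirror each pair, keep a trailing odd item.
    parts = text.split(sep)
    out = []
    # Walk over complete pairs only: (0, 1), (2, 3), ...
    for i in range(0, len(parts) - 1, 2):
        a, b = parts[i], parts[i + 1]
        out += [a, b, b, a]
    # An odd number of items leaves one without a partner; keep it as-is.
    if len(parts) % 2:
        out.append(parts[-1])
    return sep.join(out)

print(mirror_pairs_split("AM - IB - XY - ZW"))
print(mirror_pairs_split("AM - IB - XY"))
```

It can be applied to the column with data_raw["name"].apply(mirror_pairs_split).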