python join followed by split does not return the original list

Time:08-27

Very simply put:

'_'.join(['ACT_X','IEC']).split('_')
# ['ACT', 'X', 'IEC']

I joined 2 strings, but applying the supposedly reverse operation gives me 3 strings.

I can see why this is happening, and I get that join and split are perhaps not meant to handle text that contains the separator itself.
However, other Python code handles this correctly: e.g. if I write out a CSV file and some text contains commas, that text is quoted, so the correct columns are read back in.

Can you think of a way to do this (join a list of strings by a separator) so that if any of the joined strings contains the separator, the joined result is fully reversible to the original strings?

In fact I was surprised by the above, because I thought the very point of join was to do something different and more sophisticated than a simple concatenation of strings (as is the case in other programming languages).
Now I am wondering what the difference actually is...
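
To make the question concrete, here is a small sketch showing that str.join only places the separator between the items and does no quoting or escaping. The variable names (parts, joined, manual) are just for illustration.

```python
parts = ["ACT_X", "IEC"]

# str.join inserts the separator between items and nothing else
joined = "_".join(parts)

# The same string built by hand with plain concatenation
manual = parts[0] + "_" + parts[1]

print(joined == manual)   # True
print(joined)             # ACT_X_IEC

# split cannot tell the original underscore from the separator
print(joined.split("_"))  # ['ACT', 'X', 'IEC']
```

So for this input join behaves exactly like concatenation with a separator, which is why split cannot undo it.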

CodePudding user response:

The easiest way out is to join them with a character that never appears in the strings. Depending on your data, suitable characters might include "\n", "\t" or "\x1f" (the ASCII unit separator).
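
This only works if the separator really never appears in the data, so it may be worth checking before joining. A minimal sketch, assuming a tab as the separator; join_checked is a hypothetical helper name, not something from the standard library.

```python
def join_checked(parts, sep="\t"):
    """Join parts with sep, refusing any part that already contains sep.

    Hypothetical helper for illustration; sep="\t" is an assumed default.
    """
    for part in parts:
        if sep in part:
            raise ValueError(f"separator {sep!r} found in {part!r}")
    return sep.join(parts)


joined = join_checked(["ACT_X", "IEC"])
print(joined.split("\t"))  # ['ACT_X', 'IEC']
```

With the check in place, a round trip through split is safe for any input the function accepts, and bad input fails loudly instead of silently producing extra items.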

CodePudding user response:

Another solution is to use the built-in csv module:

import csv
from io import StringIO

lst = ["ACT_X", "IEC"]

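# Write the list as a single CSV row into an in-memory buffer
s = StringIO()
writer = csv.writer(s, delimiter="_")
writer.writerow(lst)

# The row ends with the writer's default line terminator ("\r\n")
row = s.getvalue()
print(row)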

Prints:

"ACT_X"_IEC

To read it back to list:

reader = csv.reader(StringIO(row), delimiter="_")
print(next(reader))

Prints:

['ACT_X', 'IEC']
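
If this is needed in several places, the two steps could be wrapped into a pair of helpers. This is only a sketch: join_reversible and split_reversible are hypothetical names, and it assumes a single-character separator, as csv requires for its delimiter.

```python
import csv
from io import StringIO

TERMINATOR = "\r\n"  # set explicitly so it can be stripped again below


def join_reversible(parts, sep="_"):
    """Join parts into one string, quoting any part that contains sep."""
    buf = StringIO()
    writer = csv.writer(buf, delimiter=sep, lineterminator=TERMINATOR)
    writer.writerow(parts)
    # Drop the trailing line terminator that writerow always appends
    return buf.getvalue()[: -len(TERMINATOR)]


def split_reversible(text, sep="_"):
    """Inverse of join_reversible; returns [] for an empty string."""
    reader = csv.reader(StringIO(text, newline=""), delimiter=sep)
    return next(reader, [])


joined = join_reversible(["ACT_X", "IEC"])
print(joined)                    # "ACT_X"_IEC
print(split_reversible(joined))  # ['ACT_X', 'IEC']
```

Parts that contain the quote character itself are handled too, because the csv writer doubles embedded quotes and the reader undoes that.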