Very simply put:
'_'.join(['ACT_X','IEC']).split('_')
# ['ACT', 'X', 'IEC']
I joined 2 strings, yet the supposedly reverse operation gives me back 3 strings.
I can see why this is happening, and I get that perhaps the point of join and split is not to handle text that contains the separator itself.
However, in other Python code this is handled correctly, e.g. if I write out a CSV file and some text contains commas, that text is quoted, so the correct columns are read back in.
Can you think of a way to do this (join a list of strings by a separator) so that if any of the joined strings contains the separator, the joined result is fully reversible to the original strings?
In fact I was surprised by the above, because I thought the very point of join was to do something different and more sophisticated than a simple + between strings (that is the case in other programming languages).
Now I am wondering what the difference actually is...
CodePudding user response:
Joining them with a character that never appears in the strings is the easiest way out. Depending on your data, suitable characters for this purpose might include "\n", "\t" or "\xf1".
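If you go this route, it may be worth checking the assumption instead of trusting it. Below is a minimal sketch: join_checked is a hypothetical helper name, and the default separator "\x1f" (the ASCII unit separator) is my own pick for illustration, so swap in whatever character suits your data.

```python
def join_checked(parts, sep="\x1f"):
    """Join parts with sep, refusing input that already contains sep.

    join_checked and the "\x1f" default are illustrative assumptions,
    not part of any library.
    """
    for part in parts:
        if sep in part:
            raise ValueError(f"separator {sep!r} occurs in {part!r}")
    return sep.join(parts)


parts = ["ACT_X", "IEC"]
joined = join_checked(parts)
# Safe to split again, because no part contains the separator.
assert joined.split("\x1f") == parts
```

With sep="_" the same call raises ValueError for "ACT_X" rather than silently producing a string that cannot be split back. Note that an empty list does not round-trip this way, since "".split(sep) returns [''].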
CodePudding user response:
Another solution is to use the built-in csv module:
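import csv
from io import StringIO

lst = ["ACT_X", "IEC"]

# Write the list as a single CSV row, using "_" as the delimiter.
# Fields that contain the delimiter are quoted automatically.
s = StringIO()
writer = csv.writer(s, delimiter="_")
writer.writerow(lst)

# Read the written row back out of the buffer.
s.seek(0)
row = s.read()
print(row)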
Prints:
"ACT_X"_IEC
To read it back to list:
reader = csv.reader(StringIO(row), delimiter="_")
print(next(reader))
Prints:
['ACT_X', 'IEC']
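The two steps above could be wrapped into a pair of helpers, roughly like this. This is only a sketch: join_reversible and split_reversible are hypothetical names, and it assumes a non-empty list of strings (an empty list is not handled).

```python
import csv
from io import StringIO


def join_reversible(parts, sep="_"):
    """Join parts into one string, quoting any part that contains sep."""
    buf = StringIO()
    csv.writer(buf, delimiter=sep).writerow(parts)
    line = buf.getvalue()
    # writerow appends the "\r\n" line terminator; drop it from the result.
    return line[:-2] if line.endswith("\r\n") else line


def split_reversible(text, sep="_"):
    """Undo join_reversible, honouring the quoting the csv writer added."""
    return next(csv.reader(StringIO(text), delimiter=sep))


joined = join_reversible(["ACT_X", "IEC"])
assert split_reversible(joined) == ["ACT_X", "IEC"]
```

The trade-off is that the joined string is no longer a plain join: parts containing the separator come back wrapped in double quotes, so anything that reads the string has to go through split_reversible (or another csv reader using the same delimiter) and not str.split.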