I want to write a script that reads from a csv file and splits each line by comma, except for any commas in between two specific characters.
In the code snippet below, I would like to split line by commas, except for the commas between two $ characters.
line = "$abc,def$,$ghi$,$jkl,mno$"
output = line.split(',')
for o in output:
    print(o)
How should I change output = line.split(',') so that I get the following terminal output?
~$ python script.py
$abc,def$
$ghi$
$jkl,mno$
CodePudding user response:
You can do this with a regular expression:
In re, (?<=\$) is a lookbehind: it matches only at a position immediately preceded by a $.
Similarly, (?=\$) is a lookahead: it matches only at a position immediately followed by a $.
Neither consumes any characters, so placing them around a comma matches only the commas that have a $ on both sides, which are the separators between fields. Commas inside a $...$ pair are left alone:
expression = r"(?<=\$),(?=\$)"
Full program:
import re
expression = r"(?<=\$),(?=\$)"
print(re.split(expression, "$abc,def$,$ghi$,$jkl,mno$"))
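To apply this to a whole file rather than a single string, the split can be wrapped in a small helper and run on each line. The sketch below is only an illustration: the helper name split_fields and the in-memory sample standing in for the csv file are assumptions, and the pattern used splits only on commas that sit directly between two $ characters.

```python
import io
import re

# Split only on commas directly between two "$" characters.
# (?<=\$) and (?=\$) are zero-width, so the "$" signs stay in the fields.
FIELD_SEPARATOR = re.compile(r"(?<=\$),(?=\$)")


def split_fields(line):
    """Hypothetical helper: split one line into its $-delimited fields."""
    return FIELD_SEPARATOR.split(line.rstrip("\n"))


# io.StringIO stands in for an opened csv file (assumption for illustration).
sample = io.StringIO("$abc,def$,$ghi$,$jkl,mno$\n$p$,$q,r$\n")
rows = [split_fields(line) for line in sample]

for row in rows:
    for field in row:
        print(field)
```

With a real file, the same list comprehension should work over the file object returned by open().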
CodePudding user response:
One solution (maybe not the most elegant, but it works) is to replace the string $,$ with something like $,,$ and then split on ,,. Something like this:
output = line.replace('$,$','$,,$').split(',,')
Note that this assumes ,, never appears inside a field. Using regex like mousetail suggested is the more elegant and robust solution, but it requires knowing regex (not that anyone KNOWS regex).
CodePudding user response:
Try regular expressions:
import re
line = "$abc,def$,$ghi$,$jkl,mno$"
output = re.findall(r"\$(.*?)\$", line)
for o in output:
    print('$' + o + '$')
$abc,def$
$ghi$
$jkl,mno$
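Because the parentheses in the pattern capture only the text between the dollar signs, the loop above has to add them back. A possible variant, sketched here under the assumption that every field is wrapped in $ and contains no $ itself, drops the capturing group so that re.findall returns each whole match, delimiters included.

```python
import re

line = "$abc,def$,$ghi$,$jkl,mno$"

# Without a capturing group, findall returns the full match,
# so each item already includes its surrounding "$" characters.
fields = re.findall(r"\$.*?\$", line)

for field in fields:
    print(field)
```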
CodePudding user response:
First, you can identify a character that is not used in that line:
c = chr(max(map(ord, line)) + 1)
Then, you can proceed as follows:
line.replace('$,$', f'${c}$').split(c)
Here is your example:
>>> line = '$abc,def$,$ghi$,$jkl,mno$'
>>> c = chr(max(map(ord, line)) + 1)
>>> result = line.replace('$,$', f'${c}$').split(c)
>>> print(*result, sep='\n')
$abc,def$
$ghi$
$jkl,mno$
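The two steps above can be combined into one function. This is a rough sketch, not a definitive implementation: the name split_on_unused_char is made up for illustration, and the guard for empty input is an addition, since max() raises ValueError on an empty sequence.

```python
def split_on_unused_char(line):
    """Hypothetical helper: split on "$,$" boundaries via a temporary separator."""
    if not line:
        # max() would raise ValueError on an empty string.
        return []
    # The character one code point above the largest one in the line
    # cannot occur in the line. chr() raises ValueError only in the
    # unlikely case that the line already contains U+10FFFF.
    c = chr(max(map(ord, line)) + 1)
    # Swap the comma between two "$" for the unused character, then split on it.
    return line.replace("$,$", f"${c}$").split(c)


result = split_on_unused_char("$abc,def$,$ghi$,$jkl,mno$")
print(*result, sep="\n")
```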