Split string if separator is not in-between two characters

Time:07-22

I want to write a script that reads from a CSV file and splits each line on commas, except for commas that sit between two specific characters.

In the code snippet below, I would like to split line on commas, except for the commas between two $s.

line = "$abc,def$,$ghi$,$jkl,mno$"

output = line.split(',')

for o in output:
   print(o)

How do I write output = line.split(',') so that I get the following terminal output?

~$ python script.py
$abc,def$
$ghi$
$jkl,mno$

CodePudding user response:

You can do this with a regular expression, using lookarounds:

In re, (?<=\$) is a lookbehind: it matches at a position that is immediately preceded by a $.

Similarly, (?=\$) is a lookahead: it matches at a position that is immediately followed by a $.

Neither consumes any characters, so putting a comma between the two matches only a comma with a $ on both sides. That is exactly the separator you want to split on, and the commas inside a $...$ pair are left alone:

expression = r"(?<=\$),(?=\$)"

Full program:

import re
expression = r"(?<=\$),(?=\$)"
print(re.split(expression, "$abc,def$,$ghi$,$jkl,mno$"))
# ['$abc,def$', '$ghi$', '$jkl,mno$']
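For the CSV use case in the question, the same idea can be applied line by line. The following is a minimal sketch, not a definitive implementation: it assumes every field is wrapped in $ and that a separator is always a comma with a $ directly on both sides. The helper name split_fields and the in-memory sample standing in for a file are illustrative assumptions.

```python
import io
import re

# Split only on a comma that has a "$" immediately before and after it.
SEPARATOR = re.compile(r"(?<=\$),(?=\$)")

def split_fields(line):
    """Split one line on commas that sit directly between two '$' characters."""
    return SEPARATOR.split(line.rstrip("\n"))

# io.StringIO stands in for an opened file here (an assumption for the demo).
sample = io.StringIO("$abc,def$,$ghi$,$jkl,mno$\n$a$,$b,c$\n")
rows = [split_fields(line) for line in sample]

for row in rows:
    print(row)
```

Compiling the pattern once keeps the per-line work to a single split call.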

CodePudding user response:

One solution (maybe not the most elegant, but it works as long as no field already contains ,,) is to replace the string $,$ with something like $,,$ and then split on ,,. Something like this:

output = line.replace('$,$','$,,$').split(',,')

Using a regex, as mousetail suggested, is the more elegant and robust solution, but it requires knowing regex (not that anyone KNOWS regex).
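If regex feels opaque, a plain character scan is another option. Note that this is a different technique from the replace-and-split above, sketched under the assumption that $ characters always come in opening and closing pairs; the function name split_outside_dollars is hypothetical.

```python
def split_outside_dollars(line, sep=",", quote="$"):
    """Split on `sep` only when the scan is not between a pair of `quote` characters."""
    fields = []
    current = []
    inside = False  # True while between an opening and a closing quote
    for ch in line:
        if ch == quote:
            inside = not inside
            current.append(ch)
        elif ch == sep and not inside:
            # Separator outside a $...$ pair: finish the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields

result = split_outside_dollars("$abc,def$,$ghi$,$jkl,mno$")
print(*result, sep="\n")
```

Unlike the ,, placeholder, this leaves a field such as $a,,b$ intact, because commas are only treated as separators outside a $...$ pair.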

CodePudding user response:

Try regular expressions:

import re

line = "$abc,def$,$ghi$,$jkl,mno$"

output = re.findall(r"\$(.*?)\$", line)

for o in output:
    print('$' + o + '$')

Output:

$abc,def$
$ghi$
$jkl,mno$
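A small variation, offered as a sketch: without a capture group, re.findall returns each whole match, so the surrounding $ characters come back with the field and do not need to be re-added when printing. This assumes, as in the example, that no field itself contains a $.

```python
import re

line = "$abc,def$,$ghi$,$jkl,mno$"

# No capture group, so each match includes its surrounding "$" characters.
fields = re.findall(r"\$.*?\$", line)

for field in fields:
    print(field)
```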

CodePudding user response:

First, you can identify a character that is not used in that line:

c = chr(max(map(ord, line)) + 1)

Then, you can proceed as follows:

line.replace('$,$', f'${c}$').split(c)

Here is your example:

>>> line = '$abc,def$,$ghi$,$jkl,mno$'
>>> c = chr(max(map(ord, line)) + 1)
>>> result = line.replace('$,$', f'${c}$').split(c)
>>> print(*result, sep='\n')
$abc,def$
$ghi$
$jkl,mno$
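Since the question mentions reading from a CSV file, the standard library's csv module may also be worth considering. This is a different technique from the replace-and-split above, sketched under the assumption that $ plays the role of a quote character around fields: csv.reader accepts a quotechar argument and then does not split on commas inside $...$. It also strips the $ characters from each field, so the sketch re-adds them for display. The in-memory sample stands in for a real file.

```python
import csv
import io

# In-memory stand-in for an opened CSV file (an assumption for the demo).
sample = io.StringIO("$abc,def$,$ghi$,$jkl,mno$\n")

# Treat "$" as the quote character, so commas inside $...$ are not separators.
rows = list(csv.reader(sample, quotechar="$"))

for row in rows:
    for field in row:
        # csv.reader strips the quote characters; re-add them for display.
        print(f"${field}$")
```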