I would like to read a CSV file, but the separator sometimes appears inside the second column (JSON). Is it possible to escape the pipe when it appears inside quotes?
import csv  # needed for csv.QUOTE_NONE below
from io import StringIO

import pandas as pd

data = 'col1|{"a":"1","b":"2|3","c":"4"}'
df = pd.read_csv(
    StringIO(data),
    header=None,
    sep='|',
    quoting=csv.QUOTE_NONE,
    quotechar='"',
    doublequote=False
)
Current
0 | 1 | 2 |
---|---|---|
col1 | {"a":"1","b":"2 | 3","c":"4"} |
Expected
0 | 1 |
---|---|
col1 | {"a":"1","b":"2|3","c":"4"} |
CodePudding user response:
Try this:
data = """col1|'{"a":"1","b":"2|3","c":"4"}'"""
df = pd.read_csv(
    StringIO(data),
    header=None,
    sep='|',
    quotechar="'"
)
pandas treats a value as a single string when it is enclosed in the specified quotechar, so the JSON-like string needs to be wrapped in single quotes ('...').
I also triple-quoted the data string so that it can contain both single and double quotes without escaping.
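If the input cannot be edited by hand, the wrapping in single quotes could be done in code before parsing. The following is only a sketch: quote_last_field is a hypothetical helper, and it assumes that the JSON is always the last of two columns and never contains a single quote itself.

```python
from io import StringIO

import pandas as pd

# Original, unquoted input from the question.
raw = 'col1|{"a":"1","b":"2|3","c":"4"}'


def quote_last_field(line, sep="|", quote="'"):
    """Hypothetical helper: wrap everything after the first separator in quotes.

    Assumes exactly two columns and no quote character inside the second one.
    """
    first, rest = line.split(sep, 1)  # split on the first separator only
    return f"{first}{sep}{quote}{rest}{quote}"


# Quote every line, then parse with the matching quotechar.
quoted = "\n".join(quote_last_field(line) for line in raw.splitlines())
df_quoted = pd.read_csv(StringIO(quoted), header=None, sep="|", quotechar="'")

print(df_quoted.shape)       # number of rows and columns
print(df_quoted.iloc[0, 1])  # the JSON string, kept in one cell
```

If the JSON could contain single quotes, a different quotechar that never occurs in the data would have to be chosen instead.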
You can also save the same string to a CSV file and read it with read_csv, passing quotechar="'".
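A rough sketch of that file-based check is below. The file name sample.csv and the use of a temporary directory are assumptions made only for this example.

```python
import os
import tempfile

import pandas as pd

# Same sample row as above; the JSON field is wrapped in single quotes.
row = """col1|'{"a":"1","b":"2|3","c":"4"}'"""

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this illustration.
    path = os.path.join(tmp_dir, "sample.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(row + "\n")

    # quotechar="'" makes the parser read the single-quoted field as one value,
    # so the pipe inside the JSON is not treated as a separator.
    df_file = pd.read_csv(path, header=None, sep="|", quotechar="'")

print(df_file.shape)       # number of rows and columns
print(df_file.iloc[0, 1])  # the JSON string, kept in one cell
```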