I want to replace all occurrences of " that appear between ," and ", with ''' (three single quotes). This will be done on a CSV file, and it must handle every nested quote so the formatting is not broken.
E.g.
"test","""","test"
becomes
"test","''''''","test"
Another example:
"test","quotes "inside" quotes","test"
becomes
"test","quotes '''inside''' quotes","test"
I use https://sed.js.org/ to test the replacement.
What I currently have is
sed "s/\([^,]\)\(\"\)\(.\)/\\1'\\''\\3/g"
but it seems incomplete and doesn't cover all the cases I want.
e.g.
works:
"anything","inside "quotes"","anything"
->
"anything","inside '''quotes'''","anything"
doesn't work for:
"anything","inside "test" quotes","anything"
->
"anything''',"inside '''test''' quotes''',"anything"
expected ->
"anything","inside '''test''' quotes","anything"
Maybe somebody who is good with regular expressions could help?
CodePudding user response:
Using sed
$ cat input_file
"test","""","test"
"test","quotes "inside" quotes","test"
"anything","inside "quotes"","anything"
"anything","inside "test" quotes","anything"
$ sed -E ':a;s/(,"[^,]*('"'"' )?)"([^,]*"(,|$))/\1'"'''"'\3/;ta' input_file
"test","''''''","test"
"test","quotes '''inside''' quotes","test"
"anything","inside '''quotes'''","anything"
"anything","inside '''test''' quotes","anything"
CodePudding user response:
Escaping the triple single quotes can be avoided with a variable, ${qs}.
Start by replacing all double quotes with ${qs}.
Next, revert the replacements at the start of the line, at the end of the line, and around each , separator.
qs="'''"
sed "s/\"/${qs}/g; s/^${qs}/\"/; s/${qs}$/\"/; s/${qs},${qs}/\",\"/g" csvfile
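To see what each stage does, the steps described above can be run one at a time on a single sample line. This is only a sketch: the variable names line, step1, step2 and step3 are made up for illustration, and the $ anchor is written as \$ here simply to make it explicit inside double quotes.

```shell
#!/bin/sh
# Sketch: the sed command above, split into its stages.
# line, step1, step2 and step3 are illustrative names only.
qs="'''"
line='"anything","inside "test" quotes","anything"'

# Stage 1: replace every double quote with three single quotes.
step1=$(printf '%s\n' "$line" | sed "s/\"/${qs}/g")

# Stage 2: restore the double quote at the start and at the end of the line.
step2=$(printf '%s\n' "$step1" | sed "s/^${qs}/\"/; s/${qs}\$/\"/")

# Stage 3: restore the double quotes on both sides of each field separator.
step3=$(printf '%s\n' "$step2" | sed "s/${qs},${qs}/\",\"/g")

printf '%s\n' "$step1" "$step2" "$step3"
```

The last line printed should be "anything","inside '''test''' quotes","anything". One caveat with this approach: a field whose own text contains "," would have that inner sequence restored as well, so it assumes field contents never include a quoted comma.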