I have a file containing lines in the form of:
Product, Brand, Cost, Comment, Location, Rating
For example, the file is like:
Computer, "Apple, Lenovo", 500, "Can be fixed", Berlin, 5
Computer, "Apple, Lenovo", 670, "Can be fixed, need new screen", Berlin, 3
Phone, "Samsung, Unknown", 670,"Touch is dead, Screen is dead", Berlin, 1
TV, "Samsung, Unknown", 670,"Antenna dead, light issues", Frankfurt, 2
I want to change the separating "," to a new character "|" For example, I want this:
Computer| "Apple, Lenovo"| 500| "Can be fixed"| Berlin| 5
Computer| "Apple, Lenovo"| 670| "Can be fixed, need new screen"| Berlin| 3
Phone| "Samsung, Unknown"| 670| "Touch is dead, Screen is dead"| Berlin| 1
TV| "Samsung, Unknown"| 670| "Antenna dead, light issues"| Frankfurt| 2
I've tried using sed command such as sed 's/,/|/g' prod.csv > output.csv
But it also changes the commas between quotes ! I'd like to change every commas who are NOT between quotes and only them.
I'd like to insist on using general tools like sed or tr, grep, cut etc and not csv tools like csvkit because I will not work with csv files later on.
Is it possible to change my command to achieve this?
I'm looking for something like prod.csv | (filter out the commas between quotes) | tr "," "|"
See what I mean?
I am very bad at this, thank you very much in advance for your help
Best,
CodePudding user response:
If you MUST use sed
, then this fragile solution which works for your example data but may not work with real life data could possibly help.?.
$ sed 's/,/|/;s/,/|/;s/,/|/;s/\(.*"\).\([^,]*\).\(.*\)/\1|\2|\3/' input_file
Computer| "Apple"| 500| "Can be fixed"| Berlin| 5
Computer| "Apple"| 670| "Can be fixed, need new screen"| Berlin| 3
Phone| "Samsung"| 670|"Touch is dead, Screen is dead"| Berlin| 1
TV| "Samsung"| 670|"Antenna dead, light issues"| Frankfurt| 2
CodePudding user response:
This might work for you (GNU sed):
sed -E ':a;s/^([^"]*("[^",]*")*[^"]*"[^,"]*),/\1\n/;ta;y/,\n/|,/' file
Convert ,
's within double quotes to newlines, then translate ,
's to |
's and \n
's to ,
's.