I am trying to load data into a table from a CSV file that I have dumped data into. The problem is that the CSV file is delimited using | and some of the data has the | character included. Thus, I get an error when loading because some the extra | character is read as an extra column.
What I would like to do is create a SQL query that will recognize which rows of data have this | character inside them and then put a quote around it. So instead of a row containing A|B (which is read as data in two columns), we have "A|B".
I was originally going to just create a query where all columns' data have quotes around them, but the issue is I have millions of rows and only about 50 of them have this issue of where the delimiting character is part of the data in that column. Rather than adding quotes around all the data, I would like to be able to add quotes only around the problematic data.
Is there an efficient/easy way to accomplish this? Let me know if more details are needed.
CodePudding user response:
I see two options.
- change delimiter; it is usually a bad idea to use character that can be found within data
- enclose values into double quotes (as you already said). You can do that for all columns, or only for those that cause problems
If you use SQL Loader to load data into the database (if you don't, I'd suggest you to), one of its options is optionally enclosed by
and you'd use double quotes with it. Another option which contains that option is external table (which uses SQL Loader in the background anyway).