While trying to figure out how the bulk copy utility works so it can be applied to a large dataset, I discovered an issue - the first row of data does not get loaded into the database table. Here is the simple example I'm using to learn how bcp
works -
DDL:
use sandbox;
drop table dbo.people;
create table dbo.people
(
PersonID smallint NOT NULL,
FullName varchar(60) NOT NULL,
PerformanceRating varchar(1) NOT NULL
);
PeopleTableFormat.xml:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR=","" MAX_LENGTH="7"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=""," MAX_LENGTH="60" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="PersonID" xsi:type="SQLSMALLINT"/>
<COLUMN SOURCE="2" NAME="FullName" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="3" NAME="PerformanceRating" xsi:type="SQLVARYCHAR"/>
</ROW>
</BCPFORMAT>
LoadPeopleDataFromCsvFile.sh:
#!/bin/bash
bcp dbo.people in ./data.csv -S MyServerName -E -d sandbox -T -c -t "," -F 1 -e errors.csv -f ./PeopleTableFormat.xml
data.csv:
PersonID,FullName,PerformanceRating
107,"Brown, George",A
128,"White, Amanda",B
133,"Green, Greg",B
The data loading process is triggered by calling the shell script from Git Bash. The terminal output is this:
Here is a screenshot of the data that loaded:
Here is what errors.csv contains:
I'm not finding this error message helpful. Why isn't the first row of data being loaded?
CodePudding user response:
To correct the issue of the first data row not being imported, the column header row in data.csv had to be updated to match the terminators specified in the XML format file:
PersonID,"FullName",PerformanceRating
To correct 2 console errors the argument supplied for the -F
switch in LoadPeopleDataFromCsvFile.sh had to be changed to import data starting on row 2 of the CSV file and the -c
switch had to be removed:
bcp dbo.people in ./data.csv -S MyServerName -E -d sandbox -T -t "," -F 2 -e errors.csv -f ./PeopleTableFormat.xml