I have a delimited file that I want to load into a database on Linux. The problem is that the number of delimited fields is not the same in every row. So I need a shell script that iterates over each line and counts the occurrences of the delimiter character. I need 13 occurrences of the delimiter per line, so if a line has 10, for example, I need to add 3 extra delimiter characters at the end of that line.
All I have so far is this:
#!/usr/bin/bash
while read p; do
if
-----------
fi
done <myDataFile
CodePudding user response:
The information provided is rather sparse.
What's your delimiter?
Can the delimiter (quoted) occur in any of the fields?
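For a simple case, e.g. the delimiter is "|" and it doesn't occur inside the fields, here's a quick awk hack. It pads each line to 13 fields, i.e. 12 delimiters; use 14 instead of 13 if you really need 13 delimiters per line.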
$ cat myDataFile
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|g|h|i|m
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|i|j|k|l|m
a|b|c|d|e|f|g|h|i|j|k|l|m
And the awk:
$ awk -F'|' '{missing=13-NF;if(missing==0){print $0}else{printf "%s",$0;for(i=1;i<=missing-1;i++){printf "|"};print "|"}}' myDataFile
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|g|h|i|m|||
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|d|e|f|g|h|i|j|k|l|m|
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|g|h|i|j|k|l|m
a|b|c|d|e|f|i|j|k|l|m||
a|b|c|d|e|f|g|h|i|j|k|l|m
And the awk made pretty and explained:
{
    missing = 13 - NF                        # store the number of missing fields
    if (missing == 0) {                      # if all fields are present
        print $0                             # just print the line
    } else {                                 # otherwise
        printf "%s", $0                      # first print the line
        for (i = 1; i <= missing - 1; i++) { # then pad the line with delimiters (w/o a newline)
            printf "|"
        }
        print "|"                            # followed by a last one WITH a newline
    }
}
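Since the question asked for a shell loop, here is a rough pure-bash sketch of the same padding idea, built on the `while read` skeleton from the question. The function name `pad_delims` and its two arguments (delimiter, target delimiter count) are made up for illustration. It assumes a single-character delimiter that never appears inside a field, and it leaves lines that already have enough delimiters untouched. The default of 12 matches the awk output above (13 fields); pass 13 if you need 13 delimiters.

```shell
#!/usr/bin/env bash

# pad_delims DELIM WANT  (hypothetical helper, not from the answer above)
# Reads lines on stdin and appends DELIM until each line contains
# at least WANT delimiters. Assumes DELIM is a single character.
pad_delims() {
    local delim=${1:-|} want=${2:-12}
    local line stripped count
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        stripped=${line//"$delim"/}            # the line with all delimiters removed
        count=$(( ${#line} - ${#stripped} ))   # number of delimiters in the line
        while [ "$count" -lt "$want" ]; do     # pad until the target is reached
            line+=$delim
            count=$(( count + 1 ))
        done
        printf '%s\n' "$line"
    done
}
```

Usage would look like `pad_delims '|' 12 < myDataFile > myDataFile.padded`. For large files the awk version is likely to be much faster, because a bash read loop processes the input line by line in the shell itself.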