Null value in CSV with Bash

Time:06-28

I am trying to write a Bash script that checks a CSV file and returns the IDs of rows that fail certain criteria. A sample CSV is shown below. I am thinking of using the [ -z {$CATEGORY} ] method to identify a null (empty) cell in the CATEGORY column. However, it seems that my if statement is not catching the null value in the CSV, so I need help.

ID,DATE,PRODUCT CODE,CATERGORY
1,01/01/2000,10009,1
2,02/01/2000,9999,2
3,25/01/2000,1009,3
4,15/09/2000,2001,5
5,09/25/2000,2003,4
6,09/10/01,2091,P
7,20/02/2002,3098,6
8,01/03/2003,4097,3
9,03/04/2004,5000,2
10,05/02/2013,4000,1
11,10/01/2015,9,

This is my Bash script. The null value is in the row with ID = 11.

#!/bin/bash
FILE=${1}
IFS=$'\n'
((c=-1))
for row in $(cat $FILE)
do
        ((c++))
        if ((c==0))
                then
                        continue
        fi
        IFS=','
        read ID DATE PRODUCT CATEGORY <<<${row}

                if [ -z {$CATEGORY} ];
                then
                     echo "$ID" >> file.txt
                fi
done

CodePudding user response:

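-z {$CATEGORY} should be -z "${CATEGORY}": the braces belong after the $, and the expansion should be quoted. In addition, with IFS=',' already in effect, the unquoted here-string in read ID ... <<<${row} may be word-split before read sees it (older Bash versions do this), so the whole row ends up in ID and the other variables stay empty. Try: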

#!/bin/bash

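# Split on commas only for the duration of each read.
while IFS=, read -r ID DATE PRODUCT CATEGORY; do
  # True when CATEGORY is empty or whitespace-only (e.g. a stray \r).
  if [[ "$CATEGORY" =~ ^[[:space:]]*$ ]]; then
    echo "$ID"
  fi
done < <( tail -n +2 "$1" ) > file.txt  # tail -n +2 skips the header row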
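To see why the original [ -z {$CATEGORY} ] test never fires, it may help to compare it with the quoted form on an empty value. This is only a small illustration; the variable names broken and fixed are made up for the demo:

```shell
#!/bin/bash
CATEGORY=""

# Braces outside the expansion: with CATEGORY empty, the test receives the
# literal two-character string "{}", which is not empty.
if [ -z {$CATEGORY} ]; then broken="empty"; else broken="not empty"; fi

# Quoted expansion: the test receives a genuinely empty string.
if [ -z "${CATEGORY}" ]; then fixed="empty"; else fixed="not empty"; fi

echo "broken test: $broken"   # prints "broken test: not empty"
echo "fixed test:  $fixed"    # prints "fixed test:  empty"
```

So even once the fields are split correctly, the brace placement alone would hide the empty cell.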

Note that awk or sed would be much faster and simpler for this (see, for instance, https://mywiki.wooledge.org/DontReadLinesWithFor). Example with awk (tested with recent BSD and GNU awk):

awk -F, 'NR>1 && $NF ~ /^[[:space:]]*$/ {print $1}' "$FILE" > file.txt
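As a quick sanity check, the awk filter can be run against the sample data from the question. A minimal sketch, assuming the sample is written to a temporary file first (the mktemp file and the result variable exist only for this illustration):

```shell
#!/bin/bash
# Recreate the sample CSV from the question in a temporary file.
FILE=$(mktemp)
cat > "$FILE" <<'EOF'
ID,DATE,PRODUCT CODE,CATERGORY
1,01/01/2000,10009,1
2,02/01/2000,9999,2
3,25/01/2000,1009,3
4,15/09/2000,2001,5
5,09/25/2000,2003,4
6,09/10/01,2091,P
7,20/02/2002,3098,6
8,01/03/2003,4097,3
9,03/04/2004,5000,2
10,05/02/2013,4000,1
11,10/01/2015,9,
EOF

# Same filter as above: skip the header (NR>1) and print the ID when the
# last field is empty or whitespace-only. The output is captured here
# instead of being redirected to file.txt.
result=$(awk -F, 'NR>1 && $NF ~ /^[[:space:]]*$/ {print $1}' "$FILE")
echo "$result"   # prints 11

rm -f "$FILE"
```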

Example with sed (tested with recent BSD and GNU sed):

sed -En 's/^([^,]*).*,[[:space:]]*$/\1/p' "$FILE" > file.txt
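Both one-liners test the last field of each row. If the column to check is not the last one, the awk version could be wrapped in a small function that takes the column number as an argument. This is a sketch only: empty_ids is a hypothetical name, and it assumes a simple CSV with no quoted fields or embedded commas.

```shell
#!/bin/bash
# Hypothetical helper: print the first field (the ID) of every data row
# whose column number $2 is empty or whitespace-only in CSV file $1.
# A row with fewer than $2 fields is also reported, because the missing
# field is treated as empty.
empty_ids() {
  awk -F, -v col="$2" 'NR > 1 && $col ~ /^[[:space:]]*$/ { print $1 }' "$1"
}
```

With the sample file from the question, empty_ids "$FILE" 4 > file.txt would check the fourth (category) column.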