Issues with iconv command in script

Time:11-22

I am trying to create a script that detects whether files in a directory contain non-UTF-8 characters and, if they do, grabs the file type of each such file and runs iconv on it.

The code is as follows:

find  <directory> |sed '1d'><directory>/filelist.txt

while read filename
do
file_nm=${filename%%.*}
ext=${filename#*.}
echo $filename
q=`grep -axv '.*' $filename|wc -l`
echo $q
r=`file -i $filename|cut -d '=' -f 2`
echo $r
#file_repair=$file_nm
if [ $q -gt 0 ]; then
iconv -f $r -t utf-8 -c ${file_nm}.${ext} >${file_nm}_repaired.${ext}

mv ${file_nm}_repaired.${ext} ${file_nm}.${ext}

fi
done< <directory>/filelist.txt

When I run the code, several files turn into 0-byte files, and .bak gets appended to the file name.

ls| grep 'bak' | wc -l

36

Where am I making a mistake?

Thanks for the help.

CodePudding user response:

It's really not clear what some parts of your script are supposed to do.

The error is probably the assumption that the output of file -i always contains an =; often it does not.

find  <directory> |
# avoid temporary file
sed '1d' |
# use IFS='' read -r
while IFS='' read -r filename
do
    # indent loop body
    file_nm=${filename%%.*}
    ext=${filename#*.}
    # quote variables, print diagnostics to stderr
    echo "$filename" >&2
    # use grep -q instead of useless wc -l; don't enter condition needlessly; quote variable
    if grep -qaxv '.*' "$filename"; then
        # indent condition body
        # use modern command substitution syntax, quote variable
        # check if result contains =
        r=$(file -i "$filename")
        case $r in
          *=*)
            # only perform decoding if we can establish encoding
            echo "$r" >&2
            iconv -f "${r#*=}" -t utf-8 -c "${file_nm}.${ext}" >"${file_nm}_repaired.${ext}"
            mv "${file_nm}_repaired.${ext}" "${file_nm}.${ext}" ;;
          *)
            echo "$r: could not establish encoding" >&2 ;;
        esac
    fi
done
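
As a further hedge against the 0-byte files from the question, here is a minimal sketch that only replaces the original once iconv has succeeded. It rests on assumptions, not on anything confirmed in the question: that the empty files appear because the redirection creates the output file before iconv fails, and that file -i reports charsets such as binary or unknown-8bit which iconv cannot use as a source. The helper names charset_of and repair_file are made up for illustration, and the -c flag is left out so that conversion errors surface as a failure.

```shell
#!/bin/sh
# Hypothetical helper: extract the charset from one line of `file -i` output.
# Prints the charset and returns 0 only if it looks usable as an iconv source.
charset_of() {
    case $1 in
        *charset=*) cs=${1##*charset=} ;;   # keep what follows the last "charset="
        *) return 1 ;;                      # no charset reported at all
    esac
    case $cs in
        binary|unknown-8bit|'') return 1 ;; # assumption: iconv cannot read these
    esac
    printf '%s\n' "$cs"
}

# Hypothetical helper: convert one file to UTF-8 in place.
# The original is overwritten only after iconv exits successfully.
repair_file() {
    src=$(charset_of "$(file -i "$1")") || {
        echo "$1: could not establish encoding" >&2
        return 1
    }
    tmp=$1.tmp.$$
    if iconv -f "$src" -t UTF-8 "$1" >"$tmp"; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"    # discard the partial output, keep the original
        return 1
    fi
}
```

Inside the loop above, the iconv and mv pair could then be replaced by a call such as repair_file "$filename".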

See also "Why is testing “$?” to see if a command succeeded or not, an anti-pattern?" (tangential, but probably worth reading) and "useless use of wc".

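The grep regex is kind of mysterious. Given the stated goal, it presumably relies on a UTF-8 locale, where grep -axv '.*' prints only the lines that contain invalid UTF-8 byte sequences. If you instead want to check whether the file contains non-empty lines, grep -qa . "$filename" would do that.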
