Handling special characters in bash script

Time:01-02

I'm not familiar with bash scripting, so this may be a silly question, but I couldn't find the answer. I'm working on a bash script that mimics the behavior of ls -sh but actually uses du -sh to get file and folder sizes, and sorts the output. It is pretty much du -sh * | sort -h with colors.

#!/usr/bin/bash

if [ "$#" = "0" ]
then
    du -sh *|awk -f /path/to/color-ls.awk|sort -h
else
    du -sh $@|awk -f /path/to/color-ls.awk|sort -h
fi

where color-ls.awk is:

# color-ls.awk
size=$1;
name=$2;
for (i=3; i<=NF; i++)
{
    tmp=(name " " $i);
    name=tmp
}
# filename=($0 ~ /'/)? ("\"" name "\""):("'" name "'")
filename=("'" name "'")
printf $1 " "
cmd=("ls -d " filename " --color")
system(cmd)

an awk script that uses ls --color to color the output of du -sh

My script works fine with most file names, even ones containing spaces, but it has some problems with special characters that I don't know how to fix.

1. When run without arguments:

It interprets the single quotes in any file name that contains them, which causes an error:

sh: 1: Syntax error: Unterminated quoted string

2. When run with arguments:

The same problem occurs as without arguments, and it also treats a file name with spaces as two names.

Example: when used on a folder named VirtualBox VMs, or when given * as an argument in my home directory, here is its output:

du: cannot access 'VirtualBox': No such file or directory
du: cannot access 'VMs': No such file or directory

3. What I want:

I want the script to skip special characters and pass them as they are to du

4. What I tried:

I tried adding double quotes before and after each file name

parse(){
    for arg in $@
    do
        printf "\"$arg\"\n"
    done
}

but it didn't seem to work. du doesn't accept quotes appended to the file name.

du: cannot access '"VirtualBox': No such file or directory
du: cannot access 'VMs"': No such file or directory

Also, replacing quotes with \' doesn't help either. Maybe I'm just doing it wrong.

# du -sh $(printf "file'name\n" |sed "s/'/\\\'/g")
du: cannot access 'file\'\''name': No such file or directory
# ls file\'name 
"file'name"

Same goes for spaces

du: cannot access 'VirtualBox\': No such file or directory
du: cannot access 'VMs': No such file or directory

5. Extra:

I wanted the script to work like a normal ls -sh, but with sorted output and more accurate sizes for folders. However, when arguments are supplied it works like ls -sh -d, so lh Desktop shows the size of Desktop instead of the sizes of the individual files and folders inside Desktop. I believe this can be fixed with a loop that checks whether each argument is a file or a folder, runs du -sh accordingly, and then sorts the result.

#!/usr/bin/bash

if [ "$#" = "0" ]
then
    du -sh *|awk -f /path/to/color-ls.awk|sort -h
else
    for i in $@
    do
        if [[ -d "$i" ]]; then
            du -sh $i/* |awk -f /path/to/color-ls.awk
        else
            du -sh "$i" |awk -f /path/to/color-ls.awk
        fi
    done|sort -h
fi

But for some reason it didn't work

# lh Desktop/
du: cannot access 'Desktop/*': No such file or directory

whereas using du -sh Desktop/* usually works in interactive shell.

Thanks in advance.

CodePudding user response:

Please do not post so much in one question. Please one problem per question. One script per question, etc.

Make sure to check your scripts with shellcheck. It will catch your mistakes. See https://mywiki.wooledge.org/Quotes .

  1. When run without arguments:

filename=("'" name "'") inside the awk script is an invalid way to pass anything containing a ' quote to the system() call, so you are getting the unterminated ' error, as expected, because there will be 3 ' characters. Fix the awk script, or better, rewrite it in Bash; there is no need for awk. Maybe rewrite it all in Python or Perl.
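To make the failure concrete, here is a minimal sketch of what it takes to embed a name safely in a command string handed to sh, which is what awk's system() does. The sample name and the variable names naive, sq, escaped and quoted are made up for illustration; the approach shown (replace every ' with '\'' before wrapping the name in single quotes) is one common way to do it, not the only one.

```shell
#!/usr/bin/env bash
# Hypothetical file name containing a single quote.
name="file'name"

# Naive wrapping, like filename=("'" name "'"): three quote characters, unbalanced.
naive="'$name'"
sh -c "printf '%s' $naive" 2>/dev/null || echo "naive quoting fails"

# Close the quote, add an escaped quote, reopen the quote: ' becomes '\''
sq="'"
escaped=${name//$sq/$sq\\$sq$sq}
quoted="'$escaped'"

# sh now receives: printf '%s' 'file'\''name'
result=$(sh -c "printf '%s' $quoted")
printf '%s\n' "$result"
```

Avoiding the intermediate shell altogether, as suggested above, removes the need for this escaping.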

Moreover, tmp=(name " " $i); deletes tabs and multiple spaces from filenames. It's all meant to work with only nice filenames.

The script will break on newlines in filenames anyway.

  2. When run with arguments:

An unquoted $@ undergoes word splitting and filename expansion (topics you should research). Word splitting splits the result into words at the characters in IFS, which by default are space, tab and newline. Use "$@". Quote the expansions.
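A small experiment makes the difference visible. The count_args helper and the sample names are made up for illustration; the sketch only counts how many arguments a command receives.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it reports how many arguments it was given.
count_args() { echo "$#"; }

# Pretend the script was called with two file names.
set -- "VirtualBox VMs" "file'name"

unquoted=$(count_args $@)      # split at the space inside the first name
quoted=$(count_args "$@")      # each original argument stays one word

printf 'unquoted: %s arguments\n' "$unquoted"
printf 'quoted:   %s arguments\n' "$quoted"
```

With the unquoted form, du is handed VirtualBox and VMs as separate names, which matches the errors in the question.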

  3. What I want:

You'll be doing that with "$@"

  4. What I tried:

The variable content is irrelevant. You have to change the way you use the variable, not its content. That is, put quotes around the use of the variable, not inside its content.
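The following sketch illustrates why adding quote characters to the content does not help. The variable names are made up; the sample name comes from the question.

```shell
#!/usr/bin/env bash
name='VirtualBox VMs'      # sample name taken from the question

# Quote characters stored inside the variable are ordinary data.
wrapped="\"$name\""
set -- $wrapped            # unquoted use: still split at the space
count_wrapped=$#
first=$1
second=$2

# Quoting the use keeps the name in one piece.
set -- "$name"
count_quoted=$#

printf 'quotes in content: %s words (%s and %s)\n' "$count_wrapped" "$first" "$second"
printf 'quotes around use: %s word\n' "$count_quoted"
```

The two words produced by the first form are the same '"VirtualBox' and 'VMs"' that du complained about.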

  5. Extra:

You did not quote the expansion. Use "$i", not $i. It's "$i"/*, with the variable inside the quotes and the glob outside. An unquoted $i undergoes word splitting.
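As a rough illustration of "$i"/* versus $i/*, the sketch below builds a throwaway directory whose name contains a space. The directory and file names are invented for the example, and it assumes nullglob is not enabled.

```shell
#!/usr/bin/env bash
# Throwaway directory with a space in its name (made-up sample data).
tmp=$(mktemp -d)
old=$PWD
cd "$tmp" || exit 1
i="My Folder"
mkdir -- "$i"
touch -- "$i/a.txt" "$i/b c.txt" "$i/d.txt"

# Quoted variable, unquoted glob: one directory name, and * still expands.
set -- "$i"/*
good_count=$#

# Unquoted: $i is split into "My" and "Folder/*"; the pattern matches nothing
# and is passed on literally.
set -- $i/*
bad_first=$1
bad_second=$2

cd "$old" && rm -rf -- "$tmp"
printf 'quoted:   %s entries\n' "$good_count"
printf 'unquoted: %s and %s\n' "$bad_first" "$bad_second"
```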


And finally, after all that, your script may look like this, with GNU tools:

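if (($# == 0)); then
   set -- *                # no arguments: use everything in the current directory
fi
du -hs0 -- "$@" |          # -0: end each record with NUL instead of a newline
sort -zh |                 # -z: NUL-separated records, -h: human-readable sizes
sed -z 's/\t/\x00/' |      # turn the first tab (size/name separator) into NUL
while IFS= read -r -d '' size && IFS= read -r -d '' file; do
   printf "%s " "$size"
   ls -d -- "$file"
done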
Also see How can I find and safely handle file names containing newlines, spaces or both? https://mywiki.wooledge.org/BashFAQ/001 .
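The paired read calls in the loop above can be tried in isolation. In this sketch, fake_du is a made-up stand-in for the du | sort | sed pipeline that emits size NUL name NUL records, and records is a hypothetical array used only to collect the results.

```shell
#!/usr/bin/env bash
# fake_du is a hypothetical stand-in that prints: size NUL name NUL, repeated.
fake_du() {
    printf '%s\0' '4.0K' $'odd\nname' '12K' "it's here"
}

records=()
# Each iteration consumes two NUL-terminated fields: first the size, then the name.
while IFS= read -r -d '' size && IFS= read -r -d '' file; do
    records+=("$size|$file")
done < <(fake_du)

printf '%d records\n' "${#records[@]}"
printf '[%s]\n' "${records[@]}"
```

Because NUL cannot appear in a file name, the newline and the single quote in the sample names pass through untouched.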

Also, you can chain any statements:

if stuff; then
   stuff1
else
   stuff2
fi | 
sort -h |
awk -f yourscript

And also don't repeat yourself - use bash arrays:

args=()
if stuff; then
  args=(*)
else
  args=("$@")
fi
du -hs "${args[@]}" | stuff...

And so that sort has less work to do, I would put it right after du, not after parsing.

CodePudding user response:

Since you didn't include shopt -s nullglob, an unmatched glob is passed to du literally, so it's likely that Desktop/* didn't expand to any file. That is odd given that the command works interactively, unless there really are no files there, you have enabled nullglob in your interactive shell, and du -sh wasn't actually displaying the sizes of the files in Desktop.

It's also likely that you're calling the script from where Desktop/ doesn't exist.

You can add a debug statement which prints $PWD. You can also try running the script with bash -x.

In your script, I suggest enabling nullglob and then modifying the script so that du -sh isn't called if the target directory contains no files.

Something like:

set -- "$i"/*; [[ $# -gt 0 ]] && du -sh -- "$@" ...
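The effect of nullglob on that check can be seen with an empty directory. This is only a sketch: the temporary directory and the variable names are made up, and shopt is specific to bash.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)           # empty directory, made-up sample data

# Default behaviour: an unmatched pattern is left as a literal string.
shopt -u nullglob
set -- "$tmp"/*
without=$#
literal=$1

# With nullglob: an unmatched pattern expands to nothing at all.
shopt -s nullglob
set -- "$tmp"/*
with=$#

shopt -u nullglob
rmdir -- "$tmp"
printf 'without nullglob: %s word (%s)\n' "$without" "$literal"
printf 'with nullglob:    %s words\n' "$with"
```

Only with nullglob enabled does a test such as [[ $# -gt 0 ]] tell you whether the directory actually contained anything.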

Also $@ should be quoted when being expanded.

for i in "$@"; do

This can be simplified to for i; do, but we will modify the positional parameters inside the loop so we expand "$@" instead.

You can also choose to store the expanded files inside an array as well.
