Converting a string into Unsigned int 8 array

Time:07-19

I am new to Bash and I am trying to port a Swift obfuscation routine to a Bash script.

Basically, I want to convert a string into an array of unsigned 8-bit integers (its UTF-8 bytes).

For example,

"hey" = [104, 101, 121] (UTF-8 UINT8 value)
"example" = [101, 120, 97, 109, 112, 108, 101] (UTF-8 UINT8 value)

Does anyone know if this is possible?

CodePudding user response:

The following shell script converts input of the form hey into the string [104, 101, 121].

# Print hey
printf "%s" hey |
# convert to hex one per line
xxd -p -c 1 |
# convert to decimal one per line
xargs -I{} printf "%d\n" 0x{} |
# Join lines with comma
paste -sd, |
# Add spaces after comma
sed 's/,/, /g' |
# Add [ ]
{ echo -n '['; tr -d '\n'; echo ']'; }
# Alternative to the previous line: echo "[$(cat)]"

The script is not aware of the input encoding; it only translates the byte representation. The input string must already be in the desired encoding. Use iconv to convert between encodings.
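
As a rough illustration of the iconv suggestion, the sketch below prints the byte values of "Ä" before and after re-encoding. It assumes od and iconv are installed and that the local iconv accepts the encoding names UTF-8 and ISO-8859-1. byte_values is a hypothetical helper, not part of the answer above, and it uses od in place of xxd and xargs.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print every byte read from stdin as an unsigned
# decimal number, one value per line.
byte_values() {
    od -An -v -tu1 | tr -s ' ' '\n' | sed '/^$/d'
}

# "Ä" encoded as UTF-8 is two bytes (written here as octal escapes).
printf '\303\204' | byte_values

# Re-encode to ISO-8859-1 first, where "Ä" is a single byte.
printf '\303\204' | iconv -f UTF-8 -t ISO-8859-1 | byte_values
```

The first command prints 195 and 132, the second prints 196: the same character yields different arrays depending on the encoding the bytes are in.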

CodePudding user response:

Using pure bash, no external programs:

#!/usr/bin/env bash

to_codepoints() {
    local LC_CTYPE=C IFS=, n
    local -a cps
    # Iterate over each byte of the argument and append its numeric value to an array
    for (( n = 0; n < ${#1}; n++ )); do
        cps+=( $(printf "%d" "'${1:n:1}") )
    done
    printf "[%s]\n" "${cps[*]}"
}

to_codepoints hey
to_codepoints example
to_codepoints $'\u00C4ccent'

outputs

[104,101,121]
[101,120,97,109,112,108,101]
[195,132,99,99,101,110,116]
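
The question's examples separate the values with a comma and a space. A minimal sketch of a variant that produces that format follows. It relies on the same printf leading-quote trick as the function above, but stores each value with printf -v instead of a command substitution. The name to_uint8_list is hypothetical, and setting LC_ALL rather than LC_CTYPE is an assumption made so that an exported LC_ALL cannot override the byte-wise locale.

```shell
#!/usr/bin/env bash

# Hypothetical variant of to_codepoints: joins the values with ", ".
to_uint8_list() {
    local LC_ALL=C    # make ${#1} and ${1:i:1} work on bytes, not characters
    local i value out=
    for (( i = 0; i < ${#1}; i++ )); do
        # A leading single quote makes printf print the numeric value
        # of the character that follows it.
        printf -v value '%d' "'${1:i:1}"
        # Prepend ", " only when out already holds a value.
        out+="${out:+, }$value"
    done
    printf '[%s]\n' "$out"
}

to_uint8_list hey
to_uint8_list example
```

For the two ASCII examples from the question this prints [104, 101, 121] and [101, 120, 97, 109, 112, 108, 101].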