Validate whether a string is ASCII or non-ASCII in Linux

Time:12-18

I have a string (not saved in a file) that may contain ASCII or non-ASCII characters. I want to find out, on Linux, whether the given string is entirely ASCII or contains non-ASCII characters. I tried using grep, but grep expects a file rather than a string.

Example 1

Input

abc$@

Expected Output

The given string is ascii.

Example 2

Input

testt ‘’Lab

Expected Output

The given string is NOT ascii.

Appreciate your help.

Thanks

CodePudding user response:

If you look at an ASCII table, you might notice that it uses only 7 bits to encode its values, even though a char is stored in a byte. The 8th (high) bit is never set.

I would iterate over the bytes in the string and check the 8th bit of each one. If any byte has its 8th bit set, the string is not ASCII clean.

However, what you end up doing would be pretty different depending on whether you're writing a shell script or a program in C. This thread seems promising: https://unix.stackexchange.com/questions/194435/check-whether-text-contains-non-ascii-characters-in-a-shell-script
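For the shell-script case, the high-bit check described above might be sketched as follows. This is only an illustration, not the method from the linked thread: the function names is_ascii and check are made up for this example, and it assumes a tr that works byte by byte under LC_ALL=C (as GNU tr does). It deletes every byte in the range 0x00-0x7F and counts what is left, so any remaining byte must have the high bit set. The string is piped in with printf, so no file is needed.

```shell
#!/bin/sh
# is_ascii: hypothetical helper; succeeds if no byte in "$1" has the 8th bit set.
is_ascii() {
  local high_bytes
  # Delete all bytes 0x00-0x7F (octal 000-177); count the bytes that remain.
  high_bytes=$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c)
  [ "$high_bytes" -eq 0 ]
}

# check: hypothetical wrapper that prints the messages from the question.
check() {
  if is_ascii "$1"; then
    echo "The given string is ascii."
  else
    echo "The given string is NOT ascii."
  fi
}

check 'abc$@'
check 'testt ‘’Lab'
```

With the two inputs from the question, the first call should report ascii and the second NOT ascii, because the curly quotes are encoded as multi-byte sequences whose bytes all have the high bit set.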

CodePudding user response:

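You can use the file command, which determines the type of a file. For text files it also reports the encoding (if it is valid). It does not need a file on disk: given - as the file name, it reads standard input, so you can pipe the string into it.

Use the -b (brief) flag to leave the file name out of the output, and add --mime-encoding to display the encoding only.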
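A minimal sketch of this approach, assuming the file utility is installed and supports --mime-encoding (the helper names string_encoding and report are invented for this example). The string is piped to file through standard input, and a result of us-ascii is taken to mean the string is ASCII; any other result is treated as not ASCII in this sketch.

```shell
#!/bin/sh
# string_encoding: hypothetical helper; prints the encoding that file detects.
string_encoding() {
  # "-" makes file read standard input; -b omits the file name;
  # --mime-encoding prints only the detected encoding.
  printf '%s' "$1" | file -b --mime-encoding -
}

# report: hypothetical wrapper that prints the messages from the question.
report() {
  case $(string_encoding "$1") in
    us-ascii) echo "The given string is ascii." ;;
    *)        echo "The given string is NOT ascii." ;;
  esac
}

report 'abc$@'
report 'testt ‘’Lab'
```

Note that file guesses the encoding from the content, so this is a heuristic rather than a strict byte-level test.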
