My professor has assigned an encryption algorithm for our homework in C++. Instead of binary output, he'd like the encrypted text (the plaintext after it has run through the cipher) written to stdout as a string.
The encryption algorithm will typically produce values greater than 127 (which is outside the ASCII range). These are usually displayed as symbols like � or square boxes.
When I go to concatenate these symbols to the output (ciphertext), they sometimes disappear depending on neighboring symbols.
Here's an example:
unsigned char one = 244; // (244 is the 16-bit "output" from the algo)
unsigned char two = 137; // (same as above)
std::string con = "";
con += (one + '\0');
con += (two + '\0');
std::cout << con << std::endl;
The output will be �, where one of the characters is dropped.
If, however, it was unsigned char one = 244; and unsigned char two = 244;, the output in the console will be ��, so the second char doesn't vanish. I'm not sure why some of these combinations work and others don't. Is there a safer way to concatenate these characters that are outside the normal ASCII range?
I have also tried some things I've found on the site, like:
con += (one + '0');
// but this outputs the wrong text: if it were con += (65 + '0') the
// output is 'q' instead of 'A', but all the symbols generate
// with this
con += (two + '0');
I also tried the following, but it has the same results as the first (missing symbols).
con += one;
con += two;
Thank you!
CodePudding user response:
Nothing is lost; all your characters are there:
#include <cstdio>   // printf
#include <cstring>  // strlen
#include <string>   // std::string
int main(int argc, char **argv)
{
    unsigned char one = 244; // (244 is the 16-bit "output" from the algo)
    unsigned char two = 137; // (same as above)
    std::string con = "";
    con += (one + '\0');
    con += (two + '\0');
    // Print every stored byte as a number instead of as a glyph.
    for (unsigned int i = 0; i < strlen(con.c_str()); i++)
    {
        printf("char %u = %d\n", i, (unsigned char) con.c_str()[i]);
    }
    return 0;
}
And the result:
$ g++ -g -O0 main.cpp -o main
$ ./main
char 0 = 244
char 1 = 137
It is sometimes easier to inspect strings as raw, C-style strings and print their contents in whatever format suits you best.
CodePudding user response:
First of all, you should know that '\0' and '0' are two distinct characters having ASCII codes 0 and 48 respectively.
- The statement con += (one + '\0'); is equivalent to con += (244 + 0);.
- But the statement con += (one + '0'); is equivalent to con += (244 + 48);, and 244 + 48 == 292, while the max value of unsigned char is 255. The result does not fit in a single character, so it wraps around and you end up with 36 (292 - 256), which is the '$' character. Similarly, con += (two + '0'); stores 185 (137 + 48) instead of 137.
I would suggest writing something like the code below, which is the C++ way of doing it:
#include <iostream>
#include <string>
int main()
{
    unsigned char one = 244; // (244 is the 16-bit "output" from the algo)
    unsigned char two = 137; // (same as above)
    std::string con = "";
    con += one;
    con += '\0';
    con += two;
    con += '\0';
    std::cout << "con: <" << con << ">\n" << '\n';
    for ( std::size_t idx = 0; idx < con.length( ); ++idx )
    {
        std::cout << "index " << idx << ": <"
                  << +static_cast<unsigned char>( con[ idx ] ) << ">" << '\n';
        /* Notice the + operator
           besides static_cast */
    }
}
In the Windows command prompt, this gives:
con: <⌠ ë >
index 0: <244>
index 1: <0>
index 2: <137>
index 3: <0>
As you can see, each '\0' is stored as a character of its own (indices 1 and 3), and this console renders it like a space separating the actual data characters.
Also, notice how the + operator causes a variable of type char, signed char, or unsigned char to be printed as an integer. Read more about it here: How to output a character as an integer through cout?