Configuring a channel for receiving messages on a web socket: encoding appears to be forgiving?

Time:11-03

I apologize for the length of this question and will be happy to change it if someone informs me of a more concise way to express it.

I referenced these two documents when coding Tcl to receive messages on a web socket and decode them: one from MDN, and the other the RFC named in the MDN document. Eventually, with some help from those on SO, I was able to decode all my messages and therefore thought I understood at least what I needed to. However, after having a bit of trouble getting a curly apostrophe to render (answered in this SO question), I'm unsure why my code worked when I had failed to fully consider the proper configuration of the channel itself.

This is the configuration before sending response headers to the connection upgrade request.

chan configure $sock -encoding iso8859-1 -translation crlf -buffering full

After they were sent, the configuration was as below; and I was able to receive and decode the incoming messages on the socket, all of which had a payload that was originally text on the client and marked as such in the first XOR frame. Since an encoding isn't provided, I assume it remained as iso8859-1.

chan configure $sock -buffering full -blocking 0 -translation binary

After having the issue with the curly apostrophe, I changed the configuration to include -encoding utf-8, as below. The socket is configured like this just before a text message is sent on it, but I wondered what would happen if it were configured like this from the start as well. Incoming messages are still received and decoded accurately.

chan configure $sock -encoding utf-8 -buffering full -blocking 0 -translation binary

Incoming messages are also received and decoded without error when the channel's encoding is configured as binary.

chan configure $sock -encoding binary -buffering full -blocking 0 -translation binary

I'm sure I am not understanding something fundamental but cannot explain why it works. Does channel configuration only affect data sent from Tcl on the channel or does it affect data received also? I thought XOR frames from the client were always binary only, regardless of what the payload was ultimately to be after decoding.

Why does it appear that the channel configuration, at least the encoding, does not matter when receiving messages in Tcl over the web socket? Did Tcl recognize the encoding of the frames from the client and adjust automatically despite my errors?

The reason I thought about this is that (after following the guidance in the question linked above) I set the socket's encoding to binary, until just before sending a text message from Tcl to the client, at which time it is set to utf-8, and wondered what would happen if a message was received on the socket in the very short interval after the socket's encoding is changed to utf-8 and before it is sent and the socket's encoding returned to binary. But it seems that it makes no difference at all.

Also, when sending image scans back to the client on the socket, even if the socket is left encoded as utf-8, they still render in the browser. The scans are stored in SQLite as BLOBs and read using:

set fdBlob [dbws incrblob -readonly wse lexi_raw scan $rowid]
fconfigure $fdBlob -translation binary
fcopy $fdBlob $sock

and the client receives the message and renders it even when $sock has encoding of utf-8.

Why does it not error when I attempt to send binary on a socket encoded as utf-8?

Thank you.

CodePudding user response:

I fear, given the lengthy and very verbose Q, I might miss important details. So this is rather an attempt at approaching an answer; you seem to be unaware of one important detail:

Whenever you set -translation to binary, it will also change -encoding to binary. Hence, your three examples of chan configure are all the same, because of the trailing -translation binary!

Watch:

% set c [file tempfile]
file5
% chan configure $c -encoding utf-8
% chan configure $c -encoding
utf-8
% chan configure $c -translation binary
% chan configure $c -encoding
binary
% chan close $c

See the details on chan configure -translation
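
As a small sketch of why the trailing position matters: chan configure applies its options from left to right, so whichever of -encoding and -translation binary comes last decides the encoding. A temporary file stands in for your socket here, and the variable names are mine, purely for illustration.

```tcl
# Sketch only: a temp file stands in for the web socket channel.
set c [file tempfile]

# Option order from the question: -translation binary comes last
# and overrides the -encoding utf-8 given before it.
chan configure $c -encoding utf-8 -translation binary
set encTranslationLast [chan configure $c -encoding]

# Reversed order: -encoding utf-8 comes last and is kept.
chan configure $c -translation binary -encoding utf-8
set encEncodingLast [chan configure $c -encoding]

chan close $c

puts "-translation binary last: $encTranslationLast"
puts "-encoding utf-8 last:     $encEncodingLast"
```

So, had you written -encoding utf-8 after -translation binary, the channel really would have been a UTF-8 channel, and you would likely have seen different behaviour.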


So, you are left with an input channel and an output channel for chan copy (aka fcopy), with no conversion whatsoever between the two of them:

No conversion is done if both channels are set to encoding binary and have matching translations.
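
To illustrate, here is a minimal sketch using two temporary files in place of your incrblob channel and your socket (an assumption on my part, to keep it self-contained). The output channel is configured with your option order, -encoding utf-8 followed by -translation binary, and the bytes still arrive unchanged.

```tcl
# Sketch only: temp files stand in for the BLOB channel and the socket.
# Test data: the UTF-8 bytes of a curly apostrophe plus two arbitrary bytes.
set bytes [binary format H* e28099ff00]

set in [file tempfile]
chan configure $in -translation binary
chan puts -nonewline $in $bytes
chan seek $in 0                      ;# flushes, then rewinds for reading

set out [file tempfile]
# Same option order as in the question: the trailing -translation binary wins.
chan configure $out -encoding utf-8 -translation binary

chan copy $in $out                   ;# chan copy is the newer name for fcopy

chan seek $out 0                     ;# flushes the copy, then rewinds
set copied [chan read $out]
chan close $in
chan close $out

binary scan $copied H* hex
puts $hex
```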

I can only speculate about the client behaviour, but assuming that a text message as database BLOB is in valid UTF-8, the client might just assume a safe fallback? As for images, as you have no conversion at all, they will be just fine.
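
If the worry is the short window in which the socket is switched to utf-8 for sending, one option is to never switch at all: leave the channel binary and convert the text yourself with encoding convertto before framing it. Below is a rough sketch under my own assumptions: the proc name wsTextFrame is hypothetical, it only handles payloads of up to 125 bytes, and it builds an unmasked, unfragmented text frame as the RFC describes for server-to-client messages.

```tcl
# Hypothetical helper, for illustration only: builds one unmasked text frame.
proc wsTextFrame {text} {
    # Convert the Tcl string to UTF-8 bytes explicitly, so the channel can stay binary.
    set payload [encoding convertto utf-8 $text]
    set len [string length $payload]
    if {$len > 125} {
        error "this sketch only handles payloads of up to 125 bytes"
    }
    # 0x81 = FIN bit set plus opcode 1 (text); second byte = payload length, mask bit clear.
    return [binary format cc 0x81 $len]$payload
}

set frame [wsTextFrame "it\u2019s"]
binary scan $frame H* hex
puts $hex
```

The resulting frame could then be written with chan puts -nonewline to a socket that is kept at -translation binary the whole time, so there is no interval in which incoming data could be read with a different encoding.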
