We are using a set of Lambda functions written in Python to read and write data to a DynamoDB table, including a column which is compressed binary data. The Lambda functions work perfectly fine, able to decompress the binary data successfully and use it.
We recently received a question that required us to get into the console and investigate the binary data in the DynamoDB table. This is when we ran into issues.
When we compress the data, Python shows us something like this:
b'x\x9c\xed\x99\xcfO\xdb0\x14\xc7\xef\xfc\x15V\xcf\x14%M\xe9\xc6n]\t\x10\t\x95\n\x18\x97\xa9\xb2\x8c'
In the AWS Console, however, we see values that look more like this:
'eJzsveuOHUmSHvgqifo1i1U6/H7Zf1lkspoDMkkkydKMBIHoqa7daexEt7a6WoBD0LvL7GSeZJi5hceh5cnkYoaUutnD7zMPjzgR'
Putting the second value into Python and trying to decompress it doesn't work, even when I convert it to a bytes array or read it as a binary file or whatever.
Does anyone know a good way to get something closer to the first value through the AWS console, or at least convert the second value to the first value, so that we can run it through the decompress function and get the actual value? If not, can someone at least give me the terminology of what to call the format of these two? They are obviously both binary data, but I'm sure there's a good name to describe how they are encoded or represented here.
Thanks!
CodePudding user response:
The DynamoDB console is showing you a base64-encoded representation of the underlying binary attribute value.
I'm assuming that's a reasonable thing to do so that the console can safely display arbitrary binary attribute values.
If you decode the displayed value, as follows, you can recover the original Python bytes:
base64.b64decode('eJzsv...')
# Result is: b'x\x9c\xec\xbd...'