Decompressing gzipped ReadOnlyMemory<byte> before I do JsonDocument.Parse

Time:07-09

The websocket client is returning a ReadOnlyMemory<byte>.

The issue is that JsonDocument.Parse fails because the buffer is gzip-compressed. I have to decompress it before I parse it. How do I do that? I cannot really change the websocket library code.

What I want is something like public Func<ReadOnlyMemory<byte>> DataInterpreterBytes = () => that optionally decompresses these bytes from outside this class. How do I do that? Is it possible to decompress a ReadOnlyMemory<byte>, and to do nothing if the handler is unused?

private static string DecompressData(byte[] byteData)
{
    using var decompressedStream = new MemoryStream();
    using var compressedStream = new MemoryStream(byteData);
    using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
    deflateStream.CopyTo(decompressedStream);
    decompressedStream.Position = 0;

    using var streamReader = new StreamReader(decompressedStream);
    return streamReader.ReadToEnd();
}

Snippet

private void OnMessageReceived(object? sender, MessageReceivedEventArgs e)
{
    var timestamp = DateTime.UtcNow;

    _logger.LogTrace("Message was received. {Message}", Encoding.UTF8.GetString(e.Message.Buffer.Span));

    // We dispose that object later on
    using var document = JsonDocument.Parse(e.Message.Buffer);
    var tokenData = document.RootElement;

Answer:

So, if you had a byte array, you'd do this:

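// Needs: using System.IO; using System.IO.Compression; using System.Text.Json;
private static JsonDocument DecompressData(byte[] byteData)
{
    using var compressedStream = new MemoryStream(byteData);
    using var gzipStream = new GZipStream(compressedStream, CompressionMode.Decompress);
    // Parse reads the stream to the end, so both streams can be disposed on return.
    return JsonDocument.Parse(gzipStream);
}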

This is similar to your snippet above, but there is no need for the intermediate copy. JsonDocument.Parse has an overload that takes a Stream, so it can read straight from the GZipStream, and you avoid both the extra MemoryStream and the string.

Unfortunately, you don't have a byte array, you have a ReadOnlyMemory<byte>. There is no way out of the box to create a memory stream out of a ReadOnlyMemory<byte>. Honestly, it feels like an oversight, like they forgot to put that feature into .NET.

So here are your options instead.

The first option is to just convert the ReadOnlyMemory<byte> object to an array with ToArray():

// assuming e.Message.Buffer is a ReadOnlyMemory<byte>
using var document = DecompressData(e.Message.Buffer.ToArray());

This is really straightforward, but remember it actually copies the data, so for large documents it might not be a good idea if you want to avoid using too much memory.
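If that copy is acceptable, the optional hook asked about in the question could be sketched on top of it. This is only a sketch: MessageParser and GunzipToMemory are hypothetical names, and the delegate is assumed to take the raw buffer and return the buffer to parse (a Func<ReadOnlyMemory<byte>, ReadOnlyMemory<byte>>), which is slightly different from the signature in the question. When the hook is null, the buffer is parsed unchanged.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text.Json;

public sealed class MessageParser
{
    // Hypothetical hook: null means "the payload is not compressed, parse it as-is".
    public Func<ReadOnlyMemory<byte>, ReadOnlyMemory<byte>>? DataInterpreterBytes { get; set; }

    public JsonDocument Parse(ReadOnlyMemory<byte> buffer)
    {
        // When no handler is set, the original buffer is passed through untouched.
        var payload = DataInterpreterBytes?.Invoke(buffer) ?? buffer;
        return JsonDocument.Parse(payload);
    }

    // Hypothetical helper that can be assigned to the hook for gzipped feeds.
    public static ReadOnlyMemory<byte> GunzipToMemory(ReadOnlyMemory<byte> compressed)
    {
        using var input = new MemoryStream(compressed.ToArray()); // copies the compressed bytes
        using var gzip = new GZipStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        gzip.CopyTo(output);
        return output.ToArray(); // byte[] converts implicitly to ReadOnlyMemory<byte>
    }
}
```

For a gzipped feed you would set it once, for example new MessageParser { DataInterpreterBytes = MessageParser.GunzipToMemory }, and leave the property unset for plain JSON.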

The second is to try to extract the underlying array from the memory. This can be done with MemoryMarshal.TryGetArray, which gives you an ArraySegment<byte> (but it returns false if the memory isn't actually backed by a managed array).

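// Needs: using System.Runtime.InteropServices; (for MemoryMarshal)
private static JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
    // Fast path: the memory is backed by a managed array, so wrap it without copying.
    // Fallback: it isn't (e.g. native memory), so copy it into a new array.
    var segment = MemoryMarshal.TryGetArray(byteData, out var arraySegment)
        ? arraySegment
        : new ArraySegment<byte>(byteData.ToArray());

    using var compressedStream = new MemoryStream(segment.Array!, segment.Offset, segment.Count, writable: false);
    using var gzipStream = new GZipStream(compressedStream, CompressionMode.Decompress);
    return JsonDocument.Parse(gzipStream);
}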

The third way might feel dirty, but if you're okay with using unsafe code, you can just pin the memory's span and then use UnmanagedMemoryStream:

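// Needs <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file.
private static unsafe JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
    // Pin the memory so the GC cannot move it while the stream holds a raw pointer.
    fixed (byte* ptr = byteData.Span)
    {
        using var compressedStream = new UnmanagedMemoryStream(ptr, byteData.Length);
        using var gzipStream = new GZipStream(compressedStream, CompressionMode.Decompress);
        // Parse consumes the whole stream before returning, so the pointer is not used after the fixed block.
        return JsonDocument.Parse(gzipStream);
    }
}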

The fourth option is to use a Stream class that wraps the memory directly. The Windows Community Toolkit (now the .NET Community Toolkit) has an AsStream() extension method in its HighPerformance package that returns a Stream wrapper around the memory object. If you don't want to pull in an entire third-party library just for that, you can roll your own; it's not that much code.
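For the roll-your-own route, a minimal sketch could look like the following. ReadOnlyMemoryStream and GzipJson are hypothetical names, not types from .NET or the toolkit, and the sketch assumes .NET Core 2.1 or later (for the Span-based Read override). It only implements synchronous reading and seeking, which is enough for GZipStream and JsonDocument.Parse.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text.Json;

// Hypothetical read-only, seekable Stream over a ReadOnlyMemory<byte>; no copy of the data is made.
public sealed class ReadOnlyMemoryStream : Stream
{
    private readonly ReadOnlyMemory<byte> _memory;
    private int _position;

    public ReadOnlyMemoryStream(ReadOnlyMemory<byte> memory) => _memory = memory;

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _memory.Length;

    public override long Position
    {
        get => _position;
        set
        {
            if (value < 0 || value > _memory.Length)
                throw new ArgumentOutOfRangeException(nameof(value));
            _position = (int)value;
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
        => Read(new Span<byte>(buffer, offset, count));

    public override int Read(Span<byte> destination)
    {
        // Copy as much as fits; returning 0 signals end of stream.
        var toCopy = Math.Min(destination.Length, _memory.Length - _position);
        _memory.Span.Slice(_position, toCopy).CopyTo(destination);
        _position += toCopy;
        return toCopy;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        Position = origin switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => _position + offset,
            SeekOrigin.End => _memory.Length + offset,
            _ => throw new ArgumentOutOfRangeException(nameof(origin)),
        };
        return _position;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class GzipJson
{
    // Decompresses and parses in one pass, reading directly from the wrapped memory.
    public static JsonDocument Parse(ReadOnlyMemory<byte> compressed)
    {
        using var source = new ReadOnlyMemoryStream(compressed);
        using var gzip = new GZipStream(source, CompressionMode.Decompress);
        return JsonDocument.Parse(gzip);
    }
}
```

In the handler from the question this would be used as using var document = GzipJson.Parse(e.Message.Buffer);. The buffer only has to stay valid for the duration of the call, since Parse reads the whole stream before returning.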
