How to reduce memory usage when returning a file over a stream with WCF?


I have 1 large and many small files being sent to a server each day. The server parses and creates/recreates/updates a sqlite DB when it receives these. The client machines also need this DB, and can request it or request updates. Everything is connected via LAN.

The client machines need the DB as they do not have reliable internet access so using a cloud DB is not an option. The server may also be down so asking the server for single queries is not reliable.

The large file update touches every single row in the DB, since it's possible some information was missed in the deltas. As a result we cannot send that large delta to the clients, and I believe it makes more sense to just recreate the DB on the client.

Since the client machines are underpowered, querying the server for rows and building large deltas on those machines is very time-intensive and can take 2 hours. Since this occurs daily, having stale data for 2 out of every 24 hours is not an option.

We decided to have the clients request the entire DB; when this happens, the server compresses and sends the DB, which only takes a few minutes.

To do this I've set up the server to compress the DB and then return a MemoryStream:

var dbCopyPath = ".\\db_copy.db";

using (var readFileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
{
    Log("Compressing db copy...");
    using (var writeFileStream = new FileStream(dbCopyPath, FileMode.OpenOrCreate, FileAccess.Write, FileShare.Read))
    {
        using (var gzipStream = new GZipStream(writeFileStream, CompressionLevel.Optimal))
        {
            readFileStream.CopyTo(gzipStream);
        }
    }
}

// Reads the entire compressed file into a byte[] and wraps it in a MemoryStream.
return new MemoryStream(File.ReadAllBytes(dbCopyPath));

I've tried some other approaches, like copying the FileStream into a GZipStream wrapped around a MemoryStream and returning the underlying MemoryStream's ToArray(), or just returning a MemoryStream read straight from the file.

The issue with all the options I've tried is that they all reserve a large amount of memory (or just don't work). I've consistently seen the process reserve 600 MB of memory when running this, even though the file is only 200 MB after compression. If the incoming files get much larger, this will eventually start throwing OutOfMemoryExceptions. On the client side, I'm able to just read the stream like this:

var dbStream = client.OpenRead(downloadUrl);

This keeps memory usage from spiking at all on the client when downloading the data.
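
To give an idea of why that works: a plain stream-to-stream copy like the sketch below only ever holds a small buffer in memory, regardless of the file size (this assumes client is a WebClient; the file names are placeholders):

// Sketch: consume the response stream without buffering the whole payload.
using (var dbStream = client.OpenRead(downloadUrl))
using (var compressedFile = File.Create("db_copy.db.gz"))
{
    // CopyTo moves the data in small chunks (an ~80 KB buffer by default),
    // so memory usage stays flat no matter how large the file is.
    dbStream.CopyTo(compressedFile);
}

using (var compressedFile = File.OpenRead("db_copy.db.gz"))
using (var gzipStream = new GZipStream(compressedFile, CompressionMode.Decompress))
using (var dbFile = File.Create("db.db"))
{
    gzipStream.CopyTo(dbFile);
}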

My ideal solution would be a way to stream the data directly from the file on the server to the client. I'm not sure this is possible, since I've tried many different combinations of streams, but if there were some way to have a lazy stream, where the server doesn't load portions of the stream until the client needs them, that would be ideal, though again I'm not sure that's possible or even fully makes sense.
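
Something like the sketch below is roughly what I'm picturing; this is purely illustrative, assuming a self-hosted WCF service with a streamed basicHttpBinding, and all of the names are made up:

// Illustrative only (made-up names): a WCF operation that returns a Stream,
// hosted over a binding with TransferMode.Streamed so the response body is
// not buffered in memory. Requires a reference to System.ServiceModel.
[ServiceContract]
public interface IDbService
{
    [OperationContract]
    Stream GetCompressedDb();   // WCF streams this back to the caller in chunks
}

public class DbService : IDbService
{
    public Stream GetCompressedDb()
    {
        // Placeholder path; in practice this would be the compressed copy.
        return new FileStream(".\\db_copy.db", FileMode.Open, FileAccess.Read, FileShare.Read);
    }
}

public static class DbServiceHost
{
    public static ServiceHost Start()
    {
        var binding = new BasicHttpBinding
        {
            TransferMode = TransferMode.Streamed,     // don't buffer the message body
            MaxReceivedMessageSize = long.MaxValue,   // allow large responses
            SendTimeout = TimeSpan.FromMinutes(10)
        };

        var host = new ServiceHost(typeof(DbService), new Uri("http://localhost:8080/db"));
        host.AddServiceEndpoint(typeof(IDbService), binding, "");
        host.Open();
        return host;
    }
}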

I tried my best to avoid the XY problem, so if there's anything I missed please let me know. I appreciate any help with this. Thank you.

CodePudding user response:

I don't know exactly how you transfer your data (NetworkStream, byte[], etc.), but you could return your compressed database directly as a FileStream and thus do without the MemoryStream entirely:

private static Stream GetCompressedDbStream(string path)
{
  var tempFileStream = new TemporaryFileStream();

  try
  {
    using (var readFileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
    {
      // leaveOpen: true keeps the temporary file stream usable after the GZipStream is disposed.
      using (var gzipStream = new GZipStream(tempFileStream, CompressionLevel.Optimal, true))
      {
        readFileStream.CopyTo(gzipStream);
      }
    }

    tempFileStream.Seek(0, SeekOrigin.Begin);
    return tempFileStream;
  }
  catch (Exception)
  {
    // Log to console or alert user.
    tempFileStream.Dispose();
    throw;
  }
}

To manage the lifetime of the temporary file correctly, here is my implementation of a TemporaryFileStream class. It deletes the temporary file as soon as the stream is disposed:

public class TemporaryFileStream : Stream
{

  private readonly FileStream _fileStream;
  private bool _disposedValue;

  public override bool CanRead => _fileStream.CanRead;

  public override bool CanSeek => _fileStream.CanSeek;

  public override bool CanWrite => _fileStream.CanWrite;

  public override long Length => _fileStream.Length;

  public override long Position
  {
    get => _fileStream.Position;
    set => _fileStream.Position = value;
  }

  public TemporaryFileStream()
  {
    _fileStream = new FileStream(Path.GetTempFileName(), FileMode.Open, FileAccess.ReadWrite);
    new FileInfo(_fileStream.Name).Attributes = FileAttributes.Temporary;
  }

  // Stream already implements IDisposable; overriding Dispose(bool) is enough
  // to delete the temporary file when the stream is disposed.
  protected override void Dispose(bool disposing)
  {
    if (!_disposedValue)
    {
      if (disposing)
      {
        _fileStream.Dispose();
        File.Delete(_fileStream.Name);
      }

      _disposedValue = true;
    }

    base.Dispose(disposing);
  }

  public override void Flush() => _fileStream.Flush();
  public override int Read(byte[] buffer, int offset, int count) => _fileStream.Read(buffer, offset, count);
  public override long Seek(long offset, SeekOrigin origin) => _fileStream.Seek(offset, origin);
  public override void SetLength(long value) => _fileStream.SetLength(value);
  public override void Write(byte[] buffer, int offset, int count) => _fileStream.Write(buffer, offset, count);

}

You can then use a simple CopyTo or Read in order to transfer the data efficiently:

using var stream = GetCompressedDbStream(@"DbPath");
// CopyTo ...
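
If the operation is exposed with streamed transfer (transferMode="Streamed" on the WCF binding), the service method can simply return this stream and let WCF copy it to the wire in chunks; as far as I know, WCF disposes the returned stream once the response has been sent, which also takes care of deleting the temporary file. A rough sketch, with made-up contract and service names:

public class DbTransferService : IDbTransferService   // made-up names
{
    public Stream GetCompressedDb()
    {
        // WCF reads this stream in chunks while writing the response and
        // disposes it afterwards, which deletes the backing temporary file.
        return GetCompressedDbStream(@"DbPath");
    }
}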