Sending a large object to a remote server using Streams and HttpClient

Time:10-03

I have a large object in my C# code which can be as large as 15GB. Internally it has a 2D array of doubles and 2 lists of strings which describe the rows and columns of the 2D array.

This object has a method WriteToTextWriter(StreamWriter s) which writes a header and the entire data in the 2D array to the StreamWriter s. The StreamWriter is initialized using a MemoryStream object.

I have another class which uses HttpClient to post data from a Stream to a remote server. It has a method PostStreamData(string URL, Stream s).

My current code is something like this:

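var x = new MyLargeObject();
using (var memStream = new MemoryStream())
using (var streamWriter = new StreamWriter(memStream))
{
    x.WriteToTextWriter(streamWriter);
    streamWriter.Flush();    // push any buffered text into memStream
    memStream.Position = 0;  // rewind so the upload starts at the first byte
    customClient.PostStreamData(url, memStream);
}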

Internally, PostStreamData creates a StreamContent from the stream it is passed, sets it as the Content property of the HttpRequestMessage object, and then sends the request using the SendAsync method.

Since this uses MemoryStream, which is backed by a single byte array, it fails when the object size gets larger than 2GB. See this: Failed to write large amount of data to stream

To overcome this, I used the HugeMemoryStream class implemented there. But now the issue is that I am using twice the memory: 15GB for the MyLargeObject which is already in memory, and then another 15GB for the HugeMemoryStream object created from it.

I think a better solution would be to implement a class based on Stream which uses a buffer of limited size but still allows for objects larger than 2GB. How to implement this? I am looking for some sample code. It doesn't have to be complete, but right now I don't even know how to start.

CodePudding user response:

You could inherit from Stream and keep a reference to MyLargeObject. Then implement the Read method so that each call serializes the next part of your large object into the byte array parameter of Read. CanRead must return true, and CanSeek and CanWrite just return false. The other methods just throw NotSupportedException. You would use it like this:

var content = new StreamContent(new MyStream(mylargeobject));

Also check out this implementation : https://ec.europa.eu/digital-building-blocks/code/projects/EDELIVERY/repos/eessi-as4.net/browse/source/AS4/Eu.EDelivery.AS4/Streaming/VirtualStream.cs?at=a37db0be60a5c441fdb6c9d65f7c4c4621840b92
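
As a rough illustration of the Stream-subclass idea above, here is a minimal sketch, not a complete implementation. The names LineProducerStream and MatrixLines are made up for this example, and the tab-separated layout is only an assumption about what WriteToTextWriter produces. The stream pulls one line of text at a time from an iterator, so only a single row is buffered no matter how large the whole payload is.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;

// Read-only, forward-only stream that encodes one line of text at a time.
// (Hypothetical class name; not part of any library.)
public sealed class LineProducerStream : Stream
{
    private readonly IEnumerator<string> _lines;
    private byte[] _current = Array.Empty<byte>(); // bytes of the line being served
    private int _offset;                            // next unread byte in _current

    public LineProducerStream(IEnumerable<string> lines)
    {
        _lines = lines.GetEnumerator();
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;

    public override int Read(byte[] buffer, int offset, int count)
    {
        int written = 0;
        while (written < count)
        {
            if (_offset == _current.Length)
            {
                if (!_lines.MoveNext()) break;               // no more lines
                _current = Encoding.UTF8.GetBytes(_lines.Current + "\n");
                _offset = 0;
            }
            int n = Math.Min(count - written, _current.Length - _offset);
            Buffer.BlockCopy(_current, _offset, buffer, offset + written, n);
            _offset += n;
            written += n;
        }
        return written; // 0 tells the caller the stream has ended
    }

    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _lines.Dispose();
        base.Dispose(disposing);
    }
}

public static class MatrixLines
{
    // Assumed layout standing in for MyLargeObject: a header line, then one line per row.
    public static IEnumerable<string> Rows(double[,] data, IList<string> rowNames, IList<string> colNames)
    {
        yield return "\t" + string.Join("\t", colNames);
        for (int r = 0; r < data.GetLength(0); r++)
        {
            var sb = new StringBuilder(rowNames[r]);
            for (int c = 0; c < data.GetLength(1); c++)
                sb.Append('\t').Append(data[r, c].ToString("R", CultureInfo.InvariantCulture));
            yield return sb.ToString();
        }
    }
}
```

With something like this, the call from the question could become customClient.PostStreamData(url, new LineProducerStream(MatrixLines.Rows(data, rowNames, colNames))). Because the stream is not seekable, its length is unknown up front, so the request will most likely go out with chunked transfer encoding, and the server has to accept that.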

CodePudding user response:

I don't think it's a good idea at all to manage data like this in memory. You're better off writing it to disk and then posting it through a separate service with parallel uploads and retry logic.
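
A minimal sketch of the write-to-disk route, under some assumptions: DiskSpool is a made-up helper name, the upload is a single plain POST rather than a separate service with parallel uploads, and the retry policy (a fixed number of attempts, retrying only on HttpRequestException) is just one simple choice.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: spool the payload to a temp file, then upload the file.
public static class DiskSpool
{
    // Lets the caller write through a StreamWriter (e.g. x.WriteToTextWriter)
    // into a temp file and returns the file's path.
    public static string WriteToTempFile(Action<StreamWriter> write)
    {
        string path = Path.GetTempFileName();
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var writer = new StreamWriter(file))
        {
            write(writer);
        }
        return path;
    }

    // Posts the file, reopening it from the start on each attempt.
    public static async Task<HttpResponseMessage> PostFileAsync(
        HttpClient client, string url, string path, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (var file = File.OpenRead(path))
                using (var content = new StreamContent(file))
                {
                    return await client.PostAsync(url, content);
                }
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Transient failure: fall through, wait, and try again.
            }
            await Task.Delay(TimeSpan.FromSeconds(attempt));
        }
    }
}
```

Usage would be along the lines of var path = DiskSpool.WriteToTempFile(x.WriteToTextWriter); followed by await DiskSpool.PostFileAsync(httpClient, url, path); and deleting the file afterwards. A FileStream is seekable, so the request can carry a Content-Length, and a failed attempt can simply restart from the file instead of re-serializing the object.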

Oh, if you're also working with big binary files like this, you might want to look at using some kind of diff algorithm, like bsdiff or xdelta, to generate patch files and send those instead.

Useful Binary Diff Tool (other than msdn[apatch and mpatch], xdelta, bsdiff, vbindiff and winmerge)
