Receiving byte stream using TCPClient consuming too much CPU

Time:12-08

I have a service that receives large byte streams from third-party systems using a TcpListener. Unfortunately, I can't change the protocol or switch to something like WCF.

When many systems send data at the same time, CPU usage on my server spikes to nearly 90% and causes problems for other services. After profiling my service, it looks like the CPU time is spent reading the byte arrays from the NetworkStream into a MemoryStream. Here's an example of my server code:

public void StartListening(CancellationToken token) {
    var listener = new System.Net.Sockets.TcpListener(System.Net.IPAddress.Any, port);
    listener.Start();

    while (!token.IsCancellationRequested) {
        // TcpClient has no public constructor that takes a Socket, so accept the client directly.
        TcpClient tcpClient = listener.AcceptTcpClient();
        var stream = tcpClient.GetStream();

        Task.Run(()=> ReadStream(stream, token));
    }
}

private void ReadStream(NetworkStream stream, CancellationToken token){
    int offset = 0;
    int size = EXPECTED_FILE_SIZE;
    var inStream = new MemoryStream(size);
    while (size > 0 && !token.IsCancellationRequested) {
        try {
            int readin = stream.Read(inStream.GetBuffer(), offset, size);
            size -= readin;
            offset += readin;
        }
        catch (Exception ex) {
            Console.WriteLine(ex.Message);
            return;
        }    
    }
    //Do something with the memory stream
}

I wrote a test client which sends multiple byte streams to the service at a time and every time the CPU spikes. I've tried a few things server side to fix the issue (including using BeginRead and EndRead on the network stream). The only thing that seems to help is when I send larger chunks of the byte streams from the client, but I have no control over the 3rd party systems which send the data.

I have thought that it might work to accept all socket connections, but then limit the number of "ReadStream" tasks, but I don't know if there are any detrimental effects of accepting a socket and then not reading from it for a while.
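The throttling idea above could be sketched roughly as follows. This is a minimal sketch, not a tested fix: `ThrottledListener`, `RunThrottledAsync`, `ListenAsync`, the `maxReaders` limit, and the `readStreamAsync` delegate are all hypothetical names introduced for illustration. Connections are accepted immediately, but only `maxReaders` of them are read at once. Data from a waiting connection stays in the OS socket buffer, and TCP flow control slows the sender once that buffer fills; whether the third-party senders tolerate that delay is an assumption to verify.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledListener
{
    // Runs work() only once a slot in the gate is free, and frees the slot afterwards.
    public static async Task RunThrottledAsync(SemaphoreSlim gate, Func<Task> work, CancellationToken token)
    {
        await gate.WaitAsync(token);
        try { await work(); }
        finally { gate.Release(); }
    }

    // Accepts every connection right away, but lets at most maxReaders handlers read concurrently.
    // Note: cancellation is only checked between accepts in this sketch.
    public static async Task ListenAsync(int port, int maxReaders,
        Func<NetworkStream, Task> readStreamAsync, CancellationToken token)
    {
        var gate = new SemaphoreSlim(maxReaders);
        var listener = new TcpListener(IPAddress.Any, port);
        listener.Start();
        try
        {
            while (!token.IsCancellationRequested)
            {
                TcpClient client = await listener.AcceptTcpClientAsync();
                // Fire and forget: the handler waits for a free slot without blocking the accept loop.
                _ = HandleAsync(client, gate, readStreamAsync, token);
            }
        }
        finally { listener.Stop(); }
    }

    private static async Task HandleAsync(TcpClient client, SemaphoreSlim gate,
        Func<NetworkStream, Task> readStreamAsync, CancellationToken token)
    {
        using (client)
        {
            try { await RunThrottledAsync(gate, () => readStreamAsync(client.GetStream()), token); }
            catch (Exception ex) { Console.WriteLine(ex.Message); }
        }
    }
}
```

A call such as `ThrottledListener.ListenAsync(port, 4, ReadStreamAsync, token)` would cap the service at four concurrent readers; the value 4 is only a placeholder.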

CodePudding user response:

Instead of continuously reading the stream, use an asynchronous callback, such as the built-in BeginRead/EndRead pair, so your code only runs when data has arrived. You could do something like:

public void StartListening(CancellationToken token)
{
    var listener = new System.Net.Sockets.TcpListener(System.Net.IPAddress.Any, port);
    listener.Start();

    while (!token.IsCancellationRequested)
    {
        TcpClient tcpClient = listener.AcceptTcpClient();
        NetworkStream stream = tcpClient.GetStream();

        // Each connection gets its own 4096-byte buffer. A single read returns at most
        // 4096 bytes; anything beyond that is delivered by the next callback.
        var buffer = new byte[4096];

        // Register ReceiveCallback. The stream and buffer travel in the state argument,
        // so concurrent connections do not overwrite a shared field.
        stream.BeginRead(buffer, 0, buffer.Length, ReceiveCallback, Tuple.Create(stream, buffer));
    }
}

// Then we set up the ReceiveCallback handler.
private void ReceiveCallback(IAsyncResult _result)
{
    var state = (Tuple<NetworkStream, byte[]>)_result.AsyncState;
    NetworkStream stream = state.Item1;
    byte[] buffer = state.Item2;
    try
    {
        int _byteLength = stream.EndRead(_result); // number of bytes received
        if (_byteLength <= 0)
        {
            // Zero bytes means the remote side has closed the connection.
            stream.Dispose();
            return;
        }

        byte[] _data = new byte[_byteLength];
        Array.Copy(buffer, _data, _byteLength);

        DoSomethingWith(_data);
        stream.BeginRead(buffer, 0, buffer.Length, ReceiveCallback, state); // start listening again
    }
    catch
    {
        // Disconnect
        stream.Dispose();
    }
}

By the way, running the code snippet won't work; it was the only way for me to write out the code so that it would look good.
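The same "only run when data arrives" idea can also be written with async/await instead of callbacks. The following is a minimal sketch under stated assumptions: `StreamReading` and `ReadExactAsync` are hypothetical names, and the caller is assumed to know the expected size up front, as the question's `EXPECTED_FILE_SIZE` suggests. It also treats a read of 0 bytes as a closed connection. `Stream.Read` returns 0 when the peer has closed, so a loop that keeps reading without checking for that can spin without making progress, which may be one contributor to high CPU usage.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamReading
{
    // Reads exactly `size` bytes from the stream, or throws if the stream ends first.
    public static async Task<byte[]> ReadExactAsync(Stream stream, int size, CancellationToken token)
    {
        var buffer = new byte[size];
        int offset = 0;
        while (offset < size)
        {
            // The awaiting code holds no thread while it waits for data.
            int read = await stream.ReadAsync(buffer, offset, size - offset, token);
            if (read == 0)
            {
                // The peer closed the connection; stop instead of looping forever.
                throw new EndOfStreamException(
                    "Stream ended after " + offset + " of " + size + " bytes.");
            }
            offset += read;
        }
        return buffer;
    }
}
```

In the service, this could be called as `await StreamReading.ReadExactAsync(tcpClient.GetStream(), EXPECTED_FILE_SIZE, token)` from an async handler, since NetworkStream derives from Stream.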

CodePudding user response:

I would personally try to solve it by creating a separate BackgroundWorker for each connection that dies when finished. That would make better use of CPU threads by letting the OS scheduler handle it.

It could also be that the CPU is not powerful enough to handle that much incoming data at once.

Otherwise, I would use a gateway that receives the data and sends it to a RabbitMQ queue, then process that data at a lower rate.
