How to make read-only data in ByteBuffer to be safely consumed by InputStream multiple times?

Time:04-07

I am building an API and I want some read-only, static data to be loaded at startup. Below, I call a remote API to load that data at startup; its SDK only returns the data as a ByteBuffer:

class MyService {
  private ByteBuffer remoteData;

  @PostConstruct
  public void init() {
    remoteData = callAPI(); // returns ByteBuffer as type
  }

  public void getDataAndDoSomething(Request req) throws IOException {
    try (InputStream is = new ByteBufferBackedInputStream(remoteData)) {
      // proceed with ByteBufferBackedInputStream
    }
  }
}

The issue with the above is that after the first invocation of getDataAndDoSomething(), remoteData is no longer consumable. This wouldn't be an issue if I made remoteData a local variable and called the remote API each time, but I'd like to load remoteData only once, at startup.

I suspect I'd need to make a deep copy of it somehow each time an InputStream wants to consume it, but the ByteBuffer APIs are rather confusing. What is a good approach to make this safe to consume from multiple threads that invoke getDataAndDoSomething()?

CodePudding user response:

One idea is to clone the current ByteBuffer on each call; you can use the duplicate() method:

ByteBuffer java.nio.ByteBuffer.duplicate()

Creates a new byte buffer that shares this buffer's content.

class MyService {
  private ByteBuffer remoteData;

  @PostConstruct
  public void init() {
    remoteData = callAPI(); // returns ByteBuffer as type
  }

  public void getDataAndDoSomething(Request req) throws IOException {
    // The duplicate shares the bytes but has its own position and limit.
    ByteBuffer remoteToBeUsed = remoteData.duplicate();
    try (InputStream is = new ByteBufferBackedInputStream(remoteToBeUsed)) {
      // proceed with ByteBufferBackedInputStream
    }
  }
}
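
To see the effect without the remote API, here is a minimal, self-contained sketch. BufferInputStream is a hypothetical stand-in for ByteBufferBackedInputStream, and DuplicatePerCallDemo and its byte[] constructor are made up for illustration in place of MyService and callAPI(). The asReadOnlyBuffer() call is an optional extra, not part of the answer above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for ByteBufferBackedInputStream:
// reads from the buffer's current position up to its limit.
class BufferInputStream extends InputStream {
    private final ByteBuffer buf;

    BufferInputStream(ByteBuffer buf) {
        this.buf = buf;
    }

    @Override
    public int read() {
        return buf.hasRemaining() ? (buf.get() & 0xFF) : -1;
    }
}

class DuplicatePerCallDemo {
    private final ByteBuffer remoteData;

    DuplicatePerCallDemo(byte[] loadedAtStartup) {
        // Optional: a read-only view guards against accidental writes.
        this.remoteData = ByteBuffer.wrap(loadedAtStartup).asReadOnlyBuffer();
    }

    String readAll() {
        // Each call reads through a fresh duplicate, so remoteData's own
        // position never moves and the next call starts from the beginning.
        try (InputStream is = new BufferInputStream(remoteData.duplicate())) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = is.read()) != -1) {
                out.write(b);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling readAll() twice on the same instance returns the full content both times, because only the throwaway duplicate is consumed.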

CodePudding user response:

No need. ByteBuffers are purely in-memory constructs (which does mean that if the API returns a tonne of data, say 500MB or more, keeping it all in memory is not a good idea!), and you can trivially rewind them.

Buffers start at 0 (that part's simple enough) and have a fixed capacity (a set size; buffers do not grow or shrink). They also have a limit, which lets you fill less than the full capacity and then 'flip' the buffer to read back exactly what you just wrote. Their usual purpose is to be an intermediary: a 'writer' process fills the buffer, to capacity or not, and then the buffer is 'flipped' so that a 'reader' process reads what was just put in it; when the reader is done, the buffer is cleared back to 'write' mode, back and forth, over and over. Thus, they have 4 numbers: position, limit, capacity, and mark (an optional saved position that reset() jumps back to). The start is always 0.

The buffer as provided presumably begins in this state (as returned by callAPI()):

position = 0
limit = the total size of the data sent by the API
capacity = something. Hopefully equal to limit; otherwise it's wasted memory
mark = not set

When you then use it as the source for your ByteBufferBackedInputStream, whatever consumes the InputStream ends up moving the position forward. Assuming it reads the entire content, the position ends up equal to the limit.
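
A small sketch of that state change, using the java.nio accessors position(), limit(), and capacity(). The wrapped "hello" bytes stand in for whatever callAPI() returns, and BufferStateDemo is a made-up name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BufferStateDemo {
    // Returns {position, limit, capacity} for the given buffer.
    static int[] state(ByteBuffer buf) {
        return new int[] { buf.position(), buf.limit(), buf.capacity() };
    }

    static String describe(ByteBuffer buf) {
        int[] s = state(buf);
        return "position=" + s[0] + " limit=" + s[1] + " capacity=" + s[2];
    }

    public static void main(String[] args) {
        // Stand-in for the data returned by callAPI(): 5 bytes, wrapped as-is.
        ByteBuffer data = ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println("before reading: " + describe(data));

        // Relative get() calls advance the position, just as reading
        // through an InputStream wrapper would.
        while (data.hasRemaining()) {
            data.get();
        }
        System.out.println("after reading:  " + describe(data));
    }
}
```

For a buffer created with ByteBuffer.wrap(byte[]), limit and capacity both equal the array length, so only the position differs between the two lines printed.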

Thus, all you need to do to get back to the original state is to set the position back to 0.

Which is trivial to do, fortunately:

remoteData.position(0);

and you can use it again.
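
As a rough illustration, the sketch below drains a buffer, shows that a second read yields nothing, and then reads it again after setting the position to 0. RewindDemo and drain() are hypothetical names; rewind() is the standard java.nio method that also sets the position to 0 (and discards any mark).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class RewindDemo {
    // Reads everything between position and limit,
    // leaving the position equal to the limit.
    static String drain(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.wrap("static data".getBytes(StandardCharsets.UTF_8));

        String first = drain(data);  // consumes the whole buffer
        String empty = drain(data);  // nothing remains, so this is ""

        data.position(0);            // data.rewind() would also work here
        String second = drain(data); // the full content is readable again

        System.out.println(first.equals(second));
        System.out.println(empty.isEmpty());
    }
}
```

This is only safe while a single consumer uses the buffer at a time, since the position being moved back is shared state.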

A ByteBuffer object holds the actual data (usually a byte[], but that's abstracted; it could be something else, such as off-heap memory in a direct buffer) as well as those 4 numbers.

Hence, none of this is going to work if you wrap the same ByteBuffer in 4 ByteBufferBackedInputStreams simultaneously and hand them off to various threads. They all just read, so the data itself is not going to get corrupted by this, but those 4 numbers? You want each thread to have its own.

You can do that too, however: You can create new BB objects that use the same backing buffer:

ByteBuffer clone = remoteData.duplicate();

The name 'duplicate' is a bit of a misnomer - this does not duplicate the backing data, but it does give you a clone with independent position, limit, and mark values (the capacity is fixed by the shared data). Duplicate the buffer 3x for a total of 4 buffers, handing each thread one of these copies.
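
A minimal sketch of that hand-off, assuming nothing changes the shared buffer's position or limit after startup. ConcurrentReadersDemo and readConcurrently() are made-up names, and the thread pool is just one way to get several readers running at once.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentReadersDemo {
    // Reads the shared data on `threads` workers, each through its own duplicate.
    static List<String> readConcurrently(ByteBuffer shared, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                // Same backing data, but a private position and limit per task.
                ByteBuffer view = shared.duplicate();
                futures.add(pool.submit(() -> {
                    byte[] out = new byte[view.remaining()];
                    view.get(out); // moves only this view's position
                    return new String(out, StandardCharsets.UTF_8);
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.wrap("payload".getBytes(StandardCharsets.UTF_8));
        System.out.println(readConcurrently(shared, 4));
        System.out.println("shared position: " + shared.position());
    }
}
```

Every worker sees the complete content, and the shared buffer's position stays where it was, because only the duplicates are ever read from.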
