What is the best way to return variable size byte arrays and strings in C++?

I have a class that wraps C functions for reading and writing data using file descriptors.

I'm currently stuck on the read method.
I want to create a read method that wraps the C function ssize_t read(int fd, void *buf, size_t count);

The function above uses void *buf as an output and returns the number of bytes written in the buffer.
I want to have a method read that would return a variable size object that would contain that data or nullptr if no data was read.

What is the best way to do that?

EDIT: I already have a char array[4096] that I use to read data. I just want to return them and also give the caller the ability to know the length of the data that I return.

The char array[4096] is a member of the class that wraps the C read. I use it to store the data temporarily before returning it to the caller. Every time I call the wrapper read, the char array will be overwritten by design. An upper layer will be responsible for concatenating the data and constructing messages. This upper layer is the one that needs to know how much data has arrived.

The size of the char array[4096] is arbitrary. It could be much smaller, but then more calls would be needed.

The object that contains the member char array will always be global.

I use C++17.

Should I use std::vector or std::queue?

CodePudding user response：

Quoting the question: "EDIT: I already have a char array[4096] that I use to read data. I just want to return them and also give the caller the ability to know the length of the data that I return."

Right, so the key information is that you don't want to copy that (or at least you don't want to force an additional copy).

Current preferred return type is std::span, but that's C++20 and you're still on C++17.

Second preference is std::string_view. It'll work fine for binary data but may confuse people who expect it to be printable, not contain null terminators and so on.
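A minimal sketch of the std::string_view option, assuming a wrapper class that owns the 4096-byte buffer described in the question. The names FdReader and read_some are made up for illustration, and throwing std::system_error on failure is just one possible error policy.

```cpp
#include <array>
#include <cerrno>
#include <cstddef>
#include <string_view>
#include <system_error>
#include <unistd.h>  // POSIX read()

// Hypothetical wrapper; class and member names are illustrative only.
class FdReader {
public:
    explicit FdReader(int fd) : fd_(fd) {}

    // Returns a view of the bytes just read; an empty view means end of file.
    // The view points into buffer_, so the next call to read_some() overwrites it.
    std::string_view read_some() {
        const ssize_t n = ::read(fd_, buffer_.data(), buffer_.size());
        if (n < 0) {
            throw std::system_error(errno, std::generic_category(), "read");
        }
        return std::string_view(buffer_.data(), static_cast<std::size_t>(n));
    }

private:
    int fd_;
    std::array<char, 4096> buffer_{};
};
```

The caller gets both the data and its length from the view (data() and size()), and embedded null bytes are preserved because the length is explicit. The caller has to copy or consume the bytes before calling read_some() again.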

Otherwise you can obviously return some struct or tuple with pointer & length (and possibly errno, which is otherwise discarded).
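The struct variant could look roughly like the sketch below. ReadResult and read_into are hypothetical names, and reporting the failure through an error field instead of an exception is an assumption about what the caller wants.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>  // POSIX read()

// Hypothetical result type; the names are illustrative only.
struct ReadResult {
    const char* data;  // points into the caller's buffer; not owned
    std::size_t size;  // number of valid bytes at data
    int error;         // 0 on success, otherwise the errno value set by read()
};

// Reads up to capacity bytes into buf and reports pointer, length and errno together.
inline ReadResult read_into(int fd, char* buf, std::size_t capacity) {
    const ssize_t n = ::read(fd, buf, capacity);
    if (n < 0) {
        return ReadResult{buf, 0, errno};
    }
    return ReadResult{buf, static_cast<std::size_t>(n), 0};
}
```

With this shape, a result with error == 0 and size == 0 means end of file, while a nonzero error keeps the errno value that a bare pointer return would lose.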

Returning something that might be nullptr is pretty much the least preferred option. Don't do it. It's actually harder to use correctly than the original C interface.

CodePudding user response：

You could use function overloading:

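#include <unistd.h>  // declares the C function read()

void read(int fileDescriptor, short int & variable)
{
    // ::read with three arguments selects the C function, not this overload
    static_cast<void>(::read(fileDescriptor, &variable, sizeof(variable)));
}

void read(int fileDescriptor, int & variable)
{
    // The return value (bytes read, or -1 on error) is deliberately discarded here
    static_cast<void>(::read(fileDescriptor, &variable, sizeof(variable)));
}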

You may want to also look into using templates.
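A template version of the same idea might look like this. It is only a sketch: read_value is a made-up name, and treating a short read as failure is an assumption (a real wrapper might loop until the object is full).

```cpp
#include <cstddef>
#include <type_traits>
#include <unistd.h>  // POSIX read()

// Hypothetical helper: reads sizeof(T) bytes from fd directly into value.
// Returns true only if the whole object was filled by a single read() call.
template <typename T>
bool read_value(int fd, T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "raw byte reads are only valid for trivially copyable types");
    const ssize_t n = ::read(fd, &value, sizeof(T));
    return n == static_cast<ssize_t>(sizeof(T));
}
```

One template replaces the per-type overloads, and the static_assert rejects types such as std::string whose bytes cannot be filled in directly. Unlike the overloads above, it also reports whether the read succeeded instead of discarding the result.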

CodePudding user response：

The general answer here is: Don't use mutable global state. It breaks reentrancy and threading. And don't compound the issue by trying to return views of mutable global state, which makes even sequential calls a problem.

Just allocate a per-call buffer and use that; if you want to allow the caller to provide a buffer, that's also acceptable. Examples would look like:

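// Some class assumed to have an fd member for reading via the C API
#include <cerrno>
#include <string>
#include <string_view>
#include <system_error>
#include <unistd.h>

class Reader
{
    // Define member attributes, e.g. fd
    int fd = -1;

public:
    // std::string_view is read-only, so the writable buffer is passed as pointer + size
    std::string_view read(char *buf, size_t size) {
        // ::read selects the C function rather than this member function
        ssize_t numread = ::read(fd, buf, size);
        // Error checking: handle negative return values by raising an exception
        if (numread < 0)
            throw std::system_error(errno, std::generic_category(), "read");
        return std::string_view(buf, static_cast<size_t>(numread)); // Guaranteed copy-elision
    }

    std::string read(size_t max_read) {
        std::string buf(max_read, '\0');  // Allocate appropriately sized buffer
        auto view = read(buf.data(), buf.size());  // Delegate to view-based API
        buf.resize(view.size());  // Resize to match amount actually read
        return buf;  // Likely (but not guaranteed) NRVO based copy-elision
    }
};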

std::string and std::string_view could be replaced with std::vector and std::span of some type in C++20 if you preferred.

This provides the caller with multiple options:

1. Call read with an existing buffer that the caller can reuse over and over (note that std::string_view is read-only, so in C++17 the writable buffer has to be passed as a pointer and size; std::span covers this in C++20)
2. Call read with an explicit size and get a freshly allocated std::string with few if any copies involved (NRVO will avoid copying the std::string being returned in most cases, though if the underlying read reads very little, the resize call might reallocate the underlying storage and trigger a copy of whatever real data exists)

For maximum efficiency, many callers calling this repeatedly would choose #1 (they'd just create a local std::string of a given size, pass it in as the buffer, then use the returned std::string_view to limit how much of the buffer they actually work with), but for simple one-off uses, option #2 is convenient.
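To make the repeated-call case concrete, here is a rough sketch of an upper layer that reuses one local buffer and concatenates the chunks, which is the role the question describes. read_all is a hypothetical helper, and it calls the C read directly rather than going through any particular wrapper class.

```cpp
#include <array>
#include <cerrno>
#include <cstddef>
#include <string>
#include <system_error>
#include <unistd.h>  // POSIX read()

// Hypothetical upper-layer helper: drains fd until end of file,
// reusing a single local buffer for every call to read().
inline std::string read_all(int fd) {
    std::array<char, 4096> chunk;
    std::string message;
    for (;;) {
        const ssize_t n = ::read(fd, chunk.data(), chunk.size());
        if (n < 0) {
            if (errno == EINTR) {
                continue;  // interrupted by a signal; retry the read
            }
            throw std::system_error(errno, std::generic_category(), "read");
        }
        if (n == 0) {
            break;  // end of file
        }
        // Only the first n bytes of chunk are valid for this iteration.
        message.append(chunk.data(), static_cast<std::size_t>(n));
    }
    return message;
}
```

Because the buffer is local to the call, there is no shared mutable state, and the length of each chunk is used immediately rather than handed around separately.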