I'm trying to understand how to read/write binary encoded versions of a simple struct, like below:
typedef struct Tuple {
    uint32_t length;
    uint8_t* data;
} Tuple;
I have the following code, which can correctly write a Tuple record in Java to binary data in a MemorySegment directly. But trying to read that data back fails, whereas a simple C program can read it just fine.
Caused by: java.lang.IllegalArgumentException: Misaligned access at address: 16
at java.base/java.lang.invoke.VarHandleSegmentViewBase.newIllegalArgumentExceptionForMisalignedAccess(VarHandleSegmentViewBase.java:57)
at java.base/java.lang.invoke.VarHandleSegmentAsInts.offsetNoVMAlignCheck(VarHandleSegmentAsInts.java:100)
at java.base/java.lang.invoke.VarHandleSegmentAsInts.get(VarHandleSegmentAsInts.java:111)
at com.example.Tuple.deserialize(App.java:233)
What am I doing wrong in the below?
record Tuple(int size, byte[] data) {
    public static ValueLayout.OfInt SIZE_FIELD = ValueLayout.JAVA_INT.withName("size");
    public static ValueLayout.OfAddress DATA_FIELD = ValueLayout.ADDRESS.withName("data").withBitAlignment(32);
    public static GroupLayout LAYOUT = MemoryLayout.structLayout(SIZE_FIELD, DATA_FIELD).withName("Tuple");
    public static VarHandle VH_SIZE = LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("size"));
    public static VarHandle VH_DATA = LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("data"));

    Tuple(byte[] data) {
        this(data.length, data);
    }

    public static Tuple deserialize(MemorySegment segment) {
        int size = (int) VH_SIZE.get(segment);
        byte[] data = segment
                .asSlice(SIZE_FIELD.byteSize())
                .toArray(ValueLayout.JAVA_BYTE);
        return new Tuple(size, data);
    }

    public int byteSize() {
        return (int) (size + ValueLayout.JAVA_INT.byteSize());
    }

    public void serialize(MemorySegment segment) {
        VH_SIZE.set(segment, size);
        segment
                .asSlice(ValueLayout.JAVA_INT.byteSize())
                .copyFrom(MemorySegment.ofArray(data));
    }
}
public static void main(String[] args) throws Exception {
    Tuple tuple = new Tuple("Hello".getBytes());
    File file = new File("tuple.bin");
    file.createNewFile();
    try (MemorySession session = MemorySession.openConfined()) {
        MemorySegment segment = session.allocate(tuple.byteSize() * 2L);
        tuple.serialize(segment);
        byte[] bytes = segment.toArray(ValueLayout.JAVA_BYTE);
        Files.write(file.toPath(), bytes);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
    // Now read the file back in
    try (MemorySession session = MemorySession.openConfined()) {
        byte[] bytes = Files.readAllBytes(Paths.get("tuple.bin"));
        MemorySegment segment = MemorySegment.ofArray(bytes);
        Tuple tuple2 = Tuple.deserialize(segment);
        System.out.println(tuple2);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
I use the below C program to confirm it works there:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Tuple {
    uint32_t length;
    uint8_t* data;
} Tuple;

const char* EXAMPLE_FILE = "tuple.bin";

int main() {
    FILE* file = fopen(EXAMPLE_FILE, "rb");
    if (file == NULL) {
        printf("Could not open file %s\n", EXAMPLE_FILE);
        return 1;
    }
    Tuple tuple;
    fread(&tuple.length, sizeof(uint32_t), 1, file);
    tuple.data = malloc(tuple.length);
    fread(tuple.data, sizeof(uint8_t), tuple.length, file);
    fclose(file);
    // Convert tuple data to string
    char* string = malloc(tuple.length + 1);
    for (uint32_t i = 0; i < tuple.length; i++) {
        string[i] = tuple.data[i];
    }
    string[tuple.length] = '\0';
    printf("Tuple size: %u bytes\n", tuple.length);
    printf("Tuple data: %s\n", string);
    return 0;
}
EDIT: After using pahole to look at the compiled struct layout, it seems there are 4 bytes of padding inserted between the Tuple fields for alignment. Maybe the error has to do with this?
struct Tuple {
    uint32_t   length;   /*     0     4 */
    /* XXX 4 bytes hole, try to pack */
    uint8_t *  data;     /*     8     8 */
    /* size: 16, cachelines: 1, members: 2 */
    /* sum members: 12, holes: 1, sum holes: 4 */
    /* last cacheline: 16 bytes */
};
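The 4-byte hole reported above follows from ordinary alignment rounding: each field starts at the next offset that is a multiple of its alignment. A minimal sketch of that arithmetic, assuming a 64-bit target where uint8_t* has size and alignment 8 (the class and method names here are illustrative, not part of any library):

```java
public class StructPadding {
    // Round offset up to the next multiple of alignment.
    // Assumes alignment is a power of two.
    static long alignUp(long offset, long alignment) {
        return (offset + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        long lengthSize = 4, lengthAlign = 4;    // uint32_t
        long pointerSize = 8, pointerAlign = 8;  // uint8_t* (assumed 64-bit target)

        long lengthOffset = alignUp(0, lengthAlign);
        long lengthEnd = lengthOffset + lengthSize;
        long dataOffset = alignUp(lengthEnd, pointerAlign);
        long hole = dataOffset - lengthEnd;
        // The struct size is rounded up to the largest member alignment.
        long structSize = alignUp(dataOffset + pointerSize,
                Math.max(lengthAlign, pointerAlign));

        System.out.println("data offset: " + dataOffset);  // prints "data offset: 8"
        System.out.println("hole: " + hole);                // prints "hole: 4"
        System.out.println("struct size: " + structSize);   // prints "struct size: 16"
    }
}
```

These values match the offsets and total size in the pahole output for this struct.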
CodePudding user response:
You are reading values with aligned layouts from a byte[], but the elements of a byte[] are only assumed/guaranteed to be aligned to 1 byte (i.e. not really aligned). So, reading/writing a value with a layout that has a larger alignment constraint (such as the layouts you're using) will fail with an exception like that.
It likely works in the case where you are writing the data because in that case you're using a native segment. For native segments the 'real' memory address is used for alignment checking, and the underlying allocator, which is malloc, typically returns 16-byte aligned memory regions (though the Java code only requests 1-byte alignment).
If you want to read from a byte[] directly like that, you can get around the exception by applying .withBitAlignment(8) to all the layouts you're using, which will effectively turn off the alignment checking.
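For comparison, here is a minimal sketch that reads and writes the same file format (a 4-byte length in native byte order followed by the data bytes) using java.nio.ByteBuffer instead of the foreign memory API. This is a swapped-in technique, not what the answer above describes: ByteBuffer.getInt imposes no alignment requirement on a wrapped byte[], so the misaligned access exception cannot occur. The class name TupleCodec and its methods are hypothetical helpers introduced only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class TupleCodec {
    // Encode as [4-byte length][data bytes], native byte order
    // (assumed to match what the C reader expects on the same machine).
    static byte[] serialize(byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + data.length)
                .order(ByteOrder.nativeOrder());
        buf.putInt(data.length);
        buf.put(data);
        return buf.array();
    }

    // Decode the length prefix, then copy exactly that many data bytes.
    static byte[] deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.nativeOrder());
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("Bad length prefix: " + length);
        }
        byte[] data = new byte[length];
        buf.get(data);
        return data;
    }

    public static void main(String[] args) {
        byte[] encoded = serialize("Hello".getBytes(StandardCharsets.UTF_8));
        byte[] decoded = deserialize(encoded);
        System.out.println(encoded.length + " bytes encoded");
        System.out.println(new String(decoded, StandardCharsets.UTF_8));
    }
}
```

Unlike the struct layout, this format stores the bytes inline after the length rather than a pointer, which is also what the C program in the question reads.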
CodePudding user response:
I managed to fix it by making two changes:
- I used the jextract tool to generate Java code similar to mine, to see where it might differ. It was mostly the same, but it defined C interop types like this:
class Constants {
    static final ValueLayout.OfBoolean C_BOOL_LAYOUT = ValueLayout.JAVA_BOOLEAN;
    static final ValueLayout.OfByte C_CHAR_LAYOUT = ValueLayout.JAVA_BYTE;
    static final ValueLayout.OfShort C_SHORT_LAYOUT = ValueLayout.JAVA_SHORT.withBitAlignment(16);
    static final ValueLayout.OfInt C_INT_LAYOUT = ValueLayout.JAVA_INT.withBitAlignment(32);
    static final ValueLayout.OfInt C_LONG_LAYOUT = ValueLayout.JAVA_INT.withBitAlignment(32);
    static final ValueLayout.OfLong C_LONG_LONG_LAYOUT = ValueLayout.JAVA_LONG.withBitAlignment(64);
    static final ValueLayout.OfFloat C_FLOAT_LAYOUT = ValueLayout.JAVA_FLOAT.withBitAlignment(32);
    static final ValueLayout.OfDouble C_DOUBLE_LAYOUT = ValueLayout.JAVA_DOUBLE.withBitAlignment(64);
    static final ValueLayout.OfAddress C_POINTER_LAYOUT = ValueLayout.ADDRESS.withBitAlignment(64);
}
And I noticed it had inserted 32 bits (4 bytes) of padding between the fields.
So I changed the definition of Tuple's layout to the below:
public static ValueLayout.OfInt SIZE_FIELD = Constants.C_LONG_LAYOUT.withName("size");
public static ValueLayout.OfAddress DATA_FIELD = Constants.C_POINTER_LAYOUT.withName("data");
public static GroupLayout LAYOUT = MemoryLayout.structLayout(
        SIZE_FIELD,
        MemoryLayout.paddingLayout(32),
        DATA_FIELD
).withName("Tuple");
- Instead of trying to pass the memory segment from MemorySegment.ofArray(bytes) to .deserialize(), I allocated the memory first and then copied the bytes into it:
// Now read the file back in
try (MemorySession session = MemorySession.openConfined()) {
    byte[] bytes = Files.readAllBytes(Paths.get("tuple.bin"));
    MemorySegment segment = session.allocate(Tuple.LAYOUT);
    segment.copyFrom(MemorySegment.ofArray(bytes));
    Tuple tuple2 = Tuple.deserialize(segment);
    System.out.println(tuple2);
    System.out.println(new String(tuple2.data()));
} catch (Exception e) {
    throw new RuntimeException(e);
}