Does WASM text format have structs?
(module
  (type (; can a type be a `struct` like in C or Rust? ;) )
  (; rest of module ;)
)
I've compiled the following C++ to wasm using the WasmExplorer tool:
struct MyStruct {
    int MyField;
    long MyOtherField;
};
MyStruct returnMyStruct(int myField){
    return MyStruct {
        MyField: myField,
        MyOtherField: myField * 2
    };
}
It outputs the following, but I'm having trouble understanding what the WASM is doing.
(module
  (table 0 anyfunc)
  (memory $0 1)
  (export "memory" (memory $0))
  (export "_Z14returnMyStructi" (func $_Z14returnMyStructi))
  (func $_Z14returnMyStructi (; 0 ;) (param $0 i32) (param $1 i32)
    (i32.store
      (get_local $0)
      (get_local $1)
    )
    (i32.store offset=4
      (get_local $0)
      (i32.shl
        (get_local $1)
        (i32.const 1)
      )
    )
  )
)
The function generated does not have a return type, it uses i32.store and i32.shl along with an offset. Is it storing the struct in memory somewhere?
An explanation of how and why this works would be much appreciated.
CodePudding user response:
Does WASM text format have structs?
It does not. Like other low-level assembly languages, wasm only has a few numeric data types (i32, i64, f32, f64) and views memory as one big block of bytes. This is a simplification, but when a high-level language like C is compiled to assembly, each struct variable is allocated a spot in memory, with each field laid out at a fixed offset from the start of that spot. When you write to a field, the generated code:
- Takes the address of the struct variable
- Adds the offset of the field from the root of the struct
- Writes to the resulting address
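The three steps above can be sketched in C++ itself. This is only an illustration: store_i32 and make_by_offsets are hypothetical helper names, and the fields are declared as std::int32_t on the assumption that we want the same 4-byte layout the wasm32 output uses.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::int32_t
#include <cstring>   // std::memcpy

// Same shape as the struct in the question, with fixed-width fields so both
// members are 4 bytes wide (an assumption made for this sketch).
struct MyStruct {
    std::int32_t MyField;
    std::int32_t MyOtherField;
};

static_assert(offsetof(MyStruct, MyOtherField) == 4,
              "second field is expected to start 4 bytes into the struct");

// Hypothetical helper: write a 32-bit value `offset` bytes past `base`.
void store_i32(void* base, std::size_t offset, std::int32_t value) {
    assert(base != nullptr);
    // Steps 1 and 2: take the struct's address and add the field offset.
    unsigned char* address = static_cast<unsigned char*>(base) + offset;
    // Step 3: write to the resulting address.
    std::memcpy(address, &value, sizeof value);
}

// Hypothetical helper: fill a MyStruct using only address + offset writes,
// the way compiled code does, instead of naming the fields.
MyStruct make_by_offsets(std::int32_t first, std::int32_t second) {
    MyStruct s{};
    store_i32(&s, offsetof(MyStruct, MyField), first);
    store_i32(&s, offsetof(MyStruct, MyOtherField), second);
    return s;
}
```

Calling make_by_offsets(7, 14) should give the same struct as writing s.MyField = 7 and s.MyOtherField = 14 directly; the field names exist only at the source level.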
The function generated does not have a return type, it uses i32.store and i32.shl along with an offset. Is it storing the struct in memory somewhere?
What you're observing is a C++ feature, Return Value Optimization (RVO). Since C++17, compilers are required to avoid making extra copies of a prvalue struct (e.g. a temporary expression) returned from a function; before that, the optimization was permitted and widely implemented. While the standard doesn't dictate how to do that, many compilers accomplish this by converting the return value to an output parameter, e.g. this:
MyStruct myFunc(int);
MyStruct myStruct;
myStruct = myFunc(42);
gets converted to this:
void myFunc(MyStruct&, int);
MyStruct myStruct;
myFunc(myStruct, 42);
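To make the transformation concrete, here is a small sketch with both forms written out by hand. The names returnByValue, returnViaOutParam and checkEquivalent are made up for this example, and the second function only approximates what a compiler might generate; the real lowering is decided by the compiler and the target's calling convention.

```cpp
#include <cassert>

struct MyStruct {
    int MyField;
    long MyOtherField;
};

// What the source code says: build the struct and return it by value.
MyStruct returnByValue(int myField) {
    return MyStruct{myField, myField * 2L};
}

// Roughly what the compiler may emit instead: the caller supplies the
// destination, and the function writes the fields straight into it.
void returnViaOutParam(MyStruct& out, int myField) {
    out.MyField = myField;
    out.MyOtherField = myField * 2L;
}

// Both forms leave the caller with the same struct contents.
void checkEquivalent(int x) {
    MyStruct a = returnByValue(x);
    MyStruct b;
    returnViaOutParam(b, x);
    assert(a.MyField == b.MyField);
    assert(a.MyOtherField == b.MyOtherField);
}
```

The second form has no return value and one extra pointer-like parameter, which is the shape to look for in the wasm signature.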
Now look again at the function signature:
(func $_Z14returnMyStructi (; 0 ;) (param $0 i32) (param $1 i32)
There are two parameters:
- $0 is the address of a MyStruct, where the return value will be written
- $1 is myField
So this instruction:
(i32.store
  (get_local $0)
  (get_local $1)
)
writes myField to the output address. In this case, the MyField member of MyStruct sits at offset zero, and is getting written.
And this instruction:
(i32.store offset=4
  (get_local $0)
  (i32.shl
    (get_local $1)
    (i32.const 1)
  )
)
i32.shl shifts myField left by 1 bit, effectively multiplying it by two. The result is written to an address 4 bytes after the output address. Since MyOtherField is laid out 4 bytes from the root of MyStruct, this is writing to MyOtherField. (On wasm32, long is 4 bytes wide, which is why a plain i32.store is enough for it.)
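If it helps, the whole function can be mimicked in C++ against a toy model of linear memory. This is a sketch under assumptions, not how any real engine is implemented: Memory, i32_store and i32_load are hypothetical names, and the model uses the host's byte order rather than wasm's little-endian encoding (stores and loads stay consistent with each other either way).

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstdint>   // std::int32_t, std::uint32_t, std::uint8_t
#include <cstring>   // std::memcpy
#include <vector>

// Toy stand-in for wasm linear memory: just a flat array of bytes.
using Memory = std::vector<std::uint8_t>;

// Models `i32.store offset=N`: write 4 bytes at address + offset.
void i32_store(Memory& mem, std::uint32_t addr, std::uint32_t offset,
               std::int32_t value) {
    std::size_t effective = static_cast<std::size_t>(addr) + offset;
    assert(effective + 4 <= mem.size());  // real wasm traps when out of bounds
    std::memcpy(mem.data() + effective, &value, 4);
}

// Models `i32.load offset=N`: read 4 bytes at address + offset.
std::int32_t i32_load(const Memory& mem, std::uint32_t addr,
                      std::uint32_t offset) {
    std::size_t effective = static_cast<std::size_t>(addr) + offset;
    assert(effective + 4 <= mem.size());
    std::int32_t value = 0;
    std::memcpy(&value, mem.data() + effective, 4);
    return value;
}

// Mirrors $_Z14returnMyStructi: p0 is the output address, p1 is myField.
void returnMyStruct(Memory& mem, std::uint32_t p0, std::int32_t p1) {
    // (i32.store (get_local $0) (get_local $1))  -> MyField at offset 0
    i32_store(mem, p0, 0, p1);
    // (i32.store offset=4 ... (i32.shl $1 1))    -> MyOtherField at offset 4
    std::uint32_t shifted = static_cast<std::uint32_t>(p1) << 1;
    i32_store(mem, p0, 4, static_cast<std::int32_t>(shifted));
}
```

The caller picks an address (typically a slot on its own stack in linear memory), passes it as the first argument, and afterwards reads the two fields back from that address and that address plus 4.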