I have this simple library.
lib.h:
int lib();
lib.c:
#include <stdio.h>
#include <dlfcn.h>
#define VK_NO_PROTOTYPES
#include <vulkan/vulkan.h>
PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr;
PFN_vkEnumerateInstanceLayerProperties vkEnumerateInstanceLayerProperties;
int lib()
{
    void *lib = dlopen("libvulkan.so.1", RTLD_NOW);
    vkGetInstanceProcAddr = dlsym(lib, "vkGetInstanceProcAddr");
    vkEnumerateInstanceLayerProperties = (PFN_vkEnumerateInstanceLayerProperties)vkGetInstanceProcAddr(NULL, "vkEnumerateInstanceLayerProperties");
    uint32_t count;
    vkEnumerateInstanceLayerProperties(&count, NULL);
    printf("%u\n", count);
    return 0;
}
I compile it to a shared library using
libabc.so: lib.o
	$(CC) -shared -o $@ $^ -ldl
lib.o: lib.c lib.h
	$(CC) -fPIC -g -Wall -c -o $@ $<
But when I use this library in an application, I get a segfault when vkEnumerateInstanceLayerProperties is called on line 18.
What's more, if I change the name vkEnumerateInstanceLayerProperties to something else, say test, then everything works just fine and (on my system) 6 is printed. It also works if I don't use a dynamic library at all, i.e. I compile lib.c together with main.c directly without -fPIC.
What is causing this and how do I resolve it?
CodePudding user response:
The problem is that these two definitions:
PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr;
PFN_vkEnumerateInstanceLayerProperties vkEnumerateInstanceLayerProperties;
define global symbols named vkGetInstanceProcAddr and vkEnumerateInstanceLayerProperties in libabc.so.
These definitions override the ones inside libvulkan, and so the vkGetInstanceProcAddr(NULL, "vkEnumerateInstanceLayerProperties"); call returns the definition inside libabc.so, instead of the intended one inside libvulkan.so.1. And that symbol is not callable (it lives in the .bss section), so an attempt to call it (naturally) produces a SIGSEGV.
To fix this, either make these symbols static, or name them differently, e.g. p_vkGetInstanceProcAddr and p_vkEnumerateInstanceLayerProperties.
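A minimal sketch of that fix, applying both suggestions at once (static pointers with a p_ prefix) and adding the error checks the original omits. To stay compilable without the Vulkan SDK it declares its own stand-in function-pointer types, which is an assumption of this example; real code would keep #include <vulkan/vulkan.h> with VK_NO_PROTOTYPES and use the PFN_ typedefs instead.

```c
#include <dlfcn.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in types (assumption of this sketch): real code would use
   PFN_vkGetInstanceProcAddr and PFN_vkEnumerateInstanceLayerProperties
   from <vulkan/vulkan.h>. */
typedef void (*void_fn)(void);
typedef void_fn (*get_instance_proc_addr_fn)(void *instance, const char *name);
typedef int (*enumerate_layers_fn)(uint32_t *count, void *properties);

/* static gives the pointers internal linkage, so the shared library no
   longer exports symbols that shadow the functions inside libvulkan. */
static get_instance_proc_addr_fn p_vkGetInstanceProcAddr;
static enumerate_layers_fn p_vkEnumerateInstanceLayerProperties;

int lib(void)
{
    void *handle = dlopen("libvulkan.so.1", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    p_vkGetInstanceProcAddr =
        (get_instance_proc_addr_fn)dlsym(handle, "vkGetInstanceProcAddr");
    if (!p_vkGetInstanceProcAddr) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    p_vkEnumerateInstanceLayerProperties =
        (enumerate_layers_fn)p_vkGetInstanceProcAddr(
            NULL, "vkEnumerateInstanceLayerProperties");
    if (!p_vkEnumerateInstanceLayerProperties) {
        dlclose(handle);
        return -1;
    }

    uint32_t count = 0;
    p_vkEnumerateInstanceLayerProperties(&count, NULL);
    printf("%u\n", count);
    return 0; /* handle stays open so the function pointers remain valid */
}
```

Either change alone (static, or a different name) should be enough; doing both just makes the intent obvious to the next reader.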
Update:
Why does compiling lib.c together with main.c directly (without an intermediate shared library in between) work?
Because symbols are (by default) not exported from an executable in the dynamic symbol table, unless some shared library references them.
You can change the default by adding -Wl,--export-dynamic (which causes the main executable to export all non-local symbols) to the main executable link line. If you do so, linking lib.c with main.c will also fail.
Also how can vkGetInstanceProcAddr "capture" the vkEnumerateInstanceLayerProperties in libabc.so?
By using normal symbol resolution rules: the first ELF binary in the lookup order to define the symbol wins.
Shouldn't it just return some kind of predefined address that points to the correct function? I imagine that it is implemented with something like if (!strcmp(...)) return vkGetInstanceProcAddr_internal.
If it were implemented this way, it would have worked.
The implementation I can find doesn't do the ..._internal part:
void *globalGetProcAddr(const char *name) {
    if (!name || name[0] != 'v' || name[1] != 'k') return NULL;
    name += 2;
    if (!strcmp(name, "CreateInstance")) return vkCreateInstance;
    if (!strcmp(name, "EnumerateInstanceExtensionProperties")) return vkEnumerateInstanceExtensionProperties;
    ...
Arguably that is an implementation bug: it should return the address of a local alias (the ..._internal symbol) and be immune to symbol overriding.
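As a rough illustration of what an override-immune lookup could look like, here is a sketch in which the lookup returns the address of a function with internal linkage rather than the exported name. All names (vkDemoCreate, demo_create_internal, demo_get_proc_addr) are hypothetical, and a static function stands in for the local alias; this is not how any real Vulkan loader is written.

```c
#include <stddef.h>
#include <string.h>

typedef void (*void_fn)(void);

/* Internal linkage: no other ELF object can override this definition. */
static int demo_create_internal(void)
{
    return 42;
}

/* Exported entry point (hypothetical name). It can be interposed by
   another binary, but the lookup below never goes through it. */
int vkDemoCreate(void)
{
    return demo_create_internal();
}

/* Name-based lookup that hands out the internal function's address. */
void_fn demo_get_proc_addr(const char *name)
{
    if (!name || name[0] != 'v' || name[1] != 'k')
        return NULL;
    name += 2; /* skip the "vk" prefix */
    if (!strcmp(name, "DemoCreate"))
        return (void_fn)demo_create_internal;
    return NULL;
}
```

Because demo_create_internal is resolved at compile time inside its own translation unit, a same-named global in another library cannot change what the lookup returns.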