I'm working on a procfs kernel extension for macOS and trying to implement a feature that emulates Linux’s /proc/cpuinfo, similar to what FreeBSD does with its linprocfs. Since I'm trying to learn, and since not every bit of FreeBSD code can simply be copied over to XNU and be expected to work right out of the box, I'm writing this feature from scratch, with FreeBSD's and NetBSD's Linux-based procfs features as a reference.
Under Linux, $ cat /proc/cpuinfo shows me something like this:
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 33
model name : AMD Ryzen 9 5950X 16-Core Processor
stepping : 0
microcode : 0xa201016
cpu MHz : 2195.107
cache size : 512 KB
physical id : 0
siblings : 32
core id : 0
cpu cores : 16
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips : 6787.02
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
I’m using XNU’s i386_cpu_info structure in i386/cpuid.h to accomplish most of this, but so far I have not been able to get the ‘flags’ field to display correctly.
XNU’s i386/cpuid.h defines each feature flag as such:
#define CPUID_FEATURE_FPU _Bit(0) /* Floating point unit on-chip */
#define CPUID_FEATURE_VME _Bit(1) /* Virtual Mode Extension */
#define CPUID_FEATURE_DE _Bit(2) /* Debugging Extension */
#define CPUID_FEATURE_PSE _Bit(3) /* Page Size Extension */
#define CPUID_FEATURE_TSC _Bit(4) /* Time Stamp Counter */
...
And the i386_cpu_info structure has a field that should correspond to these definitions:
typedef struct i386_cpu_info {
...
uint64_t cpuid_features;
...
} i386_cpu_info_t;
In my ‘main’ do_cpuinfo() function, for example, I use one of these to check if the FPU feature is supported, as such:
/*
* Check if the FPU feature is present.
*/
char *fpu, *fpu_exception;
/*
* The cpuid_info() function sets up the i386_cpu_info structure and returns a pointer to the structure.
*/
if (cpuid_info()->cpuid_features & CPUID_FEATURE_FPU) {
fpu = "yes";
fpu_exception = "yes";
} else {
fpu = "no";
fpu_exception = "no";
}
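To illustrate the check above outside the kernel, here is a minimal userspace sketch of the same bit test. The BIT() macro and the FEATURE_* names are hypothetical stand-ins for XNU's _Bit() and CPUID_FEATURE_* definitions (this assumes _Bit(n) expands to a 64-bit value with only bit n set), and feature_yes_no() is a made-up helper, not something from i386/cpuid.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for XNU's _Bit() and CPUID_FEATURE_* macros. */
#define BIT(n)       (1ULL << (n))
#define FEATURE_FPU  BIT(0)
#define FEATURE_VME  BIT(1)
#define FEATURE_DE   BIT(2)

/*
 * Return "yes" if every bit in `mask` is also set in `features`,
 * otherwise "no". `features` plays the role of cpuid_features.
 */
static const char *
feature_yes_no(uint64_t features, uint64_t mask)
{
    assert(mask != 0);
    return ((features & mask) == mask) ? "yes" : "no";
}
```

With a made-up feature word of FEATURE_FPU | FEATURE_DE, feature_yes_no() gives "yes" for FEATURE_FPU and "no" for FEATURE_VME.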
And it works just as expected. However, once I start adding arrays into the mix, things start getting erratic.
I set up an array of strings containing the name of each flag, as such:
const char *feature_flags[] = {
/* 1 */ "fpu",
/* 2 */ "vme",
/* 3 */ "de",
/* 4 */ "pse",
/* 5 */ "tsc",
…
};
Then I set up a corresponding uint64_t array containing each flag as defined in i386/cpuid.h:
uint64_t feature_list[] = {
/* 1 */ CPUID_FEATURE_FPU,
/* 2 */ CPUID_FEATURE_VME,
/* 3 */ CPUID_FEATURE_DE,
/* 4 */ CPUID_FEATURE_PSE,
/* 5 */ CPUID_FEATURE_TSC,
…
};
Then I wrote a function that is supposed to iterate over these arrays and check whether each feature is present, using the same method I used for the FPU detection. Instead of writing out "cpuid_info()->cpuid_features & CPUID_FEATURE_FPU" for each flag, I want to test cpuid_info()->cpuid_features against each entry in the feature_list array and, if the feature is supported, append the matching string to the flags variable. The resulting string is then assigned to the static ret variable, so that the allocated memory can be freed before ret is returned.
char *
get_cpu_flags(void)
{
int i = 0;
char *flags = NULL;
static char *ret;
/*
* Allocate memory for our flag strings.
*/
flags = _MALLOC(sizeof(feature_flags), M_TEMP, M_WAITOK);
do {
/*
* If the CPU supports a feature in the feature_list[],
* move its corresponding flag from the feature_flags[]
* into the buffer.
*/
if ((cpuid_info()->cpuid_features & feature_list[i]) == feature_list[i]) {
strlcat(flags, feature_flags[i], strlen(feature_flags[i]));
// | | |
// | | * The length of the string I want to append to the 'flags' variable.
// | * The flag string in the array I want to append to the 'flags' variable.
// * The variable I want the flag string to be appended to.
}
/*
* Move the flag strings to a static variable before freeing the allocated memory
* so we can free it before returning the resulting string.
*/
ret = flags;
/*
* Add 1 to the counter for each iteration.
*/
i++;
/*
* If the counter exceeds the number of items in the array,
* break the loop.
*/
if (i > nitems(feature_flags)) {
/*
* Free the allocated memory before breaking the loop.
*/
_FREE(&feature_flags, M_TEMP);
break;
} else {
continue;
}
} while (i < nitems(feature_flags));
return ret;
}
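To make the intended behavior concrete, here is a minimal userspace sketch of walking two parallel arrays and joining the names of the set bits with spaces. It is not the kext code: BIT(), NITEMS(), flag_names, flag_bits and format_flags() are hypothetical names, the five entries mirror only the excerpt shown above, the caller supplies the buffer instead of using _MALLOC, and snprintf is swapped in for strlcat so the sketch builds with any C99 libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for _Bit() and nitems(). */
#define BIT(n)     (1ULL << (n))
#define NITEMS(a)  (sizeof(a) / sizeof((a)[0]))

/* Parallel arrays: flag_names[i] is the label for flag_bits[i]. */
static const char *const flag_names[] = { "fpu", "vme", "de", "pse", "tsc" };
static const uint64_t flag_bits[] = { BIT(0), BIT(1), BIT(2), BIT(3), BIT(4) };

/*
 * Write the names of all features set in `features` into dst,
 * separated by single spaces. Returns the string length, or -1
 * if dst is too small to hold the whole list.
 */
static int
format_flags(uint64_t features, char *dst, size_t dstsize)
{
    size_t used = 0;

    assert(dst != NULL && dstsize > 0);
    dst[0] = '\0';

    for (size_t i = 0; i < NITEMS(flag_bits); i++) {
        if ((features & flag_bits[i]) != flag_bits[i])
            continue;   /* feature not present, skip its name */

        /* Append at the current end; snprintf gets the space that is left. */
        int n = snprintf(dst + used, dstsize - used, "%s%s",
            used ? " " : "", flag_names[i]);
        if (n < 0 || (size_t)n >= dstsize - used)
            return -1;  /* output would have been truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

Called with BIT(0) | BIT(2) | BIT(4) and a 32-byte buffer, format_flags() fills the buffer with "fpu de tsc" and returns 10. The size passed to snprintf is always the room remaining in the destination, not the length of the string being appended.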
This function then gets called by the main do_cpuinfo() function that prints the info into userspace. To save space I’m providing a minimal example here dealing only with the ‘flags’ field:
int
procfs_docpuinfo(__unused procfsnode_t *pnp, uio_t uio, __unused vfs_context_t ctx)
{
vm_offset_t pageno, uva, kva;
int len = 0, xlen = 0;
off_t page_offset = 0;
size_t buffer_size = 0;
char *buffer;
uint32_t max_cpus = *_processor_count;
uint32_t cnt_cpus = 0;
/*
* Set up the variables required to move our data into userspace.
*/
kva = VM_MIN_KERNEL_ADDRESS; // kernel virtual address
uva = uio_offset(uio); // user virtual address
pageno = trunc_page(uva); // page number
page_offset = uva - pageno; // page offset
buffer_size = sizeof(i386_cpu_info_t) + (LBFSZ * 2); // buffer size
buffer = _MALLOC(buffer_size, M_TEMP, M_WAITOK); // buffer
char *flags = get_cpu_flags();
do {
if (cnt_cpus <= max_cpus) {
/*
* len should snprintf our flags via uiomove.
*/
len = snprintf(buffer, buffer_size, "flags\t\t\t: %s\n", flags);
/*
* Subtract the uva offset from len.
*/
xlen = len - uva;
xlen = imin(xlen, uio_resid(uio));
/*
* Copy our data into userspace.
*/
uiomove(buffer, xlen, uio);
/*
* Set len back to 0 before entering into the next loop.
*/
if (len != 0) {
len = 0;
}
/*
* Update the CPU counter.
*/
cnt_cpus++;
/*
* Continue unless the counter exceeds the
* available processor count.
*/
continue;
} else if (cnt_cpus > max_cpus) {
/*
* If the counter exceeds the processor count,
* free the associated memory and break the loop.
*/
_FREE(&buffer, M_TEMP);
break;
}
} while (cnt_cpus < max_cpus);
return 0;
}
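The offset and length arithmetic in a read handler like this can be sketched in userspace as follows. This is only one common reading of that arithmetic, under stated assumptions: copy_window() is a hypothetical helper, memcpy stands in for uiomove(), and the offset and resid parameters stand in for uio_offset(uio) and uio_resid(uio).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the part of src[0..len) that starts at `offset` into dst,
 * limited to `resid` bytes. Returns the number of bytes copied,
 * which is 0 once the offset is at or past the end of the data.
 */
static size_t
copy_window(const char *src, size_t len, size_t offset,
    char *dst, size_t resid)
{
    assert(src != NULL && dst != NULL);

    if (offset >= len)
        return 0;               /* nothing left to read */

    size_t n = len - offset;    /* bytes remaining after the offset */
    if (n > resid)
        n = resid;              /* never exceed what the reader asked for */

    memcpy(dst, src + offset, n);
    return n;
}
```

For the 11-byte string "flags: fpu\n", an offset of 7 and an 8-byte destination, copy_window() copies the 4 bytes "fpu\n". Note that the copy starts at src + offset rather than at the start of the buffer.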
However, this is the result I get when executing cat on cpuinfo (with the full function intact obviously and not the minimal example provided above):
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 1
model name : AMD Ryzen 9 5950X 16-Core Processor
microcode : 186
stepping : 0
cpu MHz : 3393.62
cache size : 512 KB
physical id : 0
siblings : 32
core id : 0
cpu cores : 16
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpappclm syscalpre
bugs :
bogomips : 6786.62
TLB size : 2560 4K pages
clflush_size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management:
As you can see, the flags field contains only fragments of the flag strings run together, rather than the full list of flags.
Note: The space you see is the result of there being four categories of features (cpuid_features, cpuid_extfeatures, cpuid_leaf7_features and cpuid_leaf7_extfeatures) in XNU, so I made a function like get_cpu_flags() for each category. As such, in my main code the snprintf function is expecting four sets of strings with a space between them ("flags\t\t\t: %s %s %s %s\n", cpuflags, cpuextflags, leaf7flags, leaf7extflags), yet it only prints out two of them, and neither is printed correctly once in userspace.
I assume the issue arises within the get_cpu_flags() function and its sister functions, but I’m not sure what the matter actually is. If I move ‘ret = flags’ out of the loop and put _FREE(&feature_flags, M_TEMP) right after it, I get a kernel panic and this is the stack trace:
panic(cpu 18 caller 0xffffff8019a9b8c6): "address 0xffffff7fb66cd110 inside vm entry 0xffffff802abc21e0 [0xffffff7f9a410000:0xffffff8000000000), map 0xffffff802abb10f8"@/System/Volumes/Data/SWE/macOS/BuildRoots/36806d33d2/Library/Caches/com.apple.xbs/Sources/xnu/xnu-7195.141.19/osfmk/kern/kalloc.c:651
Backtrace (CPU 18), Frame : Return Address
0xffffffa0ef283690 : 0xffffff8019a8c26d mach_kernel : _handle_debugger_trap + 0x3fd
0xffffffa0ef2836e0 : 0xffffff8019bd3993 mach_kernel : _kdp_i386_trap + 0x143
0xffffffa0ef283720 : 0xffffff8019bc3f8a mach_kernel : _kernel_trap + 0x55a
0xffffffa0ef283770 : 0xffffff8019a30a2f mach_kernel : _return_from_trap + 0xff
0xffffffa0ef283790 : 0xffffff8019a8ba8d mach_kernel : _DebuggerTrapWithState + 0xad
0xffffffa0ef2838b0 : 0xffffff8019a8bd83 mach_kernel : _panic_trap_to_debugger + 0x273
0xffffffa0ef283920 : 0xffffff801a29c8da mach_kernel : _panic + 0x54
0xffffffa0ef283990 : 0xffffff8019a9b8c6 mach_kernel : _ipc_thread_port_unpin + 0x116
0xffffffa0ef2839c0 : 0xffffff8019a9bf33 mach_kernel : _kfree + 0x263
0xffffffa0ef283a10 : 0xffffff8019a9be3f mach_kernel : _kfree + 0x16f
0xffffffa0ef283a70 : 0xffffff7fb66c5713 com.stupid.filesystems.procfs : _get_cpu_flags + 0xe3
0xffffffa0ef283aa0 : 0xffffff7fb66c4f4f com.stupid.filesystems.procfs : _procfs_docpuinfo + 0x24f
0xffffffa0ef283d50 : 0xffffff7fb66ca4c0 com.stupid.filesystems.procfs : _procfs_vnop_read + 0xb0
0xffffffa0ef283d90 : 0xffffff8019d4449c mach_kernel : _utf8_normalizeOptCaseFoldAndMatchSubstring + 0x72c
0xffffffa0ef283e30 : 0xffffff801a04c3b8 mach_kernel : _read_nocancel + 0x328
0xffffffa0ef283ee0 : 0xffffff801a04c145 mach_kernel : _read_nocancel + 0xb5
0xffffffa0ef283f40 : 0xffffff801a13ed0e mach_kernel : _unix_syscall64 + 0x2ce
0xffffffa0ef283fa0 : 0xffffff8019a311f6 mach_kernel : _hndl_unix_scall64 + 0x16
Kernel Extensions in backtrace:
com.stupid.filesystems.procfs(1.0)[89308435-3658-3ED4-990A-F8AF63358857]@0xffffff7fb66c3000->0xffffff7fb66ccfff
For the record, I’m just an amateur trying to learn C and kernel programming on my own because it’s something that has fascinated me for a long time. I’m fairly new to working with character arrays and memory allocation, so any advice would be deeply appreciated. I asked a similar question before, but it was pointed out to me that I should be more specific and provide more extensive examples. Back then I was also struggling exclusively with kernel panics, but now that doesn’t seem to be the main issue, so I’ve deleted my old question and filed this new one based on the tips I received and my progress since then. I hope I managed to present my question better this time but, if not, please let me know and I’ll try to improve.