How does the kernel know the address of a process file table?

Time:12-02

A file descriptor contains the index of an entry within the process file table. However, the index alone is not enough to locate a particular entry in the [process] file table. Knowledge about the address of the first entry within the table is also required. So, my question is this: How does the kernel, only provided with the file descriptor as an argument in system calls such as read and write, manage to determine the location of the intended entry within the process file table?

I tried to see what happens under the hood by converting the following C code into x86-64 assembly, but all I got was an additional assembly open instruction.

#include <stdio.h>

int main(int argc, char* argv[]) {

    FILE* fd = fopen("home/mhdi/miles","r");

    return 0;
}

    .file   "open.c"
    .intel_syntax noprefix
    .text
    .section    .rodata
.LC0:
    .string "r"
.LC1:
    .string "home/mhdi/miles"
    .text
    .globl  main
    .type   main, @function
main:
.LFB6:
    .cfi_startproc
    endbr64
    push    rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    mov rbp, rsp
    .cfi_def_cfa_register 6
    sub rsp, 32
    mov DWORD PTR -20[rbp], edi
    mov QWORD PTR -32[rbp], rsi
    lea rax, .LC0[rip]
    mov rsi, rax
    lea rax, .LC1[rip]
    mov rdi, rax
    call    fopen@PLT
    mov QWORD PTR -8[rbp], rax
    mov eax, 0
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE6:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0"
    .section    .note.GNU-stack,"",@progbits
    .section    .note.gnu.property,"a"
    .align 8
    .long   1f - 0f
    .long   4f - 1f
    .long   5
0:
    .string "GNU"
1:
    .align 8
    .long   0xc0000002
    .long   3f - 2f
2:
    .long   0x3
3:
    .align 8
4:

CodePudding user response:

A file descriptor contains the index of an entry within the process file table. However, the index alone is not enough to locate a particular entry in the [process] file table. Knowledge about the address of the first entry within the table is also required. So, my question is this: How does the kernel, only provided with the file descriptor as an argument in system calls such as read and write, manage to determine the location of the intended entry within the process file table?

A file descriptor is an integer that the kernel hands to a process to identify one of its open files; it is an index into that process's file table. Because the kernel and the user process do not share the same virtual address space, the process needs some way to tell the kernel which of its open files an operation applies to (a process can have several files open at the same time), and the descriptor is that way.

The user process has no way to access the per-process file table itself. The kernel keeps it in its own private data for the process, and it is not mapped into the virtual address space of the user process. This also answers the question: the kernel always knows which process is making the system call, so it reaches the table through its own record for that process and uses the descriptor only as the index. In Linux, for example, the lookup goes roughly through current->files->fdt->fd[fd].

Historically the table was stored in a per-process private area called the u-area. Today the structure has changed a great deal, and the per-process data includes things like the inode of the process's root directory (for searches that start at the root), the inode of the current working directory (for searches relative to it), the resource limits for the process (in-core memory limit, max file size, max execution time, max memory to allocate...), the process umask, the user and group IDs, the open file table array (whose index is the actual file descriptor of the file), the process session ID, and the kernel stack the process uses when running in kernel mode. In a multithreaded operating system the kernel also maintains a per-thread data structure for things like the saved user-mode CPU register contents.
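The lookup can be modeled in ordinary user-space C. This is only a sketch of the idea, not kernel code: struct process, struct open_file, current_process, fd_install and fd_lookup are all hypothetical names invented for the illustration, and MAX_FDS is an arbitrary limit. The point it shows is that the base address of the table comes from the kernel's own notion of "the process that is running now", while the caller supplies nothing but the index.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FDS 16                          /* arbitrary limit for this model */

struct open_file {                          /* stand-in for the kernel's open-file object */
    const char *name;
    long        offset;
};

struct process {                            /* stand-in for the per-process kernel data */
    int               pid;
    struct open_file *fd_table[MAX_FDS];    /* index == file descriptor */
};

/* The kernel always knows which process is running on the CPU. */
struct process *current_process;

/* Put a file in the lowest free slot; returns the new fd, or -1 if the table is full. */
int fd_install(struct process *p, struct open_file *f)
{
    for (int fd = 0; fd < MAX_FDS; fd++) {
        if (p->fd_table[fd] == NULL) {
            p->fd_table[fd] = f;
            return fd;
        }
    }
    return -1;
}

/* First step of a read(fd, ...)-style call: the table is found through
 * current_process, the caller only supplies the index. */
struct open_file *fd_lookup(int fd)
{
    assert(current_process != NULL);        /* a system call always has a caller */
    if (fd < 0 || fd >= MAX_FDS)
        return NULL;                        /* a real kernel would fail with EBADF */
    return current_process->fd_table[fd];
}
```

In this model the same descriptor value, say 0, resolves to different files depending on which process current_process points at, which is why a descriptor is meaningless outside the process that owns it.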

I tried to see what happens under the hood by converting the following C code into x86-64 assembly, but all I got was an additional assembly open instruction.

What you got was a call to the fopen(3) library routine, not a system call.
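To see that the FILE* returned by fopen(3) is a library object wrapped around a kernel file descriptor, you can ask the library for the descriptor with fileno(3). The helper below is a minimal sketch; descriptor_of is a made-up name, and it assumes a POSIX system.

```c
#include <stdio.h>

/* Opens path with stdio and reports the kernel file descriptor that the
 * FILE object wraps, or -1 if the file could not be opened. */
int descriptor_of(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int fd = fileno(fp);    /* the small integer the kernel actually sees */
    fclose(fp);             /* also closes the underlying descriptor */
    return fd;
}
```

The value it returns is the only thing that read(2) and write(2) would ever pass to the kernel; the FILE structure and its buffer live entirely in user space.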

To get under the hood, you need to start with the kernel source code. Listing the assembly will only lead you as far as one special instruction: the interface to the kernel is normally a dedicated machine instruction that forces a software trap. You will see it as a single assembler instruction, and you cannot trace any further than that. On x86-64 Linux that instruction is syscall; 32-bit x86 Linux traditionally used INT 0x80.

In this case you have disassembled code that calls fopen(3), which is not a system call but a standard library function. What you see is therefore not the special instruction mentioned above, just a normal subroutine call. Had you called open(2), the actual system call that fopen(3) ends up making, you would have seen a similar call open instruction, because all system calls are wrapped in C functions. The wrapper does some housekeeping: it places the parameters where the kernel expects them, executes the trap instruction (syscall on x86-64, INT 0x80 on 32-bit x86), which transfers control to a kernel entry point and raises the privilege level of the processor to ring 0, and then processes what comes back from the kernel, for example by turning an error return into errno. If the process has signals pending, their handlers are run as the kernel returns to user mode.

But what happens in the kernel is hidden from you, because it is not accessible to the running process. To the process, a system call looks like the execution of a single machine instruction: just as you cannot observe the CPU state at each stage inside a single instruction, you cannot observe what happened between executing the trap instruction and the next instruction you execute.
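A small Linux-only sketch makes the wrapper layering visible. open_with_wrapper and open_with_raw_syscall are hypothetical helper names; the second one uses glibc's generic syscall(2) function with SYS_openat, which is an assumption about the platform (Linux with glibc or a compatible libc). Note that syscall(2) is itself a thin library wrapper around the trap instruction, so even here the kernel side stays out of sight.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the syscall() declaration under strict modes */
#endif
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The usual route: the libc wrapper for open(2). */
int open_with_wrapper(const char *path)
{
    return open(path, O_RDONLY);
}

/* One layer lower: name the system call by number and let syscall(2)
 * load the registers and execute the trap instruction. */
int open_with_raw_syscall(const char *path)
{
    return (int)syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
}
```

Both functions return a plain small integer on success. Neither gives the process any pointer into the kernel's file table, which is the whole point of the descriptor being just an index.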
