If there is an 8-core CPU, can the OS theoretically write 8 files at the same time without context switching?
Is there a big difference in speed between writing eight 1 GB text files sequentially and writing them all at the same time (using threads or multiple processes)?
CodePudding user response:
You have big misconceptions about disk transfers. Disk I/O isn't done by the CPU nowadays. It used to be, back when the CPU had to read/write the data to/from an I/O port (https://wiki.osdev.org/ATA_PIO_Mode). ATA PIO mode is the latest (and still supported) hard-disk transfer mode that isn't DMA.
Otherwise, you have an AHCI controller, which is a PCI device that controls hard-disks. AHCI is a specification for interacting with modern SATA hard-disks (https://www.intel.ca/content/www/ca/en/io/serial-ata/serial-ata-ahci-spec-rev1-3-1.html). It doesn't control NVMe disks: NVMe is a separate class of device and must be controlled by a specialized PCI controller with a different interface (https://wiki.osdev.org/NVMe).
If I'm not wrong, PCI Express links are serial: each lane transfers one bit at a time. They can be very fast, and the PCI Express specification is used in high-end servers, with new revisions announced every so often by the PCI-SIG group that aim at providing new capabilities (higher speeds, new mechanisms, etc.).
With PCI devices, you write to certain conventional physical memory addresses, and those writes land in the PCI registers of the device. This is often called memory-mapped I/O (MMIO). It is certainly much more complex than what you describe in your question: a CPU with 8 cores doesn't write 8 files simultaneously. Even in a system without DMA, the I/O port being read can only be filled with data from one specific position on the hard-disk at a time; it cannot be read by 8 cores at once.
I'm not so sure about AHCI specifically, but PCI hardware interfaces normally require almost no locking. They have structures in RAM that must be modified, such as queues of operations. What is certain is that if one core modifies the queue, another core must not touch the same position in the queue at the same time. If 8 cores must do file I/O, the 8 cores will add their operations to the AHCI's queue of DMA jobs. I have never implemented an AHCI driver, but the AHCI specification states (in chapter 5):
The data structures of AHCI are built assuming that each port in an HBA contains its own DMA engine, that is, each port can be executed independently of any other port. This is a natural flow for software, since each device is generally treated as a separate execution thread. It is strongly recommended that HBA implementations proceed in this fashion.
Software presents a list of commands to the HBA for a port, which then processes them. For HBAs that have a command list depth of ‘1’, this is a single step operation, and software only presents a single command. For HBAs that support a command list, multiple commands may be posted.
Software posts new commands received by the OS to empty slots in the list, and sets the corresponding slot bit in the PxCI register. The HBA continuously looks at PxCI to determine if there are commands to transmit to the device.
Here, HBA refers to the silicon implementation of the AHCI specification (the actual chip). I'm not exactly sure what this means, but I think it means that several DMA jobs can happen at once, though probably not to/from the same device/hard-disk. Since most computers have only one main hard-disk, I don't expect 8 DMA jobs to run at once on the same disk just because "there are 8 cores in the CPU".
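To picture the "many cores, one command queue" idea in software terms, here is a rough analogy in Python, not real driver code (the thread count, queue and names are all made up for illustration): several producer threads post commands to one shared, thread-safe queue, and a single consumer drains it, so the submissions are concurrent but the transfers themselves are serialized.

import queue
import threading

# Rough software analogy (not a driver): 8 "cores" post write commands into
# one shared command queue, and a single "DMA engine" thread drains the
# queue and performs the transfers one at a time.
command_queue = queue.Queue()

def core_worker(core_id):
    # Each core posts its own I/O commands; the queue handles the locking,
    # so two cores never clobber the same slot.
    for block in range(3):
        command_queue.put((core_id, f"write block {block}"))

def dma_engine():
    # One engine services the queue, so transfers are serialized even
    # though 8 threads submitted them concurrently.
    while True:
        item = command_queue.get()
        if item is None:  # shutdown signal
            break
        core_id, cmd = item
        print(f"engine: processing '{cmd}' from core {core_id}")

engine = threading.Thread(target=dma_engine)
engine.start()

cores = [threading.Thread(target=core_worker, args=(i,)) for i in range(8)]
for t in cores:
    t.start()
for t in cores:
    t.join()

command_queue.put(None)  # tell the engine to stop
engine.join()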
CodePudding user response:
The OS can open many files and write to them at the same time regardless of the number of CPU cores, but the bottleneck is the hard-drive. Even if you have an 8-core CPU, all of the data you write still has to go to the hard-drive sequentially.
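If you want to see this on your own hardware, here is a minimal benchmark sketch in Python (the file names, file size and chunk size are arbitrary; os.fsync is called so the drive, not the RAM cache, is what gets measured). It writes 8 files sequentially and then again with 8 threads, so you can compare the two timings yourself.

import os
import time
from concurrent.futures import ThreadPoolExecutor

SIZE = 256 * 1024 * 1024          # 256 MiB per file here; raise to 1 GiB to match the question
CHUNK = b"x" * (4 * 1024 * 1024)  # write in 4 MiB chunks

def write_file(path):
    with open(path, "wb") as f:
        written = 0
        while written < SIZE:
            f.write(CHUNK)
            written += len(CHUNK)
        f.flush()
        os.fsync(f.fileno())      # force the data out of the page cache onto the drive

paths = [f"testfile_{i}.bin" for i in range(8)]

t0 = time.perf_counter()
for p in paths:
    write_file(p)
print("sequential:", time.perf_counter() - t0, "s")
for p in paths:
    os.remove(p)

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(write_file, paths))
print("8 threads: ", time.perf_counter() - t0, "s")
for p in paths:
    os.remove(p)

On a single spinning disk the two numbers are usually close (or the threaded run is slower due to seeking); the gap is what tells you whether the drive or the CPU was the limit.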
However, modern OSes don't write directly to the disk. If you open and write 8 files at the same time, the data is first stored in RAM and put in a queue for writing to the hard-drive; whenever the hard-drive is free, the OS takes that data from RAM and writes it to the disk.
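You can see that write-behind caching directly with a small sketch like the one below (the file name and size are made up for illustration): the write() call usually returns long before the data is on the drive, and os.fsync is where the real wait happens.

import os
import time

data = b"x" * (256 * 1024 * 1024)    # 256 MiB of data (illustrative size)

with open("cached_write.bin", "wb") as f:
    t0 = time.perf_counter()
    f.write(data)                    # usually returns quickly: data lands in the page cache (RAM)
    t_write = time.perf_counter() - t0

    t0 = time.perf_counter()
    f.flush()
    os.fsync(f.fileno())             # now wait until the drive has actually stored it
    t_sync = time.perf_counter() - t0

print(f"write() took {t_write:.3f}s, fsync() took {t_sync:.3f}s")
os.remove("cached_write.bin")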