Why does sleeping before a write slow down the subsequent write?


I'm seeing occasional slow performance when writing to a file (or socket), and while debugging I ended up with a small test harness that reproduces it. The strange behavior is that the longer I sleep in between writes, the longer the write itself takes. I reproduced the test harness in both Go and Python and see similar results in both, on both my Mac and Linux machines.

The key confusion is that the timing code in both cases looks like

sleep(x)
startTime = now()
write(msg)
endTime = now()
writeTime = endTime - startTime

where writeTime gets significantly longer as the sleep increases, despite the sleep not being included in the timing at all!

Python:

import time

f = open("/dev/null", "w")

msg = "0" * 1024


for sleepMicros in [0, 1, 10, 100, 1000]:
    totalSendTime = 0
    cntOver10us = 0
    for _ in range(10000):
        time.sleep(sleepMicros / 1000 / 1000)
        startSendTime = time.time()
        f.write(msg)
        sentTime = time.time()

        timeInSend = sentTime - startSendTime
        totalSendTime += timeInSend
        if timeInSend > 10 / 1000 / 1000:
            cntOver10us += 1

    print(f"sleep: {sleepMicros}us\tTotal time in send: {totalSendTime * 1000}ms\tCount over 10us: {cntOver10us}")

Go:

package main

import (
    "fmt"
    "os"
    "time"
)

func main() {
    f, _ := os.Create("/dev/null")

    msg := make([]byte, 1024)

    for _, sleepMicros := range []time.Duration{0, 1, 10, 100, 1000} {
        totalSendTime := time.Duration(0)
        cntOver10us := 0
        sleepDuration := sleepMicros * time.Microsecond
        for i := 0; i < 10000; i++ {
            time.Sleep(sleepDuration)
            startSendTime := time.Now()
            _, err := f.Write(msg)
            sentTime := time.Now()

            if err != nil {
                fmt.Println("SendMessage got error", err)
            }
            timeInSend := sentTime.Sub(startSendTime)
            totalSendTime += timeInSend
            if timeInSend > 10*time.Microsecond {
                cntOver10us++
            }
        }
        fmt.Println("sleep:", sleepDuration, "\tTotal time in send:", totalSendTime, "\tCount over 10us:", cntOver10us)
    }
}

Results:

Python:

(venv) % python write_to_devnull_mini.py
sleep: 0us  Total time in send: 2.109527587890625ms Count over 10us: 0
sleep: 1us  Total time in send: 1.9390583038330078ms    Count over 10us: 0
sleep: 10us Total time in send: 1.9574165344238281ms    Count over 10us: 0
sleep: 100us    Total time in send: 5.476236343383789ms Count over 10us: 9
sleep: 1000us   Total time in send: 50.98319053649902ms Count over 10us: 1179

Go:

(venv) % go run write_to_devnull_mini.go
sleep: 0s   Total time in send: 5.724469ms  Count over 10us: 0
sleep: 1µs  Total time in send: 6.404582ms  Count over 10us: 1
sleep: 10µs     Total time in send: 4.346918ms  Count over 10us: 0
sleep: 100µs    Total time in send: 10.909356ms     Count over 10us: 8
sleep: 1ms  Total time in send: 112.627213ms    Count over 10us: 1307

My best guess is that this is some artifact of the scheduler, but I'd love to understand what's happening. How does sleeping for 1ms in between writes cause the total time spent between calls to time.Now() to go up by 20x, and the number of "slow" (>10us) writes to increase 200-fold?

CodePudding user response:

You omitted details like core count, scheduler, and load from other processes.

It could be that frequent sleeping is reducing your dynamic priority. That's different from the user-set nice value. Your process is predicted to use only a few cycles per second, so it's OK to preempt the CPython bytecode interpreter with a higher-priority process that has a tighter deadline.

But it could also be that frequent long sleeps are allowing other processes to dirty the cache, so you suffer more cache misses. Or, similarly, you lack CPU affinity and keep getting rescheduled onto different cores. Instrument the cache, or look at the PSR column reported by ps, if you want to investigate that.
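
If you want to test the affinity theory directly, here is a minimal sketch (Linux-only, since macOS has no sched_setaffinity; treat it as an illustration, not a fix) that pins the benchmark to a single core before the loop runs:

import os

# Linux-only: pin this process to CPU 0 so the benchmark cannot be migrated
# between cores while it runs. If the >10us writes go away once pinned,
# cross-core rescheduling (and the cold caches that come with it) was the
# likely culprit.
os.sched_setaffinity(0, {0})                     # pid 0 = the current process
print("allowed CPUs:", os.sched_getaffinity(0))

# ...then run the sleep/write loop from the question unchanged...

Alternatively, on Linux you can watch ps -o pid,psr -p <pid> while the loop runs and see whether the PSR (processor) column keeps changing.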

A simple thing to look at would be

import resource
...
r = resource.getrusage(resource.RUSAGE_SELF)

and note patterns in how the voluntary (r.ru_nvcsw) and involuntary (r.ru_nivcsw) context switch counts increment. Also, verify that r.ru_minflt (minor page faults) remains constant.

Hmmm, come to think of it, slowness due to L3 cache misses should show up as non-constant user- and sys-mode time deltas reported by getrusage().
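
For instance, a rough sketch of that instrumentation (the write loop is copied from the question, minus the per-write timing; the fields are the standard getrusage ones) might look like:

import resource
import time

def snapshot():
    r = resource.getrusage(resource.RUSAGE_SELF)
    # voluntary ctx switches, involuntary ctx switches, minor faults, user s, sys s
    return (r.ru_nvcsw, r.ru_nivcsw, r.ru_minflt, r.ru_utime, r.ru_stime)

f = open("/dev/null", "w")
msg = "0" * 1024

for sleepMicros in [0, 1, 10, 100, 1000]:
    before = snapshot()
    for _ in range(10000):
        time.sleep(sleepMicros / 1000 / 1000)
        f.write(msg)
    after = snapshot()
    nvcsw, nivcsw, minflt, utime, stime = (a - b for a, b in zip(after, before))
    print(f"sleep: {sleepMicros}us\tvoluntary: {nvcsw}\tinvoluntary: {nivcsw}"
          f"\tminor faults: {minflt}\tuser: {utime:.4f}s\tsys: {stime:.4f}s")

If the involuntary switch count or the user/sys deltas climb with the sleep length, that points at preemption and cold caches rather than at the write path itself.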

CodePudding user response:

Wait - this looks suspicious:

totalSendTime := time.Duration(0)
cntOver10us := 0
for _, sleepMicros := range []time.Duration{0, 1, 10, 100, 1000} {
    sleepDuration := sleepMicros * time.Microsecond
    for i := 0; i < 10000; i++ {
        ...
    }
    fmt.Println("sleep:", sleepDuration, "\tTotal time in send:", totalSendTime, "\tCount over 10us:", cntOver10us)
}

Don't you really want to reset totalSendTime and cntOver10us on each iteration? Otherwise, totalSendTime and cntOver10us are going to keep going up no matter how long you sleep.

for _, sleepMicros := range []time.Duration{0, 1, 10, 100, 1000} {
    totalSendTime := time.Duration(0)
    cntOver10us := 0
    sleepDuration := sleepMicros * time.Microsecond
    for i := 0; i < 10000; i++ {
        ...
    }
    fmt.Println("sleep:", sleepDuration, "\tTotal time in send:", totalSendTime, "\tCount over 10us:", cntOver10us)
}

Otherwise, your totalSendTime for each outer loop is only going to go up, since you never reset it.
