Occasional crash in Swift async/await concurrency code - only in release builds

I'm hitting an occasional crash in some code which uses Swift's new concurrency features. This crash never seems to happen on development builds, either in the simulator or when I install the code on a device directly from Xcode. However it's happening pretty frequently when folks install the code from TestFlight.

The actual crash is this:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000040
Exception Codes: 0x0000000000000001, 0x0000000000000040

Thread 6 name:
Thread 6 Crashed:
0   libswift_Concurrency.dylib          0x00000001e6df7440 swift::TaskGroup::offer(swift::AsyncTask*, swift::AsyncContext*) + 504 (TaskGroup.cpp:603)
1   libswift_Concurrency.dylib          0x00000001e6df3c5c swift::AsyncTask::completeFuture(swift::AsyncContext*) + 132 (Task.cpp:180)
2   libswift_Concurrency.dylib          0x00000001e6df54d0 completeTaskAndRelease(swift::AsyncContext*, swift::SwiftError*) + 128 (Task.cpp:323)
3   libswift_Concurrency.dylib          0x00000001e6df5425 completeTaskWithClosure(swift::AsyncContext*, swift::SwiftError*) + 1 (Task.cpp:365)

I'm not sure exactly what line(s) of code this corresponds with, but it's likely somewhere in here:

class MyClass {

    var results: [String: Set<String>] = [:]

    func getSingleResult(index: Int) async -> (String, Set<String>) {
        // ....
    }

    func getAllResults(range: ClosedRange<Int>) async -> [String: Set<String>] {
        await withTaskGroup(
                of: (String, Set<String>).self,
                returning: [String: Set<String>].self
            ) { [self] group in
                for i in range {
                    group.addTask { await getSingleResult(index: i) }
                }

                var results: [String: Set<String>] = [:]

                for await result in group {
                    results[result.0] = result.1
                }

                return results
            }
    }
    
    func blockingDoWork(range: ClosedRange<Int>) {
        results = [:]

        let dispatchGroup = DispatchGroup()
        dispatchGroup.enter()

        Task.init {
            results = await getAllResults(range: range)
            dispatchGroup.leave()
        }
        
        dispatchGroup.wait()

        // Now do something with `results`
    }
}

I'm trying to bridge the divide between synchronous and asynchronous code (perhaps the wrong way). Basically, I have single-threaded, synchronous code that creates a variable number of asynchronous calls and attempts to block until all of those calls finish. It aggregates the results of those calls in the `results` class member, which perhaps isn't the best approach, but it was the only way I could get the asynchronous code to communicate with the synchronous code.

This code seems to run fine in a development build. In a release build it runs thousands of times and then crashes.

I can't seem to turn on the thread sanitizer or address sanitizer because my Xcode project uses Swift Package Manager, and there's a bug when using those two which causes the build to fail.

Any ideas what might be going wrong? I assume I'm getting lucky with the development builds and that I've got some fundamental problem in this code, but I don't know enough about the subtleties of Swift's new concurrency features to recognize it.

Answer:

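You cannot use semaphores (or other thread-blocking primitives, such as the dispatch group wait in your code) in conjunction with async-await. See the WWDC 2021 video Swift concurrency: Behind the scenes: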

[Primitives] like semaphores ... are unsafe to use with Swift concurrency. This is because they hide dependency information from the Swift runtime, but introduce a dependency in execution in your code. Since the runtime is unaware of this dependency, it cannot make the right scheduling decisions and resolve them. In particular, do not use primitives that create unstructured tasks and then retroactively introduce a dependency across task boundaries by using a semaphore or an unsafe primitive. Such a code pattern means that a thread can block indefinitely against the semaphore until another thread is able to unblock it. This violates the runtime contract of forward progress for threads.

You might consider testing with the LIBDISPATCH_COOPERATIVE_POOL_STRICT environment variable set to 1, as discussed in the same video.
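The variable is normally set in the Xcode scheme (Run > Arguments > Environment Variables). As a minimal sketch, a debug build could confirm at launch that the setting actually reached the process. The helper name `isStrictCooperativePoolEnabled` is hypothetical and not part of any Apple API; only `ProcessInfo.processInfo.environment` is a real Foundation call.

```swift
import Foundation

/// Hypothetical helper: reports whether LIBDISPATCH_COOPERATIVE_POOL_STRICT=1
/// is present in the given environment (defaults to the current process).
func isStrictCooperativePoolEnabled(
    environment: [String: String] = ProcessInfo.processInfo.environment
) -> Bool {
    return environment["LIBDISPATCH_COOPERATIVE_POOL_STRICT"] == "1"
}

// Example: log the state once during startup of a debug build.
if isStrictCooperativePoolEnabled() {
    print("Strict cooperative pool checking is enabled")
}
```

This only reports whether the variable is set; the checking itself is done by the runtime, not by this code.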

You said:

I'm trying to bridge the divide between synchronous and asynchronous code (perhaps the wrong way).

You should refactor the code that calls this synchronous method to adopt an asynchronous pattern, and then excise all blocking APIs (e.g., semaphore wait, dispatch group wait, etc.). Those were anti-patterns in the GCD world and are to be avoided within Swift concurrency. I understand why developers who are unfamiliar with asynchronous programming are so attracted to those synchronous anti-patterns, but it has always been a mistake, and should be excised from one’s code.
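As a rough sketch of what that refactoring might look like for the code in the question: the type name `ResultsLoader` and the method name `doWork(range:)` are hypothetical, and the body of `getSingleResult` is a placeholder because the original is elided. The sketch also assumes the results can be returned to the caller rather than stored in a shared property.

```swift
// Hypothetical sketch; names and the placeholder body are assumptions.
struct ResultsLoader: Sendable {

    func getSingleResult(index: Int) async -> (String, Set<String>) {
        // Placeholder standing in for the elided original implementation.
        return ("key\(index)", ["value\(index)"])
    }

    func getAllResults(range: ClosedRange<Int>) async -> [String: Set<String>] {
        await withTaskGroup(
            of: (String, Set<String>).self,
            returning: [String: Set<String>].self
        ) { group in
            // One child task per index, as in the question.
            for i in range {
                group.addTask { await self.getSingleResult(index: i) }
            }

            var results: [String: Set<String>] = [:]
            for await result in group {
                results[result.0] = result.1
            }
            return results
        }
    }

    // Async replacement for `blockingDoWork`: no DispatchGroup and no wait(),
    // so no thread is ever blocked while the child tasks run.
    func doWork(range: ClosedRange<Int>) async {
        let results = await getAllResults(range: range)
        // Now do something with `results`
        print("Received \(results.count) results")
    }
}
```

A caller that is still synchronous (for example, a UI event handler) would start the work with `Task { await ResultsLoader().doWork(range: 1...10) }` and return immediately instead of waiting for it to finish.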

Bottom line, in Swift concurrency one must “maintain a runtime contract that threads are always able to make forward progress.” Just embrace asynchronous patterns (i.e., stay within async-await without any old-school thread-blocking techniques) and you should be good.

FWIW, the WWDC 2021 video Swift concurrency: Update a sample app shows interesting techniques for incrementally updating an old app. E.g., mark this blocking method as deprecated, and the compiler will then warn you wherever it is called, so you can direct your refactoring efforts to those offending routines.
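A minimal sketch of the deprecation idea, assuming a hypothetical class `LegacyWorker` and a hypothetical async replacement named `doWork(range:)`. The `@available(*, deprecated, message:)` attribute is standard Swift; the method bodies are placeholders.

```swift
// Hypothetical sketch; class name, replacement method, and bodies are assumptions.
final class LegacyWorker {

    // Every remaining call site now produces a compiler warning,
    // which gives a to-do list of callers to migrate.
    @available(*, deprecated, message: "Blocks the calling thread; use the async doWork(range:) instead")
    func blockingDoWork(range: ClosedRange<Int>) {
        // The existing blocking implementation stays here until all callers have migrated.
    }

    // Async replacement that callers migrate to, one call site at a time.
    func doWork(range: ClosedRange<Int>) async -> Int {
        // Placeholder: the real work would go here. Returns the number of indices handled.
        return range.count
    }
}
```

Once no warnings remain, the deprecated method and its blocking code can be deleted outright.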