I have an `AVAssetWriter` recording a video with an applied filter, which I then play back via `AVQueuePlayer`.
My issue is that on playback, the recorded video displays a black/blank screen for the first frame. To my understanding, this is because the writer captures audio before capturing the first actual video frame.
To attempt to resolve this, I added a boolean check when appending to the audio writer input, so that no audio is appended until the first video frame has been appended to the adaptor. That said, I still saw a black frame on playback, despite having printed out the timestamps, which showed video preceding audio. I also tried adding a check to start the write session only when `output == _videoOutput`, but ended up with the same result.
Any guidance or other workaround would be appreciated.
```swift
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer).seconds
    if output == _videoOutput {
        if connection.isVideoOrientationSupported { connection.videoOrientation = .portrait }
        guard let cvImageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let ciImage = CIImage(cvImageBuffer: cvImageBuffer)
        guard let filteredCIImage = applyFilters(inputImage: ciImage) else { return }
        self.ciImage = filteredCIImage
        guard let cvPixelBuffer = getCVPixelBuffer(from: filteredCIImage) else { return }
        self.cvPixelBuffer = cvPixelBuffer
        self.ciContext.render(filteredCIImage, to: cvPixelBuffer, bounds: filteredCIImage.extent, colorSpace: CGColorSpaceCreateDeviceRGB())
        metalView.draw()
    }
    switch _captureState {
    case .start:
        guard let outputUrl = tempURL else { return }
        let writer = try! AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
        let videoSettings = _videoOutput!.recommendedVideoSettingsForAssetWriter(writingTo: .mp4)
        let videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoSettings)
        videoInput.mediaTimeScale = CMTimeScale(bitPattern: 600)
        videoInput.expectsMediaDataInRealTime = true
        let pixelBufferAttributes = [
            kCVPixelBufferCGImageCompatibilityKey: NSNumber(value: true),
            kCVPixelBufferCGBitmapContextCompatibilityKey: NSNumber(value: true),
            kCVPixelBufferPixelFormatTypeKey: NSNumber(value: Int32(kCVPixelFormatType_32ARGB))
        ] as [String: Any]
        let adapter = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: videoInput, sourcePixelBufferAttributes: pixelBufferAttributes)
        if writer.canAdd(videoInput) { writer.add(videoInput) }
        let audioSettings = _audioOutput!.recommendedAudioSettingsForAssetWriter(writingTo: .mp4) as? [String: Any]
        let audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioSettings)
        audioInput.expectsMediaDataInRealTime = true
        if writer.canAdd(audioInput) { writer.add(audioInput) }
        _filename = outputUrl.absoluteString
        _assetWriter = writer
        _assetWriterVideoInput = videoInput
        _assetWriterAudioInput = audioInput
        _adapter = adapter
        _captureState = .capturing
        _time = timestamp
        writer.startWriting()
        writer.startSession(atSourceTime: CMTime(seconds: timestamp, preferredTimescale: CMTimeScale(600)))
    case .capturing:
        if output == _videoOutput {
            if _assetWriterVideoInput?.isReadyForMoreMediaData == true {
                let time = CMTime(seconds: timestamp, preferredTimescale: CMTimeScale(600))
                _adapter?.append(self.cvPixelBuffer, withPresentationTime: time)
                if !hasWrittenFirstVideoFrame { hasWrittenFirstVideoFrame = true }
            }
        } else if output == _audioOutput {
            if _assetWriterAudioInput?.isReadyForMoreMediaData == true, hasWrittenFirstVideoFrame {
                _assetWriterAudioInput?.append(sampleBuffer)
            }
        }
    case .end:
        guard _assetWriterVideoInput?.isReadyForMoreMediaData == true, _assetWriter!.status != .failed else { break }
        _assetWriterVideoInput?.markAsFinished()
        _assetWriterAudioInput?.markAsFinished()
        _assetWriter?.finishWriting { [weak self] in
            guard let output = self?._assetWriter?.outputURL else { return }
            self?._captureState = .idle
            self?._assetWriter = nil
            self?._assetWriterVideoInput = nil
            self?._assetWriterAudioInput = nil
            self?.previewRecordedVideo(with: output)
        }
    default:
        break
    }
}
```
CodePudding user response:
It's true that in the `.capturing` state you make sure the first sample buffer written is a video sample buffer by discarding preceding audio sample buffers. However, you are still allowing an audio sample buffer's presentation timestamp to start the timeline with `writer.startSession(atSourceTime:)`. This means your video starts with nothing: not only do you briefly hear nothing (which is hard to notice), you also see nothing, which your video player happens to represent with a black frame.

From this point of view, there are no black frames to remove; there is only a void to fill. You can fill this void by starting the session from the first video timestamp.
This can be achieved by guarding against non-video sample buffers in the `.start` state, or, less cleanly, by moving `writer.startSession(atSourceTime:)` into the `if !hasWrittenFirstVideoFrame {}` block.
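A sketch of the first approach, reusing the state machine and variable names from the question (the elided writer/input/adaptor setup is unchanged from your `.start` case):

```swift
case .start:
    // Ignore audio until video arrives, so the session's source time
    // (and therefore the movie's timeline) begins at the first video frame.
    guard output == _videoOutput else { break }

    // ... create the writer, inputs, and pixel buffer adaptor exactly as before ...

    writer.startWriting()
    // Pass the sample buffer's own CMTime straight through instead of
    // round-tripping through Double seconds.
    writer.startSession(atSourceTime: CMSampleBufferGetPresentationTimeStamp(sampleBuffer))
    _captureState = .capturing
```

With this guard in place, the boolean check in `.capturing` becomes a belt-and-braces measure rather than the fix itself, since no audio can start the session anymore.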
p.s. Why do you convert back and forth between `CMTime` and seconds? Why not stick with `CMTime`?
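For example, in your `.capturing` case the round trip through `Double` seconds could be dropped entirely (a sketch; only the timestamp handling changes):

```swift
// Keep the original CMTime carried by the sample buffer. This avoids any
// precision loss from converting to Double and back, and removes the need
// to pick a preferredTimescale by hand.
let time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
if _assetWriterVideoInput?.isReadyForMoreMediaData == true {
    _adapter?.append(self.cvPixelBuffer, withPresentationTime: time)
}
```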