Computer Vision in iOS – Core Camera

Computer Vision on mobile is fun! Here are a few reasons why I personally love computer vision on mobile compared to traditional desktop-based systems.

  1. You don't have to buy a webcam or a high-resolution camera and connect it to the computer through a USB cable.
  2. Since a webcam is usually tethered by its USB cable, the application you are designing can only be tested inside the circle whose radius == the length of the cable 😛 .
  3. If you want your system to be portable, you might have to buy a Raspberry Pi or an Arduino and connect your webcam to it to process the frames it fetches. (My roommates and best friends during my bachelor's did extensive coding on microprocessors and microcontrollers, and I know very well how hard that is.)
  4. If I want to skip the step above and still keep my system portable, I literally have to carry the CPU tower with me 😛

While discussing the disadvantages of running CV algorithms on traditional desktop systems, you have probably already inferred the advantages of mobile-based pipelines. A mobile device is easily portable, comes fully equipped with a CPU, a GPU and various DSP modules that can be used depending on the application, and has a high-resolution camera 😉 The only real disadvantage of current mobile computer vision is that you can't take an algorithm that runs in near real time on a computer, drop it onto a phone, and expect the same results. Optimisation plays a key role in mobile computer vision. Mobile battery is limited, so the energy usage of your algorithm matters! If you are designing a heavy CV-based system, you can't schedule all of the work on the CPU; you might need to come up with new strategies to reduce CPU usage!

Putting on hold the discussion I started for no specific reason 🙄 , let us get into the topic this blog is actually dedicated to 😀 .

In this blog, I will design an application in Swift and initialise the camera without using OpenCV. The main idea is inspired by the following article by Boris Ohayon. Here I build on his idea and customise it for the applications that I will be designing in the future. If at any point in this blog you feel clueless about the camera pipeline, you can read his article (link provided above) and follow along with this tutorial.

  • Without wasting any more time, create a new ‘Single View Application’ with your desired product name and set the language to ‘Swift’.
  • Add an Image View to Main.storyboard and create an outlet for it in ViewController.swift.
  • Create a new file named CameraBuffer.swift and add the following code:
import UIKit
import AVFoundation

protocol CameraBufferDelegate: class {
    func captured(image: UIImage)
}

class CameraBuffer: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    // Capture-session configuration and state
    private var permissionGranted = false
    private let sessionQueue = DispatchQueue(label: "session queue")

    private var position = AVCaptureDevicePosition.back
    private let quality = AVCaptureSessionPreset640x480
    private let captureSession = AVCaptureSession()
    private let context = CIContext()

    weak var delegate: CameraBufferDelegate?

    override init() {
        super.init()
        checkPermission()
        sessionQueue.async { [unowned self] in
            self.configureSession()
            self.captureSession.startRunning()
        }
    }

    private func checkPermission() {
        switch AVCaptureDevice.authorizationStatus(forMediaType: AVMediaTypeVideo) {
        case .authorized:
            permissionGranted = true
        case .notDetermined:
            requestPermission()
        default:
            permissionGranted = false
        }
    }

    private func requestPermission() {
        sessionQueue.suspend()
        AVCaptureDevice.requestAccess(forMediaType: AVMediaTypeVideo) { [unowned self] granted in
            self.permissionGranted = granted
            self.sessionQueue.resume()
        }
    }

    private func configureSession() {
        guard permissionGranted else { return }
        captureSession.sessionPreset = quality
        guard let captureDevice = selectCaptureDevice() else { return }
        guard let captureDeviceInput = try? AVCaptureDeviceInput(device: captureDevice) else { return }
        guard captureSession.canAddInput(captureDeviceInput) else { return }
        captureSession.addInput(captureDeviceInput)

        do {
            // Pick the device format whose maximum frame rate best matches maxFpsDesired
            var finalFormat: AVCaptureDeviceFormat?
            var maxFps: Double = 0
            let maxFpsDesired: Double = 0 // Raise it at your own risk of higher CPU usage
            for vFormat in captureDevice.formats {
                // Each format advertises the frame-rate ranges it supports
                guard let ranges = (vFormat as AnyObject).videoSupportedFrameRateRanges as? [AVFrameRateRange],
                    let frameRates = ranges.first else { continue }

                // Keep the format with the highest maximum frame rate that does not exceed maxFpsDesired
                if frameRates.maxFrameRate >= maxFps && frameRates.maxFrameRate <= maxFpsDesired {
                    maxFps = frameRates.maxFrameRate
                    finalFormat = vFormat as? AVCaptureDeviceFormat
                }
            }
            if maxFps != 0, let finalFormat = finalFormat {
                // Lock the device, apply the chosen format and pin the frame duration to maxFps
                let timeValue = Int64(1200.0 / maxFps)
                let timeScale: Int32 = 1200
                try captureDevice.lockForConfiguration()
                captureDevice.activeFormat = finalFormat
                captureDevice.activeVideoMinFrameDuration = CMTimeMake(timeValue, timeScale)
                captureDevice.activeVideoMaxFrameDuration = CMTimeMake(timeValue, timeScale)
                if captureDevice.isFocusModeSupported(.autoFocus) {
                    captureDevice.focusMode = .autoFocus
                }
                captureDevice.unlockForConfiguration()
            }
            print(maxFps)
        }
        catch {
            print("Could not lock the capture device for configuration: \(error)")
        }
        
        // Attach the video data output and fix the connection's orientation and mirroring
        let videoOutput = AVCaptureVideoDataOutput()
        videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "sample buffer"))
        guard captureSession.canAddOutput(videoOutput) else { return }
        captureSession.addOutput(videoOutput)
        guard let connection = videoOutput.connection(withMediaType: AVFoundation.AVMediaTypeVideo) else { return }
        guard connection.isVideoOrientationSupported else { return }
        guard connection.isVideoMirroringSupported else { return }
        connection.videoOrientation = .portrait
        connection.isVideoMirrored = position == .front
    }
    
    private func selectCaptureDevice() -> AVCaptureDevice? {
        return AVCaptureDevice.defaultDevice(withDeviceType: .builtInWideAngleCamera, mediaType: AVMediaTypeVideo, position: position)
    }
    
    // Convert the CMSampleBuffer delivered by AVFoundation into a UIImage via Core Image
    private func imageFromSampleBuffer(sampleBuffer: CMSampleBuffer) -> UIImage? {
        guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
        let ciImage = CIImage(cvPixelBuffer: imageBuffer)
        guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return nil }
        return UIImage(cgImage: cgImage)
    }
    
    // Called on the sample-buffer queue for every captured frame; the converted image is
    // handed to the delegate on the main queue so it can be displayed safely.
    func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
        guard let uiImage = imageFromSampleBuffer(sampleBuffer: sampleBuffer) else { return }
        DispatchQueue.main.async { [unowned self] in
            self.delegate?.captured(image: uiImage)
        }
    }
}
  • And the ViewController.swift file should look like this:
import UIKit

class ViewController: UIViewController, CameraBufferDelegate {

    var cameraBuffer: CameraBuffer!

    @IBOutlet weak var imageView: UIImageView!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        cameraBuffer = CameraBuffer()
        cameraBuffer.delegate = self
    }

    func captured(image: UIImage) {
        imageView.image = image
    }

    override func didReceiveMemoryWarning() {
        super.didReceiveMemoryWarning()
        // Dispose of any resources that can be recreated.
    }

}
  • After this, you can build and run the app on your phone and see that it works like a charm. (If the app cannot access the camera, make sure your Info.plist contains an NSCameraUsageDescription entry; iOS requires it before granting camera access.)
  • So what new thing(s) did I implement in the above code? I converted the code to a Swift 3.0-compatible format, added a block through which you can raise the FPS from 30 to as high as 240, and ran rigorous tests to make sure that the camera pipeline never goes beyond 10% CPU usage on the iPhone for any realistic application.
  • If your application needs a higher FPS, you can set it by changing the variable ‘maxFpsDesired’ (for example, to 60). But change it only if you need an FPS greater than 30. By default, the FPS fluctuates between 24 and 30, and if you force the FPS to a fixed number, it won't be exactly the number you set and the CPU usage increases drastically. But if the application you want to build doesn't have any other costly computations, you can play with a higher FPS.
  • How do you count the FPS of your app? You can go fancy and code an FPS counter into the app (a minimal sketch follows this list), but I would suggest running the app in profiling mode and choosing ‘Core Animation’ in Instruments to check the FPS of your app 😉
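
If you do want an in-app readout, here is a minimal FPS-counter sketch, assuming it replaces the ViewController shown above; the extra names (frameCount, lastTimestamp) are illustrative and not part of the original code. It simply counts how many frames reach captured(image:) and prints the rate roughly once per second, which is enough to sanity-check the numbers Instruments reports.

import UIKit
import QuartzCore

class ViewController: UIViewController, CameraBufferDelegate {

    var cameraBuffer: CameraBuffer!

    // Illustrative FPS-counter state (not part of the original code)
    private var frameCount = 0
    private var lastTimestamp = CACurrentMediaTime()

    @IBOutlet weak var imageView: UIImageView!

    override func viewDidLoad() {
        super.viewDidLoad()
        cameraBuffer = CameraBuffer()
        cameraBuffer.delegate = self
    }

    func captured(image: UIImage) {
        imageView.image = image

        // Count frames and print the measured rate roughly once per second
        frameCount += 1
        let now = CACurrentMediaTime()
        let elapsed = now - lastTimestamp
        if elapsed >= 1.0 {
            print(String(format: "FPS: %.1f", Double(frameCount) / elapsed))
            frameCount = 0
            lastTimestamp = now
        }
    }
}

Printing from the main queue on every tick is cheap enough for a quick check, but remember to strip the counter out of any release build.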
