Video Recording iOS

From Kudan AR Engine


This tutorial outlines the steps necessary for setting up the KudanAR engine to render 3D content and camera stream to a video file on iOS.

It is assumed that a basic iOS project with the KudanAR framework has already been set up. See our project setup and marker basics tutorials for help in setting up a new project.


KudanAR uses OpenGL to render content and video on iOS. OpenGL draws content to framebuffers, which can be used to draw content directly to screen or for more complex operations, such as rendering content offscreen to be used as textures within 3D scenes.

In order to render a scene to a video file, image data for each frame must be fetched after it has been drawn to the main framebuffer, which is responsible for drawing content to the device screen, and encoded into a compatible video format. Traditionally this is achieved by reading pixel data directly from the main framebuffer, encoding the data and writing it to disk. However, this method involves transferring large amounts of data from the GPU every frame, which can severely hurt performance when recording high-resolution videos.
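For comparison, the traditional readback approach looks roughly like the sketch below (illustrative only; mainFramebuffer, width and height are assumed to be defined by the surrounding renderer code). The glReadPixels call stalls until the GPU has finished drawing and then copies the whole frame back to the CPU each frame, which is the cost this tutorial avoids:

```objectivec
// Illustrative sketch of per-frame readback (not used in this tutorial).
glBindFramebuffer(GL_FRAMEBUFFER, mainFramebuffer); // assumed framebuffer handle

// Allocate CPU-side storage for one RGBA frame and copy the GPU data into it.
// glReadPixels blocks until rendering is complete before copying.
GLubyte *pixels = (GLubyte *)malloc(width * height * 4);
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

// 'pixels' would then have to be wrapped and passed to the video encoder on the CPU.
free(pixels);
```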

Fortunately, iOS provides a method of recording that removes the need to read data back from the GPU, by allowing content to be rendered directly into the video encoding pipeline. This is achieved with the CoreVideo framework and the AVAssetWriter class, which can accept an OpenGL texture as an input to the video encoding pipeline. Rendering a scene to this offscreen texture transfers the data directly into the encoding pipeline, removing the pressure on GPU data transfer.

Implementing the ARCaptureRenderTarget Class

The ability to capture KudanAR content to a video can be achieved by extending the ARRenderTarget class and rendering the content of an AR scene to this subclass.

Start by creating the subclass header file, ARCaptureRenderTarget.h:

#import <KudanAR/KudanAR.h>
#import <AVFoundation/AVFoundation.h>

@interface ARCaptureRenderTarget : ARRenderTarget

@property (nonatomic, strong) AVAssetWriter *assetWriter;
@property (nonatomic, strong) AVAssetWriterInputPixelBufferAdaptor *assetWriterPixelBufferInput;
@property (nonatomic, strong) NSDate *startDate;
@property (nonatomic) CVPixelBufferRef pixelBuffer;
@property (nonatomic) BOOL isVideoFinishing;
@property (nonatomic) GLuint fbo;

@end

Then implement the ARCaptureRenderTarget class by overriding the ARRenderTarget default initialisation method:

- (instancetype)initWithWidth:(float)width height:(float)height
{
    self = [super initWithWidth:width height:height];
    if (self) {
        // Account for the scaling factor associated with some iOS devices.
        self.width *= [UIScreen mainScreen].scale;
        self.height *= [UIScreen mainScreen].scale;
        // Set up the required render target assets.
        [self setupAssetWriter];
        [self setupFBO];
        _isVideoFinishing = NO;
    }
    return self;
}

It is then necessary to create and set up the AVAssetWriter object responsible for receiving frames from the AR scene and creating the video file:

- (void)setupAssetWriter
{
    // Set up the asset writer.
    NSError *outError;

    // Write the video file to the application's library directory, with the name "video.mp4".
    NSURL *libsURL = [[[NSFileManager defaultManager] URLsForDirectory:NSLibraryDirectory inDomains:NSUserDomainMask] lastObject];
    NSURL *outputURL = [libsURL URLByAppendingPathComponent:@"video.mp4"];

    // Delete a file with the same path if one exists.
    if ([[NSFileManager defaultManager] fileExistsAtPath:[outputURL path]]) {
        [[NSFileManager defaultManager] removeItemAtURL:outputURL error:nil];
    }

    _assetWriter = [AVAssetWriter assetWriterWithURL:outputURL
                                            fileType:AVFileTypeMPEG4
                                               error:&outError];
    if (outError) {
        NSAssert(NO, @"Error creating AVAssetWriter");
    }

    // Set up the asset writer inputs.
    // Encode video in the H.264 format, with the width and height equal to those
    // of the framebuffer object.
    NSDictionary *assetWriterInputAttributesDictionary =
    [NSDictionary dictionaryWithObjectsAndKeys:
     AVVideoCodecH264, AVVideoCodecKey,
     [NSNumber numberWithInt:self.width], AVVideoWidthKey,
     [NSNumber numberWithInt:self.height], AVVideoHeightKey,
     nil];

    AVAssetWriterInput *assetWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                                                              outputSettings:assetWriterInputAttributesDictionary];
    // Frames arrive from the renderer in real time.
    assetWriterInput.expectsMediaDataInRealTime = YES;

    // Assume the input pixel buffer is in the BGRA format, the iOS standard.
    NSDictionary *sourcePixelBufferAttributesDictionary =
    [NSDictionary dictionaryWithObjectsAndKeys:
     [NSNumber numberWithInt:kCVPixelFormatType_32BGRA], kCVPixelBufferPixelFormatTypeKey,
     [NSNumber numberWithInt:self.width], kCVPixelBufferWidthKey,
     [NSNumber numberWithInt:self.height], kCVPixelBufferHeightKey,
     nil];

    _assetWriterPixelBufferInput = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:assetWriterInput
                                                                                                    sourcePixelBufferAttributes:sourcePixelBufferAttributesDictionary];

    // Add the input to the writer if possible.
    if ([_assetWriter canAddInput:assetWriterInput]) {
        [_assetWriter addInput:assetWriterInput];
    } else {
        NSAssert(NO, @"Error adding asset writer input");
    }

    // Start the asset writer immediately for this simple example.
    [_assetWriter startWriting];
    [_assetWriter startSessionAtSourceTime:kCMTimeZero];

    // Store the date when the asset writer started recording video.
    _startDate = [NSDate date];

    // Check the asset writer has started.
    if (_assetWriter.status == AVAssetWriterStatusFailed) {
        NSAssert(NO, @"Error starting asset writer %@", _assetWriter.error);
    }
}

After this, the offscreen framebuffer object should be set up:

- (void)setupFBO
{
    // Make the renderer context current, necessary for creating any new OpenGL objects.
    [[ARRenderer getInstance] useContext];

    // Create the FBO.
    glGenFramebuffers(1, &_fbo);
    [self bindBuffer];

    // Create the OpenGL texture cache.
    CVOpenGLESTextureCacheRef cvTextureCache;
    CVReturn err = CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, NULL, [EAGLContext currentContext], NULL, &cvTextureCache);
    if (err) {
        NSAssert(NO, @"Error creating CVOpenGLESTextureCacheCreate %d", err);
    }

    // Create the OpenGL texture we will be rendering to, backed by a pixel buffer
    // drawn from the asset writer's pixel buffer pool.
    CVPixelBufferPoolRef pixelBufferPool = [_assetWriterPixelBufferInput pixelBufferPool];
    err = CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault, pixelBufferPool, &_pixelBuffer);
    if (err) {
        NSAssert(NO, @"Error creating CVPixelBufferPoolCreatePixelBuffer %d", err);
    }

    CVOpenGLESTextureRef renderTexture;
    CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, cvTextureCache, _pixelBuffer,
                                                 NULL, // texture attributes
                                                 GL_TEXTURE_2D,
                                                 GL_RGBA, // opengl format
                                                 self.width,
                                                 self.height,
                                                 GL_BGRA, // native iOS format
                                                 GL_UNSIGNED_BYTE,
                                                 0,
                                                 &renderTexture);

    // Attach the OpenGL texture to the framebuffer.
    glBindTexture(CVOpenGLESTextureGetTarget(renderTexture), CVOpenGLESTextureGetName(renderTexture));
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, CVOpenGLESTextureGetName(renderTexture), 0);

    // Create a depth buffer for correct drawing.
    GLuint depthRenderbuffer;
    glGenRenderbuffers(1, &depthRenderbuffer);
    glBindRenderbuffer(GL_RENDERBUFFER, depthRenderbuffer);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH24_STENCIL8_OES, self.width, self.height);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthRenderbuffer);

    // Check the FBO is complete and ready for rendering.
    [self checkFBO];
}

The bindBuffer method should be overridden to ensure we are binding the correct framebuffer when rendering to video:

- (void)bindBuffer
{
    glBindFramebuffer(GL_FRAMEBUFFER, _fbo);
}

Finally, the draw method is called when content is rendered to the framebuffer, and hence should be overridden and extended to handle the encoding of each frame:

- (void)draw
{
    // Draw content to the framebuffer as normal.
    [super draw];

    // Don't encode a new frame if the AVAssetWriter is not writing or the video has finished.
    if (self.assetWriter.status != AVAssetWriterStatusWriting || _isVideoFinishing) {
        return;
    }

    // Wait for all OpenGL commands to finish before handing the frame to the encoder.
    glFinish();

    // Lock the pixel buffer to allow it to be used for encoding.
    CVPixelBufferLockBaseAddress(_pixelBuffer, 0);

    // Submit the pixel buffer for the current frame to the asset writer with the correct timestamp.
    CMTime currentTime = CMTimeMakeWithSeconds([[NSDate date] timeIntervalSinceDate:_startDate], 120);
    if (![_assetWriterPixelBufferInput appendPixelBuffer:_pixelBuffer withPresentationTime:currentTime]) {
        NSLog(@"Problem appending pixel buffer at time: %lld", currentTime.value);
    }

    // Unlock the pixel buffer to free it.
    CVPixelBufferUnlockBaseAddress(_pixelBuffer, 0);

    // In this simple example, finish the video once it has been recording for 5 seconds.
    if (CMTimeCompare(currentTime, CMTimeMake(5, 1)) == 1) {
        _isVideoFinishing = YES;
        [self.assetWriter finishWritingWithCompletionHandler:^{
            NSLog(@"Finished writing video.");
        }];
    }
}

@end

Using an ARCaptureRenderTarget Object

The offscreen render target we have just created, ARCaptureRenderTarget, is simple to use. It records the first five seconds of the AR scene and writes the resulting video to "video.mp4" in the application's library directory. This behaviour can easily be extended by modifying the code above.
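For example, recording could be stopped on demand rather than after a fixed duration. A minimal sketch, assuming a finishRecording method is added to ARCaptureRenderTarget and the five-second check in the draw method is removed:

```objectivec
// Hypothetical addition to ARCaptureRenderTarget: stop recording on demand,
// e.g. from a button press, instead of after a fixed duration.
- (void)finishRecording
{
    // Stop the draw method from appending any further frames.
    _isVideoFinishing = YES;
    [self.assetWriter finishWritingWithCompletionHandler:^{
        NSLog(@"Finished writing video.");
    }];
}
```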

In order to record content from the AR scene, the scene content must be added to the ARCaptureRenderTarget so that it is drawn every frame. This is achieved by adding the following logic during scene setup, most likely in the setup functions of an AR view controller class. For example, here the setup code is added in a view controller that subclasses ARViewController:

@implementation ViewController

- (void)setupContent
{
    // Set up of AR 3D content occurs here.

    // Create the ARCaptureRenderTarget offscreen render target object.
    ARCaptureRenderTarget *captureRenderTarget = [[ARCaptureRenderTarget alloc] initWithWidth:self.view.frame.size.width height:self.view.frame.size.height];

    // Add the viewports that need rendering to the render target.
    // The camera viewport contains the camera image; the content viewport contains the 3D content.
    [captureRenderTarget addViewPort:self.cameraView.cameraViewPort];
    [captureRenderTarget addViewPort:self.cameraView.contentViewPort];

    // Add the offscreen render target to the renderer.
    [[ARRenderer getInstance] addRenderTarget:captureRenderTarget];
}

@end

This is sufficient to record both the 3D content and the camera image of the AR scene to video.
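Once writing has finished, the file remains in the application's library directory, which is not directly visible to the user. As a possible next step, the recording could be exported to the device's photo album using UIKit's UISaveVideoAtPathToSavedPhotosAlbum function. A sketch, assuming the same "video.mp4" path used above and that it is run after the asset writer's completion handler has fired:

```objectivec
// Hypothetical helper: export the finished recording to the Saved Photos album.
NSURL *libsURL = [[[NSFileManager defaultManager] URLsForDirectory:NSLibraryDirectory inDomains:NSUserDomainMask] lastObject];
NSString *videoPath = [[libsURL URLByAppendingPathComponent:@"video.mp4"] path];

// Check the file can be played back by the Photos app before saving it.
if (UIVideoAtPathIsCompatibleWithSavedPhotosAlbum(videoPath)) {
    UISaveVideoAtPathToSavedPhotosAlbum(videoPath, nil, NULL, NULL);
}
```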