Building a Meeting Application on iOS using the Amazon Chime SDK

The Amazon Chime SDKs for iOS and Android provide application developers native choices to add audio calling, video calling, and screen share viewing capabilities to their mobile applications. Developers can use the same communication infrastructure that powers Amazon Chime, an online meetings service from AWS, to deliver engaging experiences in their applications. This post demonstrates how to use the Amazon Chime SDK for iOS to add real-time audio and video conferencing to your iOS application.

Solution Overview

The Amazon Chime SDK for iOS provides methods to access the Amazon Chime SDK media services and local audio and video devices. It supports real-time signaling for audio quality, active speaker events, enhanced echo cancellation, hardware-accelerated encoding, decoding, and rendering, and adaptive bandwidth. It is composed of a native media binary shared with the Amazon Chime SDK for Android and a Swift wrapper for easy integration.

Using the Amazon Chime SDK, you can add a unified communication experience to your mobile application on any supported iOS device. For example, you can add video calling to a healthcare application so patients can consult remotely with doctors on health issues. You can also add audio calling to a company website so customers can quickly connect with sales.

In this post, we first walk through configuring an iOS project. We then guide you through specific usage of the Amazon Chime SDK with sample code. Similarly, if you’re building a meeting application for Android, you can follow Building a Meeting Application on Android using the Amazon Chime SDK.

Prerequisites

You have read Building a Meeting Application using the Amazon Chime SDK. You understand the basic architecture of the Amazon Chime SDK and have deployed a serverless/browser demo meeting application.
You have a basic to intermediate understanding of iOS development and tools.
You have installed Xcode version 11.0 or later.

Note: Deploying the serverless/browser demo and receiving traffic from the demo created in this post can incur AWS charges.

Key steps outline

Here is an outline of the key steps involved in integrating the Amazon Chime SDK into your iOS application:

Configure your application
Create a meeting session
Access AudioVideoFacade
Handle real-time events
Render a video tile
Test
Cleanup
Conclusion

Configure your application

To declare the Amazon Chime SDK as a dependency, you must complete the following steps.

Follow the steps in the Setup section in the README file to download and import the Amazon Chime SDK.
Add Privacy - Microphone Usage Description and Privacy - Camera Usage Description to the Info.plist of your Xcode project.

Request microphone and camera permissions. You can check the current status synchronously with AVAudioSession.recordPermission and AVCaptureDevice.authorizationStatus, and fall back to requesting permission when the status is undetermined. You can also call requestRecordPermission and requestAccess directly and handle the result in an asynchronous completion handler.

switch AVAudioSession.sharedInstance().recordPermission {
case .granted:
    // You can use audio.
    ...
case .denied:
    // You may not use audio. Your application should handle this.
    ...
case .undetermined:
    // You must request permission.
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        if granted {
            ...
        } else {
            // The user rejected your request.
        }
    }
@unknown default:
    // Handle permission states added in future iOS versions.
    break
}

// Request permission for video. You can similarly check
// using AVCaptureDevice.authorizationStatus(for: .video).
AVCaptureDevice.requestAccess(for: .video) { granted in
    // Handle the result here. The handler may run on an arbitrary queue.
}

Create a meeting session

To start a meeting, you need to create a meeting session. We provide DefaultMeetingSession as an actual implementation of the protocol MeetingSession. DefaultMeetingSession takes in both MeetingSessionConfiguration and ConsoleLogger.

Create a ConsoleLogger for logging.

let logger = ConsoleLogger(name: "MeetingViewController")

Make a POST request to server_url to create a meeting and an attendee. The server_url is the URL of the serverless demo meeting application you deployed (see Prerequisites section).

var url = "\(server_url)join?title=\(meetingId)&name=\(attendeeName)&region=\(meetingRegion)"
url = encodeStrForURL(str: url)

// Helper function for URL encoding.
public static func encodeStrForURL(str: String) -> String {
    return str.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) ?? str
}
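
Percent-encoding the whole URL string with .urlQueryAllowed leaves characters such as & and = untouched, so a meeting title or attendee name that contains them would change the structure of the query. As an alternative, the following is a minimal sketch that builds the same request with URLComponents, which encodes each query value separately. The function name makeJoinRequest and its parameter names are illustrative assumptions, not part of the Amazon Chime SDK or the demo application.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms.
#endif

/// Builds the POST request for the demo's `join` endpoint.
/// Hypothetical helper: the name and parameters are assumptions for illustration.
func makeJoinRequest(serverURL: String,
                     meetingId: String,
                     attendeeName: String,
                     region: String) -> URLRequest? {
    // Assumes serverURL ends with a trailing slash, as in the snippet above.
    guard var components = URLComponents(string: serverURL + "join") else {
        return nil
    }
    // Each URLQueryItem value is percent-encoded on its own, so "&" or "="
    // inside a value cannot be mistaken for a query delimiter.
    components.queryItems = [
        URLQueryItem(name: "title", value: meetingId),
        URLQueryItem(name: "name", value: attendeeName),
        URLQueryItem(name: "region", value: region)
    ]
    guard let url = components.url else { return nil }

    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    return request
}
```

You would then pass the returned request to URLSession and decode the response body as described in the next step.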

Create a MeetingSessionConfiguration. The JSON response of the POST request contains the data required for constructing a CreateMeetingResponse and a CreateAttendeeResponse.

let joinMeetingResponse = try jsonDecoder.decode(MeetingResponse.self, from: data)

// Construct CreateMeetingResponse and CreateAttendeeResponse.
let meetingResp = CreateMeetingResponse(meeting:
    Meeting(
        externalMeetingId: joinMeetingResponse.joinInfo.meeting.meeting.externalMeetingId,
        mediaPlacement: MediaPlacement(
            audioFallbackUrl: joinMeetingResponse.joinInfo.meeting.meeting.mediaPlacement.audioFallbackUrl,
            audioHostUrl: joinMeetingResponse.joinInfo.meeting.meeting.mediaPlacement.audioHostUrl,
            signalingUrl: joinMeetingResponse.joinInfo.meeting.meeting.mediaPlacement.signalingUrl,
            turnControlUrl: joinMeetingResponse.joinInfo.meeting.meeting.mediaPlacement.turnControlUrl
        ),
        mediaRegion: joinMeetingResponse.joinInfo.meeting.meeting.mediaRegion,
        meetingId: joinMeetingResponse.joinInfo.meeting.meeting.meetingId
    )
)

let attendeeResp = CreateAttendeeResponse(attendee:
    Attendee(attendeeId: joinMeetingResponse.joinInfo.attendee.attendee.attendeeId,
        externalUserId: joinMeetingResponse.joinInfo.attendee.attendee.externalUserId,
        joinToken: joinMeetingResponse.joinInfo.attendee.attendee.joinToken
    )
)

// Construct MeetingSessionConfiguration.
let meetingSessionConfig = MeetingSessionConfiguration(
    createMeetingResponse: meetingResp,
    createAttendeeResponse: attendeeResp
)
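
The MeetingResponse type decoded above is not defined in this post. The following is a minimal sketch of what such models could look like, derived only from the property path used in the snippet (joinInfo.meeting.meeting and joinInfo.attendee.attendee). The type names JoinInfo, MeetingEnvelope, AttendeeEnvelope, MeetingInfo, MediaPlacementInfo, AttendeeDetails, and AnyKey are hypothetical, and the sketch assumes the server returns PascalCase keys such as JoinInfo and MeetingId. Check the actual response of your deployed demo and adjust the key mapping if it differs.

```swift
import Foundation

// Hypothetical models that mirror the property path used in this post.
struct MeetingResponse: Decodable { let joinInfo: JoinInfo }
struct JoinInfo: Decodable {
    let meeting: MeetingEnvelope
    let attendee: AttendeeEnvelope
}
struct MeetingEnvelope: Decodable { let meeting: MeetingInfo }
struct AttendeeEnvelope: Decodable { let attendee: AttendeeDetails }

struct MeetingInfo: Decodable {
    let externalMeetingId: String?  // Optional as a defensive assumption.
    let mediaPlacement: MediaPlacementInfo
    let mediaRegion: String
    let meetingId: String
}
struct MediaPlacementInfo: Decodable {
    let audioFallbackUrl: String
    let audioHostUrl: String
    let signalingUrl: String
    let turnControlUrl: String
}
struct AttendeeDetails: Decodable {
    let attendeeId: String
    let externalUserId: String
    let joinToken: String
}

// Generic coding key used to rename JSON keys while decoding.
struct AnyKey: CodingKey {
    let stringValue: String
    let intValue: Int?
    init(stringValue: String) {
        self.stringValue = stringValue
        self.intValue = nil
    }
    init(intValue: Int) {
        self.stringValue = String(intValue)
        self.intValue = intValue
    }
}

/// Decodes the join response, assuming PascalCase JSON keys ("MeetingId")
/// that map to camelCase Swift properties ("meetingId").
func decodeJoinResponse(from data: Data) throws -> MeetingResponse {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .custom { codingPath in
        let key = codingPath[codingPath.count - 1].stringValue
        // Lowercase only the first character of the key.
        let lowered = key.prefix(1).lowercased() + String(key.dropFirst())
        return AnyKey(stringValue: lowered)
    }
    return try decoder.decode(MeetingResponse.self, from: data)
}
```

If your server already returns camelCase keys, you can drop the custom key strategy and use a plain JSONDecoder.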

Now create an instance of DefaultMeetingSession.

let currentMeetingSession = DefaultMeetingSession(
    configuration: meetingSessionConfig,
    logger: logger
)

Access AudioVideoFacade

AudioVideoFacade is used to control the audio and video experience. Inside the DefaultMeetingSession object, audioVideo is an instance variable of type AudioVideoFacade.

To start audio, you can call start on AudioVideoFacade.

do {
    try self.currentMeetingSession?.audioVideo.start()
} catch PermissionError.audioPermissionError {
    // Handle the case where no permission is granted.
} catch {
    // Catch other errors.
}

Your application should now be able to exchange audio streams. The Amazon Chime SDK also provides an API called chooseAudioDevice to change the audio input and output devices. listAudioDevices can be used to list all the available audio devices.

let optionMenu = UIAlertController(title: nil, message: "Choose Audio Device", preferredStyle: .actionSheet)
for inputDevice in self.currentMeetingSession!.audioVideo.listAudioDevices() {
    let deviceAction = UIAlertAction(
        title: inputDevice.label,
        style: .default,
        handler: { _ in
            self.currentMeetingSession?.audioVideo.chooseAudioDevice(mediaDevice: inputDevice)
        }
    )
    optionMenu.addAction(deviceAction)
}

You can turn local audio on and off by calling the mute and unmute APIs.

// Mute audio.
self.currentMeetingSession?.audioVideo.realtimeLocalMute()
// Unmute audio.
self.currentMeetingSession?.audioVideo.realtimeLocalUnmute()

There are two sets of APIs for starting and stopping video. startLocalVideo and stopLocalVideo turn the camera on the user’s device on and off. startRemoteVideo and stopRemoteVideo start and stop receiving video from other participants in the same meeting.

// Start local video.
do {
    try self.currentMeetingSession?.audioVideo.startLocalVideo()
} catch PermissionError.videoPermissionError {
    // Handle the case where no permission is granted.
} catch {
    // Catch some other errors.
}

// Start remote video.
self.currentMeetingSession?.audioVideo.startRemoteVideo()

// Stop local video.
self.currentMeetingSession?.audioVideo.stopLocalVideo()

// Stop remote video.
self.currentMeetingSession?.audioVideo.stopRemoteVideo()

You can switch the camera for local video between front-facing and rear-facing. Call switchCamera and have different logic based on the camera type returned by calling getActiveCamera.

self.currentMeetingSession?.audioVideo.switchCamera()

// Add logic to respond to camera type change.
switch self.currentMeetingSession?.audioVideo.getActiveCamera().type {
case MediaDeviceType.videoFrontCamera:
    ...
case MediaDeviceType.videoBackCamera:
    ...
default:
    ...
}

Handle real-time events

We want to handle various real-time events during the meeting to update the UI accordingly. Events are triggered when attendees join or leave the meeting, metrics become available, audio is muted or unmuted, an audio or video device changes, video is enabled or disabled, or the active speaker changes. The Amazon Chime SDK provides several observer interfaces including AudioVideoObserver, MetricsObserver, RealtimeObserver, DeviceChangeObserver, VideoTileObserver, and ActiveSpeakerObserver. These can be implemented in your application to handle those events. Let’s look at the following samples based on different interfaces.

AudioVideoObserver
AudioVideoObserver is used to monitor the status of audio or video sessions. To subscribe to this observer, you call addAudioVideoObserver on AudioVideoFacade:

extension MeetingViewController: AudioVideoObserver {
    func audioSessionDidStartConnecting(reconnecting: Bool) {}
    func audioSessionDidStart(reconnecting: Bool) {}
    func audioSessionDidStopWithStatus(sessionStatus: MeetingSessionStatus) {}
    func audioSessionDidCancelReconnect() {}
    func videoSessionDidStartConnecting() {}
    func videoSessionDidStartWithStatus(sessionStatus: MeetingSessionStatus) {}
    func videoSessionDidStopWithStatus(sessionStatus: MeetingSessionStatus) {}
    func connectionDidRecover() {}
    func connectionDidBecomePoor() {}
}

// Register observer.
self.currentMeetingSession?.audioVideo.addAudioVideoObserver(observer: self)

RealtimeObserver
RealtimeObserver is used to maintain a list of attendees and their audio volume and signal strength status. This observer only reports changes since the last notification. For example, if one attendee becomes muted, only that attendee’s AttendeeInfo is supplied in the array passed to attendeesDidMute. That same attendee will no longer appear in future volumeDidChange callbacks.
```
extension MeetingViewController: RealtimeObserver {
    func volumeDidChange(volumeUpdates: [VolumeUpdate]) {}
    func signalStrengthDidChange(signalUpdates: [SignalUpdate]) {}
    func attendeesDidJoin(attendeeInfo: [AttendeeInfo]) {}
    func attendeesDidLeave(attendeeInfo: [AttendeeInfo]) {}
    func attendeesDidMute(attendeeInfo: [AttendeeInfo]) {}
    func attendeesDidUnmute(attendeeInfo: [AttendeeInfo]) {}
}

// Register observer.
self.currentMeetingSession?.audioVideo.addRealtimeObserver(observer: self)
```
AttendeeInfo contains both attendeeId and externalUserId. If the attendee is sharing the screen, the associated attendeeId will have a trailing #content.
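
Because each callback carries only the attendees that changed, the application has to keep its own roster and apply these deltas to it. The following is a minimal sketch of such a roster, under the assumption that you forward the callback data to it. RosterEntry, Roster, isContentShare, and baseAttendeeId are hypothetical names, and RosterEntry is a stand-in for the SDK’s AttendeeInfo so that the example compiles on its own.

```swift
// Stand-in for the SDK's AttendeeInfo plus the state the UI needs.
struct RosterEntry: Equatable {
    let attendeeId: String
    let externalUserId: String
    var isMuted = false
}

/// Hypothetical roster that applies the deltas delivered by RealtimeObserver.
final class Roster {
    private(set) var entries: [String: RosterEntry] = [:]

    // Call from attendeesDidJoin.
    func attendeesDidJoin(_ attendees: [(attendeeId: String, externalUserId: String)]) {
        for attendee in attendees {
            entries[attendee.attendeeId] = RosterEntry(
                attendeeId: attendee.attendeeId,
                externalUserId: attendee.externalUserId
            )
        }
    }

    // Call from attendeesDidLeave.
    func attendeesDidLeave(_ attendeeIds: [String]) {
        for id in attendeeIds {
            entries[id] = nil
        }
    }

    // Call from attendeesDidMute (true) and attendeesDidUnmute (false).
    func setMuted(_ muted: Bool, attendeeIds: [String]) {
        for id in attendeeIds {
            entries[id]?.isMuted = muted
        }
    }
}

/// Returns true when the id carries the trailing "#content" marker.
func isContentShare(attendeeId: String) -> Bool {
    return attendeeId.hasSuffix("#content")
}

/// Strips the "#content" marker, if present, to recover the sharer's id.
func baseAttendeeId(of attendeeId: String) -> String {
    guard isContentShare(attendeeId: attendeeId) else { return attendeeId }
    return String(attendeeId.dropLast("#content".count))
}
```

Keying the roster by attendeeId keeps each update cheap, and baseAttendeeId lets the UI group a screen share under the attendee who started it.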

MetricsObserver
MetricsObserver is used to monitor the changes in media metrics.

extension MeetingViewController: MetricsObserver {
    func metricsDidReceive(metrics: [AnyHashable: Any]) {}
}

// Register observer.
self.currentMeetingSession?.audioVideo.addMetricsObserver(observer: self)
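
Since metricsDidReceive delivers an untyped [AnyHashable: Any] dictionary, a debugging overlay needs to turn it into text before it can be displayed. The following is a minimal sketch of one way to do that. describeMetrics is a hypothetical helper, and it makes no assumption about which keys the SDK actually sends.

```swift
/// Hypothetical helper: renders a metrics dictionary as sorted "key: value"
/// lines so the output is stable between updates.
func describeMetrics(_ metrics: [AnyHashable: Any]) -> [String] {
    return metrics
        .map { key, value in "\(key.base): \(value)" }
        .sorted()
}
```

Sorting the lines keeps each metric in the same position on screen even though dictionary iteration order is not guaranteed.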

DeviceChangeObserver
DeviceChangeObserver detects changes in available MediaDevices, including both audio and video devices. When an audio device, such as a Bluetooth headset, is connected or disconnected, the audioDeviceDidChange callback is invoked. A newly connected device is then listed in freshAudioDeviceList.
```
extension MeetingViewController: DeviceChangeObserver {
    func audioDeviceDidChange(freshAudioDeviceList: [MediaDevice]) {}
}

// Register observer.
self.currentMeetingSession?.audioVideo.addDeviceChangeObserver(observer: self)
```
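
The callback supplies the full, updated device list rather than the individual change. If your UI needs to know which device appeared or disappeared, you can compare the fresh list with the previous one. The following is a minimal sketch that does this on device labels, such as the label property used in the audio device picker earlier. diffDeviceLabels is a hypothetical helper, and it assumes that labels are unique enough to identify a device.

```swift
/// Hypothetical helper: compares two snapshots of device labels and reports
/// which devices were connected or disconnected between them.
func diffDeviceLabels(previous: [String],
                      fresh: [String]) -> (connected: [String], disconnected: [String]) {
    let previousSet = Set(previous)
    let freshSet = Set(fresh)
    // Preserve the order in which the labels were reported.
    let connected = fresh.filter { !previousSet.contains($0) }
    let disconnected = previous.filter { !freshSet.contains($0) }
    return (connected, disconnected)
}
```

In audioDeviceDidChange you could map freshAudioDeviceList to labels, call this helper with the labels saved from the previous callback, and then store the fresh labels for the next comparison.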

VideoTileObserver
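VideoTileObserver reports changes to video tiles. videoTileDidAdd and videoTileDidRemove are triggered when a video tile is added or removed; you typically respond by calling the bindVideoView and unbindVideoView methods of AudioVideoFacade. videoTileDidPause and videoTileDidResume are triggered when a tile is paused or resumed.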

extension MeetingViewController: VideoTileObserver {
    func videoTileDidAdd(tileState: VideoTileState) {}
    func videoTileDidRemove(tileState: VideoTileState) {}
    func videoTileDidPause(tileState: VideoTileState) {}
    func videoTileDidResume(tileState: VideoTileState) {}
}

// Register observer.
self.currentMeetingSession?.audioVideo.addVideoTileObserver(observer: self)

ActiveSpeakerObserver
ActiveSpeakerObserver identifies which attendee in the meeting is actively speaking. To conform to ActiveSpeakerObserver, you implement observerId, which uniquely identifies this observer, and the activeSpeakerDidDetect function. DefaultActiveSpeakerPolicy implements the ActiveSpeakerPolicy protocol. You can implement your own policy by conforming to ActiveSpeakerPolicy and passing it to activeSpeakerSubscribe in your application.
```
extension MeetingViewController: ActiveSpeakerObserver {
    var observerId: String {
        return self.uuid
    }
    func activeSpeakerDidDetect(attendeeInfo: [AttendeeInfo]) {}
}

// Register observer.
self.currentMeetingSession?.audioVideo.activeSpeakerSubscribe(policy: DefaultActiveSpeakerPolicy(), observer: self)
```

Render a video tile

You bind the local and remote video streams to the UI elements to visualize the video tiles. Each DefaultVideoTile has an associated VideoTileState, which includes the unique tileId and attendeeId. It also has other attributes for the video tile such as isLocalTile to indicate whether the tile represents a local or remote video stream. You can also show whether the video stream is for content sharing with isContent. A tile can be paused.

DefaultVideoRenderView is used to render the frames of videos on UIImageView. Once you have both VideoTileState and a DefaultVideoRenderView, you can bind them by calling bindVideoView.

// Bind video tile.
self.currentMeetingSession?.audioVideo.bindVideoView(videoView: someDefaultVideoRenderView, tileId: someTileState.tileId)

// Unbind video tile.
self.currentMeetingSession?.audioVideo.unbindVideoView(tileId: someTileState.tileId)

// Pause video tile.
self.currentMeetingSession?.audioVideo.pauseRemoteVideoTile(tileId: someTileState.tileId)

// Resume video tile.
self.currentMeetingSession?.audioVideo.resumeRemoteVideoTile(tileId: someTileState.tileId)
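
Before binding a tile, the application has to decide which view the tile belongs in. The following is a minimal sketch of that bookkeeping, based on the isLocalTile and isContent attributes described above. TileInfo, TileSlot, and TileRegistry are hypothetical names, TileInfo is a stand-in for VideoTileState so that the example compiles on its own, and the three-slot layout is an assumption about the UI.

```swift
// Stand-in for the VideoTileState fields described in this section.
struct TileInfo: Equatable {
    let tileId: Int
    let attendeeId: String
    let isLocalTile: Bool
    let isContent: Bool
}

// Assumed layout: a self preview, a content share area, and a remote grid.
enum TileSlot: Equatable {
    case localPreview
    case contentShare
    case remoteGrid
}

/// Hypothetical registry that tracks active tiles by tileId.
final class TileRegistry {
    private(set) var tiles: [Int: TileInfo] = [:]

    /// Content share takes priority, then local preview, then the remote grid.
    func slot(for tile: TileInfo) -> TileSlot {
        if tile.isContent { return .contentShare }
        return tile.isLocalTile ? .localPreview : .remoteGrid
    }

    /// Call from videoTileDidAdd, then bind the tile to a view in the slot.
    @discardableResult
    func add(_ tile: TileInfo) -> TileSlot {
        tiles[tile.tileId] = tile
        return slot(for: tile)
    }

    /// Call from videoTileDidRemove, after unbinding the tile.
    func remove(tileId: Int) {
        tiles[tileId] = nil
    }

    /// Tile ids currently shown in the remote grid, in ascending order.
    var remoteTileIds: [Int] {
        return tiles.values
            .filter { slot(for: $0) == .remoteGrid }
            .map { $0.tileId }
            .sorted()
    }
}
```

The registry only decides placement. The actual rendering still goes through bindVideoView and unbindVideoView with the tileId, as shown above.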

Test

After building and running your iOS application, you can verify the end-to-end behavior. Test it by joining the same meeting from your iOS device and a browser (using the demo application you set up in the prerequisites).

Amazon Chime SDK iOS Demo Application

Cleanup

If you no longer want to keep the demo active in your AWS account and want to avoid incurring AWS charges, the demo resources can be removed. Delete the two AWS CloudFormation stacks created in the prerequisites that can be found in the AWS CloudFormation console.

Conclusion

This post has covered the basic APIs in the Amazon Chime SDK for iOS including meeting session, audio and video facade, and observers. You can download the complete demo application in our Amazon Chime SDK for iOS GitHub repository. The demo application uses the Amazon Chime SDK to start a meeting with real-time audio and video. You can input the meeting ID and name to join the meeting. Once in a meeting, you can share your audio and video with other participants using Android, iOS, and web clients.
