Monitoring and troubleshooting with Amazon Chime SDK meeting events

Meeting events allow you to collect client metrics from your audio, video, and screen sharing applications using the Amazon Chime SDK. By integrating meeting events with Amazon CloudWatch, you can view a snapshot of critical information in the Amazon CloudWatch dashboard. For example, if a user encounters a failure while joining a meeting, you can access an Amazon Chime SDK meeting status code and error message in the dashboard. You can use this data to identify the reason for the failure without requiring the user to do additional work, like providing client logs. You also have latency metrics to determine if a connection problem affects the user’s call quality.

In this post, you will learn how to:

  • Upload meeting events from your application to Amazon CloudWatch
  • Investigate why your users may have join failures
  • Troubleshoot audio quality issues
  • Monitor microphone and camera setup failures

Prerequisites

Before getting started, you must meet the following prerequisites:

Solution overview

In your web and mobile applications enabled by the Amazon Chime SDK, you can add an event observer to collect meeting events throughout the session. In this post, you learn how to make a POST request to upload meeting events to your AWS Lambda function when a meeting ends or stops with an error. You can also send meeting events while a session is running; however, doing so might affect performance and meeting quality.

The Lambda function uploads meeting events to an Amazon CloudWatch log stream for searching and analyzing with Amazon CloudWatch Logs Insights. The function also publishes the latency metrics to Amazon CloudWatch Metrics to view the aggregate statistics. Using these two sources, you create an Amazon CloudWatch dashboard displaying selected graphs and logs for incidents in your applications. The dashboard is a starting point for you to investigate and troubleshoot your users’ unique problems.
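
As a rough illustration of what such a function might do with an uploaded batch, the sketch below converts meeting events into the entry shapes accepted by the CloudWatch Logs PutLogEvents API and the CloudWatch PutMetricData API. It is not the code from the sample template. The helper names, the metric name, and the use of attributes.timestampMs as the event time are assumptions made for illustration.

```javascript
// Hypothetical helpers; the Lambda function deployed by the sample may differ.

// Convert meeting events into PutLogEvents entries ({ timestamp, message }),
// sorted chronologically as that API expects.
function toLogEvents(meetingEvents, now = Date.now()) {
  return meetingEvents
    .map((event) => ({
      // Assumes each event carries attributes.timestampMs; falls back to `now`.
      timestamp: (event.attributes && event.attributes.timestampMs) || now,
      message: JSON.stringify(event),
    }))
    .sort((a, b) => a.timestamp - b.timestamp);
}

// Convert the meeting start latency attribute into PutMetricData entries.
function toMetricData(meetingEvents) {
  return meetingEvents
    .filter((event) =>
      event.attributes && Number.isFinite(event.attributes.meetingStartDurationMs))
    .map((event) => ({
      MetricName: 'MeetingStartDurationMs', // assumed metric name
      Value: event.attributes.meetingStartDurationMs,
      Unit: 'Milliseconds',
    }));
}

const sampleEvents = [
  { name: 'meetingEnded', attributes: { timestampMs: 2000, meetingStartDurationMs: 1500 } },
  { name: 'meetingStartRequested', attributes: { timestampMs: 1000 } },
];
console.log(toLogEvents(sampleEvents).length, toMetricData(sampleEvents).length);
```

The resulting arrays could then be passed as the logEvents and MetricData parameters of the respective API calls.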

This post uses Amazon CloudWatch as an example, but you can integrate meeting events with other analysis and visualization services.

Deploying the solution

We have provided an AWS CloudFormation template in the “Amazon Chime SDK Samples” repository to provision the solution described in the preceding section. Some resources deployed by this stack incur costs when in use. To provision your resources, follow these instructions.

  1. Copy the contents of the AWS CloudFormation template from GitHub and save it in a new file named meeting-events-blog-template.yaml.
  2. Open the AWS CloudFormation console and choose Create stack.
  3. Choose Upload a template file, and then browse for the meeting-events-blog-template.yaml file.
  4. For Stack name, type amazon-chime-sdk-meeting-events-demo and choose Next.
  5. Review Specify stack details and choose Next. You can use the default values or specify your stack name and parameters.
  6. Review Configure stack options and choose Next. You can update the tags and permissions applied to the stack.
  7. Review the stack details, choose I acknowledge that AWS CloudFormation might create IAM resources, and then choose Create stack.

After the deployment is complete, you get an Amazon API Gateway endpoint for uploading meeting events and a link to an Amazon CloudWatch dashboard for viewing selected metrics and logs.

  1. On the AWS CloudFormation console, choose Outputs and take a note of the MeetingEventApiEndpoint value.
  2. Choose the MeetingEventDashboard link to open your Amazon CloudWatch dashboard. The dashboard is empty now. Once you send meeting events to the Amazon API Gateway endpoint in the next section, you see data like the following images:
  3. The dashboard also displays latency graphs and meeting events for specific incidents.

Uploading meeting events from your applications

This section includes sample code for Amazon Chime SDK for JavaScript. For Android and iOS applications, follow the steps in the meeting event guides.

Once you create a MeetingSession object, you can add an event observer to receive meeting events. The example in this post records every event, but keeps the meetingHistory attribute, trimmed to the last 5 minutes, only for the following failure events: audioInputFailed, videoInputFailed, meetingStartFailed, and meetingFailed.

To receive meeting events in your web application, add an audio-video observer that implements the eventDidReceive method. For more information about available events and attributes, see the Meeting Events guide.

let meetingEvents = [];

meetingSession.audioVideo.addObserver({
  eventDidReceive: (name, attributes) => {
    const { meetingHistory, ...otherAttributes } = attributes;
    switch (name) {
      case 'audioInputFailed':
      case 'videoInputFailed':
      case 'meetingStartFailed':
      case 'meetingFailed':
        meetingEvents.push({
          name,
          attributes: {
            ...otherAttributes,
            meetingHistory: meetingHistory.filter(({ timestampMs }) => {
              return Date.now() - timestampMs < 5 * 60 * 1000;
            }),
          },
        });
        break;
      default:
        meetingEvents.push({
          name,
          attributes: otherAttributes,
        });
        break;
    }
  }
});

When a meeting ends, you can make a POST request to upload meeting events to the endpoint you created. To do so, set the endpoint to the MeetingEventApiEndpoint value from the Outputs tab of the AWS CloudFormation console.

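// Replace the placeholder with the MeetingEventApiEndpoint value from the preceding section.
const API_ENDPOINT_URL = '<MeetingEventApiEndpoint>';

meetingSession.audioVideo.addObserver({
  audioVideoDidStop: () => {
    // Defer to a later task, which gives events dispatched around the same
    // time a chance to be recorded first.
    setTimeout(() => {
      if (meetingEvents.length > 0) {
        fetch(API_ENDPOINT_URL, {
          method: 'POST',
          body: JSON.stringify(meetingEvents)
        }).catch((error) => {
          // Avoid an unhandled promise rejection if the request fails.
          console.error('Failed to upload meeting events', error);
        });
        // The batch is cleared without waiting for the upload to succeed.
        meetingEvents = [];
      }
    }, 0);
  },
});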

If the user closes an application early or a meeting ends due to a poor internet connection, your application may fail to send meeting events. To handle these cases, you can store meeting events in persistent storage, such as the browser’s localStorage, and deliver them when the next available session ends.
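
One way to approach this is sketched below, assuming a storage object that offers the Web Storage getItem, setItem, and removeItem methods (such as window.localStorage). The storage key and the function names are hypothetical, and the sketch clears stored events only after the upload callback reports success.

```javascript
// Hypothetical storage key for events that have not been uploaded yet.
const PENDING_EVENTS_KEY = 'pendingMeetingEvents';

// Read stored events, tolerating a missing or corrupted entry.
function loadPendingEvents(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(PENDING_EVENTS_KEY) || '[]');
    return Array.isArray(parsed) ? parsed : [];
  } catch (error) {
    return [];
  }
}

// Append new events to whatever is already stored.
function savePendingEvents(storage, events) {
  const pending = loadPendingEvents(storage);
  storage.setItem(PENDING_EVENTS_KEY, JSON.stringify(pending.concat(events)));
}

// Upload stored events and remove them only after a successful response.
// `upload` is a caller-supplied function that returns a fetch-style response.
async function flushPendingEvents(storage, upload) {
  const pending = loadPendingEvents(storage);
  if (pending.length === 0) return 0;
  const response = await upload(pending);
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  storage.removeItem(PENDING_EVENTS_KEY);
  return pending.length;
}
```

In a browser, you might call savePendingEvents(localStorage, meetingEvents) as events arrive, and call flushPendingEvents(localStorage, (events) => fetch(API_ENDPOINT_URL, { method: 'POST', body: JSON.stringify(events) })) when a session ends.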

Collecting meeting events

Now your applications can upload meeting events to Amazon CloudWatch. In your application, run a few Amazon Chime SDK meetings to ensure that the dashboard displays data like the following image. It may take a few minutes for your meeting events to show up on the dashboard.

Once you confirm the dashboard works as expected, you can start collecting your users’ meeting events. If you encounter 500 internal server errors, check your AWS Lambda function logs and verify that the Amazon CloudWatch API requests are not throttled. For more information about Amazon CloudWatch quotas, see Amazon CloudWatch quotas and Amazon CloudWatch Logs quotas.
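
If throttling turns out to be occasional, one possible client-side mitigation is to retry the upload with exponential backoff. The sketch below is not part of the sample solution. The postWithRetry name, its parameters, and the default retry settings are assumptions, and fetchFn stands for any fetch-compatible function.

```javascript
// Hypothetical helper: POST `body` using `fetchFn` and retry on network
// errors or 5xx responses, doubling the wait before each retry.
async function postWithRetry(fetchFn, url, body, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (attempt > 0) {
      // With the defaults, wait 500 ms, then 1 s, then 2 s.
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    try {
      const response = await fetchFn(url, { method: 'POST', body });
      // Return successes and client errors; only 5xx responses are retried.
      if (response.status < 500) return response;
      lastError = new Error(`Server responded with ${response.status}`);
    } catch (error) {
      lastError = error; // network failure
    }
  }
  throw lastError;
}
```

A call such as postWithRetry(fetch, API_ENDPOINT_URL, JSON.stringify(meetingEvents)) could stand in for the direct fetch call. Keep the retry count small so that a persistent outage does not hold up the application.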

Investigating meeting join failures

If a user fails to join a meeting, your applications receive the meetingStartFailed event from the Amazon Chime SDK and publish it to Amazon CloudWatch. You can find this meeting event in the Meeting join failures widget of your Amazon CloudWatch dashboard.

  1. To open your Amazon CloudWatch dashboard, choose the MeetingEventDashboard link in the AWS CloudFormation console’s Outputs tab.
  2. Change the time range of the dashboard based on your user’s report.
  3. The Meeting join failures widget shows one or more failures given the time range.
  4. Expand a row to view all attributes recorded by the Amazon Chime SDK when an event occurs.
  5. Ensure that attributes.attendeeId and attributes.externalUserId match your user’s information.

Once you find a meeting event for a specific incident, you should first verify that the user attempted to join a meeting using a supported browser and operating system. See Amazon Chime SDK system requirements.

To investigate the details of the incident, review these two attributes.

  • attributes.meetingStatus — The Amazon Chime SDK status when your user failed to join the meeting. This attribute indicates a status code defined in MeetingSessionStatusCode. For example, if you see the MeetingEnded status, your user might have attempted to join a meeting that has already ended.

    In the Amazon Chime SDK for JavaScript, TaskFailed is a general error code triggered when a connection fails or is timed out. Review the attributes.meetingErrorMessage in the next step to learn the details.

    For more information about the status codes, see the following files in the GitHub repositories.

  • attributes.meetingErrorMessage — The full error message that explains why the meeting has failed. This attribute contains the same error message that the Amazon Chime SDK outputs in your applications. For example, if the user encounters a timeout error, the Amazon Chime SDK for JavaScript logs which task failed to complete in the web console.
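
If you export these events for offline triage, the two attributes can be combined into a short summary. The helper below is a hypothetical sketch that encodes only the two status codes discussed in this section, and it assumes that attributes.meetingStatus holds the status name as a string.

```javascript
// Hypothetical triage helper for a meetingStartFailed event.
function summarizeJoinFailure(event) {
  const { attendeeId, meetingStatus, meetingErrorMessage } = event.attributes || {};
  let hint;
  switch (meetingStatus) {
    case 'MeetingEnded':
      hint = 'The user may have tried to join a meeting that had already ended.';
      break;
    case 'TaskFailed':
      hint = 'General connection failure or timeout; read meetingErrorMessage for details.';
      break;
    default:
      // Any other code needs a manual lookup.
      hint = 'Look up this status code in MeetingSessionStatusCode.';
  }
  return { attendeeId, meetingStatus, meetingErrorMessage, hint };
}

console.log(summarizeJoinFailure({
  name: 'meetingStartFailed',
  attributes: { attendeeId: 'attendee-id', meetingStatus: 'MeetingEnded' },
}).hint);
```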

Troubleshooting poor audio quality

First, find the meetingFailed or meetingEnded event for the user who experienced an audio quality issue. Follow the previous section’s steps to search through the list of meetingFailed events in the Dropped attendees widget of your Amazon CloudWatch dashboard. The Amazon Chime SDK triggers the meetingFailed event when the user gets disconnected from a meeting with an error.

If your user leaves a meeting without an error, use the following steps to find the meetingEnded event.

  1. Hover over the Dropped attendees widget and choose View in CloudWatch Logs Insights from the menu.
  2. Change the time range in Amazon CloudWatch Logs Insights based on your user’s report.
  3. In the query editor, replace the first line with the following command. Make sure that you assign your user’s attendee ID to attributes.attendeeId.
    filter name = "meetingEnded" and attributes.attendeeId = "attendee-id"
  4. Run the query to find the meetingEnded event published from your user’s applications.
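
If support staff run this lookup often, the filter line from step 3 can be generated rather than typed by hand. The helper below is a hypothetical sketch. It assumes that attendee IDs contain only letters, digits, and hyphens, and it rejects anything else instead of trying to escape it.

```javascript
// Hypothetical helper that builds the filter line used in step 3.
function buildMeetingEndedFilter(attendeeId) {
  // Assumption: valid attendee IDs use only letters, digits, and hyphens.
  if (!/^[A-Za-z0-9-]+$/.test(attendeeId)) {
    throw new Error('Unexpected characters in attendee ID');
  }
  return `filter name = "meetingEnded" and attributes.attendeeId = "${attendeeId}"`;
}

console.log(buildMeetingEndedFilter('attendee-id'));
```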

Once you find a specific meeting event, review the following metric attributes to check if the user had a connectivity issue.

  • attributes.meetingStartDurationMs (Amazon Chime SDK for JavaScript only) — This is the time that elapsed between the meetingStartRequested event and the meetingStartSucceeded event. This attribute indicates how long an attendee takes to join a meeting.

    Compare this value to the Meeting start duration widgets in the dashboard, which display aggregate statistics from all users for the selected time range.
  • attributes.poorConnectionCount and attributes.retryCount — You can use these two metrics to determine how often the user has experienced poor audio or video quality during the meeting.

You can also use the meeting history attribute to troubleshoot an audio quality issue.

  • attributes.meetingHistory — A list of user actions and events since the creation of the MeetingSession object. In this post, you upload the last 5 minutes of the meeting history. Take a look at these meeting history states related to the connectivity of an Amazon Chime SDK meeting.

    The meetingReconnected state in the history tells you that the Amazon Chime SDK has restarted a meeting due to connection issues. A poor network connection can affect audio quality in a call.

    The Amazon Chime SDK for JavaScript also emits the receivingAudioDropped state in the meeting history when a significant number of audio packets drop. Excessive packet loss can decrease audio quality. When a signaling connection is not reliable, you should be able to observe the signalingDropped state in the meeting history.
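
To scan many events at once, the connectivity attributes and history states described above can be reduced to a few counters. The following sketch uses a hypothetical summarizeConnectivity helper and assumes that each meeting history entry has a name field alongside timestampMs.

```javascript
// History states that this section associates with connectivity problems.
const CONNECTIVITY_STATES = ['meetingReconnected', 'receivingAudioDropped', 'signalingDropped'];

// Hypothetical helper: count connectivity-related states in a meetingEnded
// or meetingFailed event and report them with the two counters.
function summarizeConnectivity(event) {
  const { meetingHistory = [], poorConnectionCount = 0, retryCount = 0 } = event.attributes || {};
  const stateCounts = {};
  for (const state of CONNECTIVITY_STATES) {
    stateCounts[state] = 0;
  }
  for (const { name } of meetingHistory) {
    if (CONNECTIVITY_STATES.includes(name)) {
      stateCounts[name] += 1;
    }
  }
  return { poorConnectionCount, retryCount, stateCounts };
}

console.log(summarizeConnectivity({
  name: 'meetingEnded',
  attributes: {
    poorConnectionCount: 2,
    retryCount: 1,
    meetingHistory: [
      { name: 'meetingStartSucceeded', timestampMs: 1000 },
      { name: 'receivingAudioDropped', timestampMs: 2000 },
      { name: 'meetingReconnected', timestampMs: 3000 },
    ],
  },
}));
```

A nonzero count does not prove that the network caused the audio problem, but it indicates which sessions to examine first.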

Monitoring microphone and camera setup failures

When other meeting attendees cannot hear audio from a specific user, start with two checks. First, ensure that the user has not muted their audio input. Second, make sure the application has chosen a microphone that is available on the user’s local machine. Note that the Amazon Chime SDK for JavaScript allows attendees to join a meeting without audio input.

If your user still has an audio problem, search for the audioInputFailed event in the Audio and video input failures widget of your Amazon CloudWatch dashboard. You can filter audioInputFailed events by a specific attendee ID.

Similar to the process you used to troubleshoot an audio issue, inspect the videoInputFailed event for camera setup failures. For the audioInputFailed and videoInputFailed events, these two attributes help you identify device errors.

  • attributes.audioInputErrorMessage and attributes.videoInputErrorMessage — You can use these attributes to identify error messages captured in the user’s client logs when microphone or camera selection fails. In the Amazon Chime SDK for JavaScript, this error message could indicate getUserMedia API errors. For example, if you see NotAllowedError in Firefox, the user has denied microphone permission. NotReadableError means that another application or browser tab might already be using the camera.
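
The two getUserMedia errors mentioned above can be turned into a first-pass explanation for support staff. The helper below is a hypothetical sketch. It assumes that the error name appears somewhere in the recorded message, and it reports any other message unchanged.

```javascript
// Hypothetical helper for audioInputFailed and videoInputFailed events.
function describeInputFailure(event) {
  const { audioInputErrorMessage, videoInputErrorMessage } = event.attributes || {};
  const message = audioInputErrorMessage || videoInputErrorMessage || '';
  const device = event.name === 'videoInputFailed' ? 'camera' : 'microphone';
  if (message.includes('NotAllowedError')) {
    return `Permission to use the ${device} was denied.`;
  }
  if (message.includes('NotReadableError')) {
    return `Another application or browser tab may be using the ${device}.`;
  }
  return `Unrecognized ${device} error: ${message || 'no error message recorded'}`;
}

console.log(describeInputFailure({
  name: 'audioInputFailed',
  attributes: { audioInputErrorMessage: 'NotAllowedError: Permission denied' },
}));
```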

Cleaning up

To avoid incurring any unintended charges, delete the AWS CloudFormation stack that you created.

Amazon CloudWatch does not support deleting metrics; instead, metrics expire according to their retention schedules. Metrics are charged on a prorated basis, so you are not charged simply because they still exist. For more information, see Amazon CloudWatch FAQs.

Conclusion

This post walked through how to publish meeting events from your applications to Amazon CloudWatch. You also learned how to help your users troubleshoot session failures, audio quality, and device setup issues. When you use meeting events in combination with the Amazon Chime SDK client logs, you get a more complete picture of your applications’ health and performance.

As a next step, we suggest that you customize your Amazon CloudWatch dashboard with other meeting events and attributes not covered in this post. For more information about available events and attributes, see the meeting event guides for the Amazon Chime SDK for JavaScript, Amazon Chime SDK for Android, and Amazon Chime SDK for iOS.

Kyu Simm

Kyu Simm is a Software Engineer on the Amazon Chime team. He is passionate about building web and mobile applications with AWS services.