AWS for M&E Blog

How to use Amazon Rekognition Video for product placement in video

Product placement in video is not a new concept. In fact, the first known occurrence dates back to 1927, when Wings, the first movie to win a Best Picture Oscar, included a scene where a chocolate bar is eaten, followed by a long close-up of the chocolate’s logo. Imagine if viewers in 1927 could have bought those chocolates right there and then!

The proposed solution combines two worlds that exist separately today: video consumption and online shopping. In this solution, we use AWS services such as Amazon Rekognition Video, AWS Lambda, Amazon API Gateway, and Amazon Simple Storage Service (Amazon S3).

In this post, we demonstrate how to use Rekognition Video and other services to extract labels from videos. We then review how to display the extracted video labels as hyperlinks in a simple webpage.

Solution overview

This workflow uses AWS Lambda to trigger Rekognition Video, which processes a video file when the file is dropped into an Amazon S3 bucket and performs label extraction on that video. The extracted labels are then saved to the S3 bucket as a JSON file (see Appendix A for a JSON file snippet). The workflow also updates an index file in JSON format that stores metadata of the video files processed.

On the video consumption side, we built a simple web application that makes REST API calls to API Gateway. The web application is a static web application hosted on S3 and served through Amazon CloudFront. When the page loads, the index of videos and their metadata is retrieved through a REST API call. The index file contains the list of video titles, their relative paths in S3, the GIF thumbnail path, and the labels JSON path. The source of the index file is in S3 (see Appendix A for an index JSON file snippet).

GIF previews are available in the web application. When you select the GIF preview, the video loads and plays on the webpage. The GIF, video files, and other static content are served through S3 via CloudFront. The web application makes a REST GET method request to API Gateway to retrieve the labels, which loads the content from the JSON file that was previously saved in S3.

As you interact with the video (mouse-over), labels begin to show underneath the video and as rectangles on the video itself.

You can pause the video and select a label (for example, “laptop”, “sofa”, or “lamp”), and you are taken to amazon.com with a list of similar items for sale (laptops, sofas, or lamps).

The following diagram illustrates the process in this post.


The workflow is as follows:

1. A video file is uploaded into the S3 bucket.

2. The file upload to S3 triggers the Lambda function.

3. Lambda in turn invokes Rekognition Video to start label extraction, while also triggering MediaConvert to extract 20 JPEG thumbnails (used later to create a GIF for the video preview).

4. Once label extraction is complete, an SNS notification is sent via email and is also used to invoke the Lambda function.

5. That Lambda function in turn triggers another Lambda function that stitches the JPEG thumbnails into a GIF, while also dropping the labels JSON file into the S3 bucket.

6. Lambda places the labels JSON file into S3 and updates the index JSON, which contains metadata of all available videos. The GIF file is also placed into the S3 bucket. At this point, the following components exist in S3:

a. Original video
b. Labels JSON file
c. Index JSON file
d. JPEG thumbnails
e. GIF preview

7. Content is requested in the webpage through the browser.

8. The request is sent to API Gateway and the CloudFront distribution.

9. The request to API Gateway is passed as a GET method to a Lambda function, which in turn retrieves the JSON files from S3 and sends them back to API Gateway as a response. CloudFront sends a request to the origin to retrieve the GIF files and the video files. Caching can be used to reduce latency by not going to the origin (S3 bucket) if the requested content is already available in CloudFront.

10. Responses are sent back through API Gateway and CloudFront: the JSON files, and the GIF and video files, respectively.

11. Content and labels are now available to the browser and web application.

Steps

1. Create S3 Bucket

An Amazon S3 bucket is used to host the video files and the JSON files.

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases. Customers use it for websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics.

In this solution, the input video files, the label files, thumbnails, and GIFs are placed in one bucket. However, they will be organized into different folders within the bucket.
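To illustrate, the bucket contents might be organized as follows (the input-files and thumbs/gif prefixes appear later in this post; the other names are placeholders):

my-bucket/
  input-files/   - original .mp4 video files
  labels/        - labels JSON files written after Rekognition completes
  thumbs/gif/    - JPEG thumbnails and stitched GIF previews
  index.json     - index JSON of processed videos and their asset paths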

As part of our account security policies, S3 public access is set to off, and access to content is made available through a CloudFront distribution.

a. From the AWS Management Console, search for S3:

A screenshot of the AWS Management Console, typing S3 under Find Services to create a new S3 bucket.

b. Choose Create Bucket:

A screenshot of AWS Console showing S3 Buckets

c. Provide a Bucket name and choose your Region:

A screenshot of the AWS Console showing the Amazon S3 Create bucket window. Under the Bucket name field, a new bucket name is entered: newbucket-may-2020. The Region field shows US East (N. Virginia) us-east-1.

d. Keep all other settings as is, and choose Create bucket:

A screenshot of AWS Console showing the security settings of the bucket created, where all public access is blocked. The window also shows the create bucket button that should be clicked.

e. Choose the newly created bucket in the bucket dashboard:

A screenshot of AWS Console showing the newly created bucket

f. Select Create Folder in the bucket:

A screenshot of AWS Console where a new folder can be created within the new bucket

g. Give your folder a name and then choose Save:

A screenshot of the AWS Console showing the field to input the name of the new folder. Here it is called input-files. The encryption setting of the object is set to None. There is a Save button that should then be clicked.

h. Create S3 Bucket Policy:

The following policy enables CloudFront to access and get bucket contents. We describe how to create the CloudFront origin access identity later in the post.

{
    "Version": "2008-10-17",
    "Id": "PolicyForCloudFrontPrivateContent",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity xxxxxxxxxxxxxx"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*"
        }
    ]
}


You are now ready to upload video files (.mp4) into S3.
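Files can be uploaded through the console or programmatically. A minimal boto3 sketch, assuming the bucket and folder created above (the local file name is a placeholder):

import boto3

s3 = boto3.client('s3')
# Uploading into the input-files folder later triggers the Lambda pipeline
s3.upload_file('myvideo.mp4', 'newbucket-may-2020', 'input-files/myvideo.mp4')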

2. Configure Simple Notification Service (SNS):

SNS is a key part of this solution: we use it to send notifications when the label extraction job in Rekognition either completes successfully or fails. It also invokes Lambda to write the labels into S3. Subscriptions to the notifications are set up via email.

Amazon Simple Notification Service (Amazon SNS) is a web service that sets up, operates, and sends notifications from the cloud. It provides developers with a highly scalable, flexible, and cost-effective capability to publish messages from an application and immediately deliver them to subscribers or other applications.

a. In the Management Console, choose Simple Notification Service
b. Select Topics from the pane on the left-hand side
c. Choose Create topic:

A screenshot of the AWS management console, showing Amazon SNS window. There is a create topic button to be selected

d. Add a name to the topic and select Create topic
e. Now a new topic has been created, but currently has no subscriptions. Choose Create subscription:

A screenshot of AWS console for SNS showing the new topic has been created, Create subscription button is to be selected

f. In the Protocol selection menu, choose Email:

A screenshot of the Create subscription window, with a drop-down menu for Protocol. Email is selected.

g. Within the Endpoint section, enter the email address that you want to receive SNS notifications, then select Create subscription:

A screenshot of the Create subscription window. The Endpoint field includes the email address myemail@example.com. There is a Create subscription button to be selected.

The following is a sample notification email from SNS, confirming success of video label extraction:

A screenshot showing an email snippet from SNS, highlighting the topic ARN, status SUCCESS, and the unsubscribe URL.
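The topic and email subscription can also be created programmatically. A minimal boto3 sketch (the topic name and email address are placeholders):

import boto3

sns = boto3.client('sns')
# Create the topic that receives Rekognition job-status messages
topic = sns.create_topic(Name='rekognition-job-status')
# Email endpoints must confirm the subscription from their inbox
sns.subscribe(TopicArn=topic['TopicArn'],
              Protocol='email',
              Endpoint='myemail@example.com')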

3. Create Lambda Functions:

For this solution we created five Lambda functions, described in the following table:

Table describing five different Lambda functions

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume – there is no charge if your code is not running. With Lambda, you can run code for virtually any type of application or backend service—all with zero administration. You upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

3.1. Lambda Function 1:

Lambda Function 1 achieves two goals. First, it triggers Amazon Rekognition Video to start label detection on the input video file. Second, it invokes Lambda Function 3 to trigger AWS Elemental MediaConvert to extract JPEG images from the video. We stitch these together into a GIF file later on to create an animated video preview. It is worth noting that this function sets MinConfidence for extracted labels to 75; changing this value affects how many labels are extracted.

Amazon Rekognition Video is a deep learning powered video analysis service that detects activities, understands the movement of people in frame, and recognizes people, objects, celebrities, and inappropriate content from your video stored in Amazon S3. Results are paired with timestamps so that you can easily create an index to facilitate highly detailed video search.

AWS Elemental MediaConvert is a file-based video transcoding service with broadcast-grade features. It allows you to focus on delivering compelling media experiences without having to worry about the complexity of building and operating your own video processing infrastructure.

a. To create the Lambda function, go to the Management Console and find Lambda.
b. Add the S3 bucket created in Step 1 as the trigger

A screenshot of Lambda Console for Lambda function # 1. The window demonstrates the configuration section of the console and S3 added as the trigger.

c. Add Execution Role:

A screenshot of Lambda's console window demonstrating execution role and the role name.

 

From AWS Identity and Access Management (IAM), this role includes full access to Rekognition, Lambda, and S3.

d. Code snippets:

import json
import boto3

def start_label_detection(bucketname, key):
    # Start an asynchronous Rekognition Video label detection job; completion
    # is published to the SNS topic configured in NotificationChannel.
    client = boto3.client('rekognition')
    response = client.start_label_detection(
        Video={
            'S3Object': {
                'Bucket': bucketname,
                'Name': key
            }
        },
        MinConfidence=75,
        NotificationChannel={
            'SNSTopicArn': 'arn:aws:sns:sns-arn',
            'RoleArn': 'arn:aws:iam-arn:role/rekognition-output-to-sns'
        }
    )
    print("label response: ", response)


def invoke_mediaconvert(bucket, key):
    # Asynchronously invoke Lambda Function 3, which submits the MediaConvert
    # job that extracts the JPEG thumbnails.
    client = boto3.client('lambda')
    response = client.invoke(
        FunctionName='Function-name',
        InvocationType='Event',
        LogType='Tail',
        Payload=json.dumps({"bucket": bucket, "key": key})
        )


def lambda_handler(event, context):
    # Handler sketch (not shown in the original snippet): for each uploaded
    # object, start label detection and thumbnail extraction.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        start_label_detection(bucket, key)
        invoke_mediaconvert(bucket, key)

e. Configure test events to test the code
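When configuring a test event, a minimal S3 put event (the bucket and key are placeholders) has the following shape:

{
  "Records": [
    {
      "s3": {
        "bucket": { "name": "newbucket-may-2020" },
        "object": { "key": "input-files/myvideo.mp4" }
      }
    }
  ]
}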

3.2. Lambda Function 2

The second Lambda function achieves a set of goals:

  • Invokes Lambda Function 4, which converts the JPEG images to a GIF.
  • Triggers SNS in the event of a label detection job failure.
  • Writes the labels (extracted through Rekognition) as JSON to the S3 bucket.
  • Creates a JSON tracking file in S3 that contains a list pointing to the input video path, metadata JSON path, labels JSON path, and GIF file path.

a. To create the Lambda function, go to the Management Console and find Lambda.
b. Add the SNS topic created in Step 2 as the trigger:

A screenshot of Lambda Console for Lambda function # 2. The window demonstrates the configuration section of the console and SNS added as the trigger.

c. Add environment variables pointing to the S3 Bucket, and the prefix folder within the bucket:

A description of the Environment Variables from Lambda console.

d. Add Execution Role, which includes access to S3 bucket, Rekognition, SNS, and Lambda.

A screenshot of Lambda's console window showing execution role and the role name.

e. Lambda function code snippets:

import json
import boto3
from botocore.errorfactory import ClientError
import os


#Invoke Lambda function: giftranscode
def invoke_gif(bucket, key, jobID):
    # Invokes another Lambda function (giftranscode) that stitches the JPEG
    # thumbnails into a GIF file.
    client = boto3.client('lambda')
    response = client.invoke(
        FunctionName='giftranscode',
        InvocationType='Event',
        LogType='Tail',
        Payload=json.dumps({"bucket": bucket, "key": key, "jobID": jobID})
        )


    
def shrinkLabels(labels):
    # Keep only the labels that carry bounding box instances, since those are
    # the ones we can draw on the video.
    outlabels = []
    for item in labels:
        instances = item.get("Label", {}).get("Instances", [])
        if instances:
            outlabels.append(item)
    return outlabels
    
def WriteObjectToS3AsJson(thisObject, bucket, key):
    client = boto3.client('s3')
    jsonobject = json.dumps(thisObject)
    response = client.put_object(Body=bytes(jsonobject, 'utf-8'),Bucket=bucket, Key = key)

def ReadFileAsJsonFromS3(bucket, key):
    client = boto3.client('s3')
    try:
        response = client.get_object(Bucket=bucket,Key=key)
    except ClientError:
        return []
    jsonasRawText = response['Body'].read().decode('utf-8')
    loadedDoc = json.loads(jsonasRawText)
    return loadedDoc

def AddUpdateProjectTracking(newObject):
    # Update the video's metadata entry in the index JSON file;
    # if the video is not yet in the index, it gets added.
    bucket = os.environ["indexbucket"]
    key = os.environ["indexkey"]
    currentList = ReadFileAsJsonFromS3(bucket, key)
    videosList = []
    newList = []
    for item in reversed(currentList):
        if item["rawvideopath"] != newObject["rawvideopath"]:
            if item["rawvideopath"] not in videosList:
                newList.append(item)
                videosList.append(item["rawvideopath"])
    newList.append(newObject)
    WriteObjectToS3AsJson(newList, bucket, key)

# Note: SNSfailure and get_label_detection are not shown in this snippet;
# get_label_detection fetches the Rekognition results, writes the labels JSON,
# and calls AddUpdateProjectTracking (a sketch follows after this snippet).
def lambda_handler(event, context):
    print(json.dumps(event))
    for record in event['Records']:
        Message=json.loads(record['Sns']['Message'])
        print(Message)
        s3bucket=Message['Video']['S3Bucket']
        jobid=Message['JobId']
        status=Message['Status']
        if status != 'SUCCEEDED':
            SNSfailure (Message)
        else: 
            s3objectname=Message['Video']['S3ObjectName']
            get_label_detection(jobid, s3bucket, s3objectname, SortBy='TIMESTAMP')
            invoke_gif(s3bucket, s3objectname, jobid)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
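For reference, a minimal sketch of the get_label_detection helper called by the handler might look like the following. The labels prefix and the tracking-entry fields other than rawvideopath are illustrative assumptions:

def get_label_detection(jobid, bucket, key, SortBy='TIMESTAMP'):
    # Page through the results of the completed Rekognition job.
    client = boto3.client('rekognition')
    labels = []
    paging = {}
    while True:
        response = client.get_label_detection(JobId=jobid, SortBy=SortBy, **paging)
        labels.extend(response['Labels'])
        if 'NextToken' not in response:
            break
        paging = {'NextToken': response['NextToken']}
    # Keep only labels that carry bounding boxes, then write them to S3.
    labelskey = "labels/" + key + ".json"  # assumed output prefix
    WriteObjectToS3AsJson(shrinkLabels(labels), bucket, labelskey)
    # Add or update this video's entry in the index JSON.
    newTrackObject = {
        "rawvideopath": key,
        "labelspath": labelskey,
        "gifpath": "thumbs/gif/" + key + "/preview.gif"  # assumed path
    }
    AddUpdateProjectTracking(newTrackObject)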

f. Configure Test events to test the code.

3.3. Lambda Function 3:

This function triggers AWS Elemental MediaConvert to extract JPEG thumbnails from video input file.
MediaConvert is triggered through Lambda. Some of the key settings are:

  • Number of JPEG captures: 20
  • Width: 266
  • Height: 150
  • Frame Capture Settings: 1/10 [FramerateNumerator / FramerateDenominator]: this means that MediaConvert takes the first frame, then one frame every 10 seconds.
  • Quality: 80

a. To create the Lambda function, go to the Management Console and find Lambda.
b. This Lambda function is triggered by another Lambda function (Lambda Function 1), so there is no need to add a trigger here.
c. Add Environment Variables: Bucket name, and the subfolder prefix within the bucket for where the JPEG images will go:

A screenshot of the Environment Variables from Lambda console.

d. Add Execution Role that includes access to S3, MediaConvert, and CloudWatch.

A screenshot demonstrating the Execution role of Lambda function #3.

e. Code Snippets:

import json
import boto3
import os

def convert_to_jpg(bucketname, key):
  # Submit a MediaConvert job that extracts JPEG frame captures from the video.
  client = boto3.client('mediaconvert', endpoint_url=os.environ["mediaconvertendpoint"])
  response = client.create_job(

    Queue= "arn:aws:mediaconvert:mediaconvert-arn:queues/Default",
    UserMetadata= {},
    Role= os.environ["mediaconvertrole"],
    Settings= {
      "OutputGroups": [
        {
          "CustomName": "thumbnails",
          "Name": "File Group",
          "Outputs": [
            {
              "ContainerSettings": {
                "Container": "RAW"
              },
              "VideoDescription": {
                "Width": 266,
                "ScalingBehavior": "DEFAULT",
                "Height": 150,
                "TimecodeInsertion": "DISABLED",
                "AntiAlias": "ENABLED",
                "Sharpness": 50,
                "CodecSettings": {
                  "Codec": "FRAME_CAPTURE",
                  "FrameCaptureSettings": {
                    "FramerateNumerator": 1,
                    "FramerateDenominator": 10,
                    "MaxCaptures": 20,
                    "Quality": 80
                  }
                },
                "DropFrameTimecode": "ENABLED",
                "ColorMetadata": "INSERT"
              },
              "Extension": "jpg"
            },
            {
              "ContainerSettings": {
                "Container": "MP4",
                "Mp4Settings": {
                  "CslgAtom": "INCLUDE",
                  "CttsVersion": 0,
                  "FreeSpaceBox": "EXCLUDE",
                  "MoovPlacement": "PROGRESSIVE_DOWNLOAD"
                }
              },
              "VideoDescription": {
                "Width": 128,
                "ScalingBehavior": "DEFAULT",
                "Height": 96,
                "TimecodeInsertion": "DISABLED",
                "AntiAlias": "ENABLED",
                "Sharpness": 50,
                "CodecSettings": {
                  "Codec": "H_264",
                  "H264Settings": {
                    "InterlaceMode": "PROGRESSIVE",
                    "NumberReferenceFrames": 3,
                    "Syntax": "DEFAULT",
                    "Softness": 0,
                    "GopClosedCadence": 1,
                    "GopSize": 90,
                    "Slices": 1,
                    "GopBReference": "DISABLED",
                    "SlowPal": "DISABLED",
                    "SpatialAdaptiveQuantization": "ENABLED",
                    "TemporalAdaptiveQuantization": "ENABLED",
                    "FlickerAdaptiveQuantization": "DISABLED",
                    "EntropyEncoding": "CABAC",
                    "Bitrate": 200000,
                    "FramerateControl": "INITIALIZE_FROM_SOURCE",
                    "RateControlMode": "CBR",
                    "CodecProfile": "MAIN",
                    "Telecine": "NONE",
                    "MinIInterval": 0,
                    "AdaptiveQuantization": "HIGH",
                    "CodecLevel": "AUTO",
                    "FieldEncoding": "PAFF",
                    "SceneChangeDetect": "ENABLED",
                    "QualityTuningLevel": "SINGLE_PASS",
                    "FramerateConversionAlgorithm": "DUPLICATE_DROP",
                    "UnregisteredSeiTimecode": "DISABLED",
                    "GopSizeUnits": "FRAMES",
                    "ParControl": "INITIALIZE_FROM_SOURCE",
                    "NumberBFramesBetweenReferenceFrames": 2,
                    "RepeatPps": "DISABLED",
                    "DynamicSubGop": "STATIC"
                  }
                },
                "AfdSignaling": "NONE",
                "DropFrameTimecode": "ENABLED",
                "RespondToAfd": "NONE",
                "ColorMetadata": "INSERT"
              },
              "Extension": "mp4"
            }
          ],
          "OutputGroupSettings": {
            "Type": "FILE_GROUP_SETTINGS",
            "FileGroupSettings": {
              #"Destination": "s3://felicitous-bucket/thumbs/gif/$fn$/"
              "Destination": "s3://my-bucket/thumbs/gif/$fn$/"
            }
          }
        }
      ],
      "AdAvailOffset": 0,
      "Inputs": [
        {
          "InputClippings": [
            {
              "EndTimecode": "00:05:00:00",
              "StartTimecode": "00:01:00:00"
            }
          ],
          "AudioSelectors": {
            "Audio Selector 1": {
              "Offset": 0,
              "DefaultSelection": "DEFAULT",
              "ProgramSelection": 1
            }
          },
          "VideoSelector": {
            "ColorSpace": "FOLLOW",
            "Rotate": "DEGREE_0",
            "AlphaBehavior": "DISCARD"
          },
          "FilterEnable": "AUTO",
          "PsiControl": "IGNORE_PSI",
          "FilterStrength": 0,
          "DeblockFilter": "DISABLED",
          "DenoiseFilter": "DISABLED",
          "TimecodeSource": "ZEROBASED",
          #"FileInput": "s3://felicitous-bucket/raw-video-input/BigBangTheory.mp4"
          "FileInput": "s3://"+ bucketname + "/" + key
        }
      ]
    },
    AccelerationSettings= {
      "Mode": "DISABLED"
    },
    StatusUpdateInterval= "SECONDS_60",
    Priority= 0

    
    )
  print(response)
    
    
def lambda_handler(event, context):
    print(json.dumps(event))
    bucket=event['bucket']
    key=event['key']
    convert_to_jpg(bucketname=bucket,key=key)


The code references the following environment variables:

Name: mediaconvertendpoint
Value: https://xxxxxx.mediaconvert.us-east-1.amazonaws.com

Name: mediaconvertrole
Value: arn:aws:iam::xxxxxxxxxx:role/MediaConvert-role


f. Configure Test Event to test the code

3.4. Lambda Function 4:

This Lambda function converts the extracted JPEG thumbnail images into a GIF file and stores it in S3 bucket.

a. To create the Lambda function, go to the Management Console and find Lambda.
b. This Lambda function is triggered by another Lambda function (Lambda Function 2), so there is no need to add a trigger here.
c. Add Execution Role for S3 bucket access

Demonstration of the Execution role of Lambda function #4 JPEG2GIF
d. Code Snippets:

import os
import imageio 
import boto3
import json


def downloadFile(bucket, key, filename):
    # Download a single object from S3 to a local file.
    s3 = boto3.client('s3')
    s3.download_file(bucket, key, filename)


def uploadFile(path, bucket, key):
    s3 = boto3.resource('s3')
    s3.meta.client.upload_file(path, bucket, key)
    print("uploaded: ", bucket, key)

def generateGif(inputimages, outpath):
    # Stitch the downloaded JPEG frames into an animated GIF.
    images = []
    for inputimage in inputimages:
        images.append(imageio.imread(inputimage))
    imageio.mimsave(outpath, images)
    return outpath
    
def downloadImagesLocally(bucket, keys):
    s3 = boto3.resource('s3')
    counter = 0
    images = []
    for key in keys:
        print(key)
        file = "/tmp/" + str(counter) + ".jpg"
        s3.meta.client.download_file(bucket, key, file)  
        counter += 1
        images.append(file)
    print("images:  ", images)
    return images
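The snippet above omits the handler that ties these helpers together. A minimal sketch, assuming Lambda Function 2 passes the bucket, key, and job ID, and that the thumbnails sit under the thumbs/gif/<video name>/ prefix used in the MediaConvert job settings:

def lambda_handler(event, context):
    # Payload from Lambda Function 2: {"bucket": ..., "key": ..., "jobID": ...}
    bucket = event['bucket']
    key = event['key']
    # List the JPEG captures MediaConvert wrote for this video (assumed prefix).
    prefix = "thumbs/gif/" + os.path.splitext(os.path.basename(key))[0] + "/"
    s3 = boto3.client('s3')
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    keys = [o['Key'] for o in listing.get('Contents', []) if o['Key'].endswith('.jpg')]
    # Download the frames, stitch them into a GIF, and upload the result.
    images = downloadImagesLocally(bucket, keys)
    gifpath = generateGif(images, "/tmp/preview.gif")
    uploadFile(gifpath, bucket, prefix + "preview.gif")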

e. Configure Test event to test the code.

3.5. Lambda Function 5:

This Lambda function returns the JSON files to API Gateway as a response to a GET request made to the API Gateway.

a. To create the Lambda function, go to the Management Console and find Lambda.
b. Add API Gateway as the trigger:

A screenshot showing the configuration window of Lambda function # 5 FetchJSON. API Gateway is set as the trigger.
c. Add Execution Role for S3 bucket access and Lambda execution.

A screenshot showing the Execution role of Lambda function #5
d. Code snippet

import json
import boto3

def fetchfile (bucket, key):
    client = boto3.client('s3')
    response = client.get_object(
        Bucket= bucket,
        Key= key
    )
    jsonasRawText = response['Body'].read().decode('utf-8')
    loadedDoc = json.loads(jsonasRawText)
    return loadedDoc


def lambda_handler(event, context):
    print(json.dumps(event))
    bucket = 'felicitous-bucket'  # the S3 bucket that holds the JSON files
    key = event['queryStringParameters']['jsonpath']
    
    document = fetchfile (bucket, key)
    print("response: ", document)
    return {
        'statusCode': 200,
        'headers': 
            {
            'Access-Control-Allow-Origin' : '*', # Required for CORS support to work
            'Access-Control-Allow-Credentials' : True # Required for cookies, authorization headers with HTTPS 
          },        
        'body': json.dumps(document)
    }

e. Configure Test event to test the code.

4. Create CloudFront Distribution:

In this section, we create a CloudFront distribution that enables you to access the video files in the S3 bucket securely, while reducing latency. The origin for CloudFront is the S3 bucket created in step 1.

Amazon CloudFront is a web service that gives businesses and web application developers a way to distribute content with low latency and high data transfer speeds. Like other AWS services, Amazon CloudFront is a self-service, pay-per-use offering, requiring no long-term commitments or minimum fees. With CloudFront, your files are delivered to end-users using a global network of edge locations.

a. In the Management Console, find and select CloudFront.
b. Under Distributions, select Create Distribution

A screenshot showing CloudFront Distributions configuration window.

c. Select Web as the delivery method for the CloudFront distribution, and select Get Started. We choose Web over RTMP because we want to deliver media content stored in S3 using HTTPS.
d. Configure basic Origin Settings:

i. Origin Domain Name: for example, newbucket-may-2020.s3.amazonaws.com
ii. Origin ID: Custom-newbucket-may-2020.amazonaws.com
iii. Origin Protocol Policy: HTTPS Only
iv. Viewer Protocol Policy: Redirect HTTP to HTTPS

A screenshot showing CloudFront Create Distributions configuration window.

A screenshot showing the CloudFront Default Cache Behavior Settings configuration window. Viewer Protocol Policy is set to Redirect HTTP to HTTPS, and Allowed HTTP Methods is set to GET, HEAD.

5. Configure API Gateway:

In this solution, when a viewer selects a video, content is requested in the webpage through the browser, and the request is then sent to API Gateway and the CloudFront distribution. The request to API Gateway is passed as a GET method to a Lambda function, which in turn retrieves the JSON files from S3 and sends them back to API Gateway as a response.

This is key as the solution scope expands and becomes more dynamic, and to enable retrieval of metadata that can be stored in databases such as DynamoDB. Amazon API Gateway provides developers with a simple, flexible, fully managed, pay-as-you-go service that handles all aspects of creating and operating robust APIs for application back ends. With API Gateway, you can launch new services faster and with reduced investment so you can focus on building your core business services.

a. In the Management Console, find and select API Gateway
b. In the API Gateway console, select Create API:

A screenshot showing API Gateway Create API configuration window.

c. Choose REST API and select Build:

A screenshot showing the API Gateway configuration window. Choose an API type, REST API, followed by clicking the Build button.

d. From Actions menu, choose Create method and select GET as the method of choice:

A screenshot showing API Gateway configuration window. Choosing GET as the method for Actions.

e. Choose Lambda as the Integration point, and select your Region and the Lambda function to integrate with. A list of your existing Lambda functions will come up as you start typing the name of the Lambda function that will retrieve the JSON files from S3. Then choose Save.

A screenshot showing API Gateway configuration window to setup GET method. Choosing Integration type to be Lambda Function. Choosing Lambda function from drop down list.

f. Once you choose Save, a window appears that shows the different stages of the GET method execution. This enables you to edit each stage if needed, in addition to testing by selecting the Test button (optional).

A screenshot showing API Gateway configuration window for GET Method execution. Choosing Method Request block.

g. Select the Method Request block, and add a new query string: jsonpath.

h. Choose the Integration Request block, and select the Use Lambda Proxy Integration box.

A screenshot showing API Gateway configuration window for GET Method Integration. Choosing the Use Lambda Proxy Integration option.

 

i. Next, select the Actions tab and choose Deploy API to create a new stage. In the pop-up, enter the Stage name as “production” and Stage description as “Production”. Select the Deploy button.
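Once deployed, the GET method can be exercised directly. A quick test sketch using the requests library (the API ID and JSON key are placeholders):

import requests

api = "https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/production"
# The Lambda proxy integration reads the jsonpath query string parameter
response = requests.get(api, params={"jsonpath": "labels/myvideo.json"})
print(response.json())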

6. Web User Interface (UI):

This section describes how to create a simple web interface that looks similar to the following:

A screenshot showing the web browser with the main video and multiple video thumbnails displayed.

The client-side UI is built as a web application that creates a player for the video file and GIF file, and exposes the labels present in the JSON file.

Creating GIFs as a preview of the video is optional; simple images or links can be used instead.

To achieve this, the application makes a request to render video content; this request goes through CloudFront and API Gateway. The response includes the video file, in addition to the index JSON and labels JSON files.

The application then runs through the labels JSON file, looking for labels with bounding box coordinates, and overlays the video with rectangular bounding boxes by matching the timestamps. It also displays the labels as hyperlinks underneath the video, enabling viewers to interact with products and directing them to an eCommerce website immediately.

Labels are exposed only on mouse-over, to ensure a seamless experience for viewers.

The output of the rendering looks similar to the following.

A screenshot showing a frame from a video with labels for Couch and Person displayed underneath the video.

By selecting any of the extracted labels, for example ‘Couch’, the webpage navigates to https://www.amazon.com/s?k=Couch, displaying couches as search results:

A screenshot showing amazon.com page displaying couches.

HTML, CSS, and JavaScript snippets:

<html lang="en">

<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>GEM Streaming</title>

  <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.js"></script>

  <link rel="stylesheet" type="text/css" href="/css/result-light.css">

  <link rel="stylesheet" type="text/css"
    href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.11.2/css/all.min.css">


  <link rel="stylesheet" href="style.css" />

  <style id="compiled-css" type="text/css">
    #c {
      position: absolute;
      left: 9px;
      top: 1500px;
      z-index: 100;
      pointer-events: none;
    }

    #v {
      position: absolute;
      left: 25px;
      top: 1500px;
      z-index: 1;
    }

    #labels {
      position: absolute;
      top: 470px;
      left: 25px;
    }


  </style>

The following JavaScript loop from the page’s script matches label timestamps to the current playback window, writes the labels as hyperlinks, and draws the bounding boxes:

for (i = 0; i < JSON.length; i++) {
          var element = JSON[i]
          if (element.Timestamp > minTime && element.Timestamp < maxTime) {
            if (labelElement.innerHTML != "") {
              var newLabel = element.Label.Name.link("https://www.amazon.com/s?k=" + element.Label.Name);
              labelElement.innerHTML = labelElement.innerHTML + ", " + newLabel;
            }
            else {
              labelElement.innerHTML = element.Label.Name.link("https://www.amazon.com/s?k=" + element.Label.Name);
            }
            if (element.Label.Instances.length > 0) {
              for (j = 0; j < element.Label.Instances.length; j++) {
                var box = element.Label.Instances[j].BoundingBox
                box.Width = Math.round(box.Width * vWidth)
                box.Height = Math.round(box.Height * vHeight)
                box.Top = Math.round(box.Top * vHeight) 
                box.Left = Math.round(box.Left * vWidth)
                ctx.strokeRect(box.Left, box.Top, box.Width, box.Height)
                console.log(box)
              }
            }

Cleanup steps:

a. Delete the Lambda functions that were created in the earlier step:

i. Navigate to Lambda in the AWS Console. Search for the Lambda function by name. Select the function and choose Delete.

b. Delete the API that was created earlier in API Gateway:

i. Navigate to API Gateway. Locate the API. Choose Delete.

c. Delete the CloudFront distribution:

i. Navigate to CloudFront. Select the CloudFront distribution that was created earlier. Select Delete.

d. Delete the S3 bucket.

i. Navigate to the S3 bucket. Select the bucket. Select Empty. When the object deletion is complete, select the bucket again, and choose Delete.

e. Delete the SNS topics that were created earlier:

i. Go to SNS. Navigate to Topics. Find the topics created earlier. Select Delete.

APPENDIX – A: JSON Files

All Index JSON file:
This file indexes the video files as they are added to S3, and includes paths to the video file, GIF file, and labels file.

Screenshot of file indexes when added to S3
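The structure is similar to the following sketch (rawvideopath is the key used by Lambda Function 2; the other field names are representative assumptions):

[
  {
    "rawvideopath": "input-files/myvideo.mp4",
    "labelspath": "labels/myvideo.mp4.json",
    "gifpath": "thumbs/gif/myvideo/preview.gif",
    "videotitle": "myvideo"
  }
]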

Extracted Labels JSON file:
The following snippet shows the JSON file produced as output of the Rekognition Video job.

Screenshot showing JSON file snippet for the Labels from the processed video.

An example of a label in the demo is Laptop; the following snippet from the JSON file shows its construct. Key attributes include the timestamp, the label name, the confidence (we configured label extraction to take place only for confidence exceeding 75%), and the bounding box coordinates.

Screenshot showing the construct of the label in the JSON file. highlights Name: "Laptop" and other aspects such as confidence and bounding box coordinates.
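A single entry returned by Rekognition Video’s GetLabelDetection API is shaped like the following (values are illustrative):

{
  "Timestamp": 16000,
  "Label": {
    "Name": "Laptop",
    "Confidence": 87.5,
    "Instances": [
      {
        "BoundingBox": {
          "Width": 0.22,
          "Height": 0.31,
          "Left": 0.41,
          "Top": 0.52
        },
        "Confidence": 87.5
      }
    ],
    "Parents": [
      { "Name": "Computer" },
      { "Name": "Electronics" }
    ]
  }
}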

 

 

Noor Hassan

Noor Hassan - Sr. Partner SA - Toronto, Canada. Background in Media Broadcast - focus on media contribution and distribution, and passion for AI/ML in the media space. Outside of work I enjoy travel, photography, and spending time with loved ones.

Daniel Duplessis

Daniel Duplessis is a Senior Partner Solutions Architect, based out of Toronto. His technical focus areas are Machine Learning and Serverless. Outside of work he likes to play racquet sports, travel and go on hikes with his family.