AWS Compute Blog

Extracting Video Metadata using Lambda and MediaInfo

Michael Raposa
Principal Consultant

For a video asset, technical metadata includes information such as video codec, audio codec, resolution, frame rate, aspect ratio, and a host of other detailed technical attributes. With technical metadata, customers can make intelligent decisions about what to do next in their workflow. The asset may be transcoded to a standard “house” format, resized, or put through a quality check to make sure it is in an acceptable format. The technical metadata may also be stored for indexing and fast retrieval in a database service like Amazon DynamoDB or Amazon RDS.

When customers work with media assets like video and audio files on Amazon S3, a typical workflow emerges: assets are uploaded to S3, the upload triggers an S3 event, and that event invokes an AWS Lambda function, which can extract technical metadata from the asset on S3.

In this post, I walk you through the process of setting up a workflow to extract technical metadata from multimedia files uploaded to S3 using Lambda.

Extraction process

Extracting the technical metadata from the asset on S3 is the tricky part of this workflow. You need a way to seamlessly download the minimum number of bytes from the file in order to extract the technical metadata. If you download the entire file, you run into issues with either the limited temp space available to Lambda or the 300-second timeout limit on long-running functions.

Customers have come up with inventive ways to get around these problems. One typical solution downloads just the first few bytes from the head and tail of the file. To accomplish this limited download, customers use an S3 feature called ranged GETs, which allows you to specify the range of bytes of a file to download. The downloaded bytes are then concatenated and saved to the Lambda temp storage, and a tool like ffprobe is run against the concatenated bytes to extract the technical metadata.
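For illustration, here is a minimal sketch of that ranged-GET approach in Python with boto3. The byte ranges are hypothetical; the right values depend on the container format of the file.

# A sketch of the ranged-GET approach (hypothetical byte ranges)
import boto3

s3 = boto3.client("s3")
bucket, key = "YOUR_BUCKET", "test-video.mov"

# Find the object size, then fetch the first and last 1 MB
size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
head = s3.get_object(Bucket=bucket, Key=key,
                     Range="bytes=0-1048575")["Body"].read()
tail = s3.get_object(Bucket=bucket, Key=key,
                     Range="bytes={}-{}".format(size - 1048576, size - 1))["Body"].read()

# Concatenate and save to Lambda temp storage for a tool like ffprobe
with open("/tmp/probe-input", "wb") as f:
    f.write(head + tail)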

There are several issues with this process. It’s not immediately apparent which specific bytes to download from the head and the tail of the file. Also, many customers prefer not to store their assets on the local temporary storage available with Lambda. Finally, you may prefer to do all of the processing in memory, which decreases Lambda processing time and hence reduces your costs.

MediaInfo

A more elegant and less brittle solution uses MediaInfo, a popular free and open-source program that extracts technical metadata from a wide range of audio and video files.

Using MediaInfo has several significant benefits:

  • Curl functionality. You can pass a simple URL as an input parameter, and MediaInfo downloads only the information it needs from the file on S3 to extract the technical metadata. For example, MediaInfo processed a 1.6 GB video file in 2.5 seconds, and the entire function used only 46 MB of memory. Only the necessary parts of the file are downloaded and processed in memory; nothing is saved to disk.

  • An abundance of technical metadata information about the asset.
  • XML export format for easier programmatic processing.
  • Static compilation to include all dependencies into a single executable that runs in Lambda functions. In this post, I walk through how to compile MediaInfo for Lambda.

Solution overview

The new workflow looks like the following diagram:

In this workflow:

  1. A multimedia file is uploaded to an S3 bucket.
  2. Using S3 event notifications, S3 triggers a Lambda function.
  3. The LambdaMediaInfo Lambda function, which you create below, is executed.
  4. The Lambda function generates an S3 signed URL, which is passed as an input to MediaInfo. MediaInfo downloads only the bytes of the multimedia file on S3 that it needs to extract the technical metadata.
  5. Using MediaInfo, the Lambda function then extracts the technical metadata from the multimedia file. The technical metadata is then stored in DynamoDB in XML format.

The rest of this post shows the steps necessary to create this workflow on AWS.

Step 1: Compile MediaInfo for Amazon Linux

To start, you need an instance running the same version of Amazon Linux as used by AWS Lambda. You can find the AMI for this version and for your region at Lambda Execution Environment and Available Libraries.

This instance is used only to compile MediaInfo and is terminated as soon as you finish this step. Any instance type is fine, but compiling is fairly CPU-intensive, so you may want to launch at least a t2.medium. On a t2.medium, MediaInfo compiles in about 2 minutes.

Here is a sample command to launch an instance in US East (N. Virginia):

aws ec2 run-instances \
    --image-id ami-60b6c60a \
    --count 1 \
    --instance-type t2.medium \
    --key-name YourKeyPair \
    --security-group-ids sg-xxxxxxxx \
    --subnet-id subnet-xxxxxxxx

After you have launched your instance, SSH into it. For more information, see Getting Started with Amazon EC2 Linux Instances.

After you connect to the instance, execute the following commands. These commands install the necessary libraries to compile MediaInfo, download and extract the MediaInfo source code, and finally compile MediaInfo into a static binary.

# Install Development Tools necessary to compile MediaInfo
sudo yum groupinstall 'Development Tools'
# Install library required to add CURL support to Mediainfo
sudo yum install libcurl-devel

# Download MediaInfo
wget http://mediaarea.net/download/binary/mediainfo/0.7.84/MediaInfo_CLI_0.7.84_GNU_FromSource.tar.xz
# Untar MediaInfo
tar xvf MediaInfo_CLI_0.7.84_GNU_FromSource.tar.xz
cd MediaInfo_CLI_GNU_FromSource

# Compile MediaInfo with Support for URL Inputs
./CLI_Compile.sh --with-libcurl

Notice that version 0.7.84 of MediaInfo is being downloaded. That is the latest version as of this post. You can find the current source code at Download – Sources on the MediaInfo website.

MediaInfo is compiled with the “--with-libcurl” option. This option adds URL support to MediaInfo and allows you to pass a URL as an input. This is important later when it is time to access files on S3.

After the compile is complete, a static, standalone version of MediaInfo is created. Run the following two commands to confirm successful compilation:

cd MediaInfo/Project/GNU/CLI

./mediainfo --version

You should get an output that looks like the following:

MediaInfo Command line,
MediaInfoLib - v0.7.84

You now need to copy the MediaInfo executable off the instance and down to your local workstation. Later, you bundle this executable with your Lambda function. You can use any convenient method to download it, such as scp, FTP, or copying through S3.

Step 2: Create the DynamoDB table

You create a DynamoDB table to store the technical metadata from the asset. The keyName value, which is the S3 key of the asset, is used as a HASH key. The table name is TechnicalMetadata.

To create the table, execute the following:

aws dynamodb create-table \
    --table-name TechnicalMetadata \
    --attribute-definitions \
        AttributeName=keyName,AttributeType=S \
    --key-schema AttributeName=keyName,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1

Note the “TableArn” value output by this command, as you need that value in a later step.
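If you need to look up the ARN again later, you can retrieve it with describe-table. Here is a quick sketch using boto3:

# Retrieve the ARN of the TechnicalMetadata table
import boto3

dynamodb = boto3.client("dynamodb")
table_arn = dynamodb.describe_table(TableName="TechnicalMetadata")["Table"]["TableArn"]
print(table_arn)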

Step 3: Write the Lambda function

Here is the Lambda function written in Python that you use:

import logging
import subprocess
import urllib

import boto3

SIGNED_URL_EXPIRATION = 300     # The number of seconds that the Signed URL is valid
DYNAMODB_TABLE_NAME = "TechnicalMetadata"
DYNAMO = boto3.resource("dynamodb")
TABLE = DYNAMO.Table(DYNAMODB_TABLE_NAME)

logger = logging.getLogger('boto3')
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """
    Process each S3 record passed in by the S3 event trigger.

    :param event:   S3 event notification payload
    :param context: Lambda context object
    """
    # Loop through records provided by S3 Event trigger
    for s3_record in event['Records']:
        logger.info("Working on new s3_record...")
        # Extract the Key and Bucket names for the asset uploaded to S3
        # Note: S3 URL-encodes the key in the event (e.g., spaces become '+')
        key = urllib.unquote_plus(s3_record['s3']['object']['key'].encode('utf8'))
        bucket = s3_record['s3']['bucket']['name']
        logger.info("Bucket: {} \t Key: {}".format(bucket, key))
        # Generate a signed URL for the uploaded asset
        signed_url = get_signed_url(SIGNED_URL_EXPIRATION, bucket, key)
        logger.info("Signed URL: {}".format(signed_url))
        # Launch MediaInfo
        # Pass the signed URL of the uploaded asset to MediaInfo as an input
        # MediaInfo will extract the technical metadata from the asset
        # The extracted metadata will be outputted in XML format and
        # stored in the variable xml_output
        xml_output = subprocess.check_output(["./mediainfo", "--full", "--output=XML", signed_url])
        logger.info("Output: {}".format(xml_output))
        save_record(key, xml_output)

def save_record(key, xml_output):
    """
    Save record to DynamoDB

    :param key:         S3 Key Name
    :param xml_output:  Technical Metadata in XML Format
    :return:
    """
    logger.info("Saving record to DynamoDB...")
    TABLE.put_item(
        Item={
            'keyName': key,
            'technicalMetadata': xml_output
        }
    )
    logger.info("Saved record to DynamoDB")


def get_signed_url(expires_in, bucket, obj):
    """
    Generate a signed URL
    :param expires_in:  URL Expiration time in seconds
    :param bucket:
    :param obj:         S3 Key name
    :return:            Signed URL
    """
    s3_cli = boto3.client("s3")
    presigned_url = s3_cli.generate_presigned_url('get_object', Params={'Bucket': bucket, 'Key': obj},
                                                  ExpiresIn=expires_in)
    return presigned_url

The Lambda function starts by executing the lambda_handler function. The handler loops through records that have been passed to the function by S3. Each record represents a file upload to S3. From each record, the function extracts the bucket and key.

Then the function get_signed_url is called with the bucket and key, and generates a signed URL for the file on S3. A signed URL grants MediaInfo temporary, secure access to the file on S3. For more information, see Share an Object with Others.

The signed URL is passed to MediaInfo, which then downloads only the bytes necessary to extract the technical metadata from the file. All of the metadata is output as XML, which can be used for further processing.
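For example, a downstream step could pull individual fields out of that XML with the Python standard library. Here is a minimal sketch, assuming xml_output holds the string produced by the function above:

# Parse selected fields out of the MediaInfo XML output
import xml.etree.ElementTree as ET

root = ET.fromstring(xml_output)
for track in root.iter("track"):
    if track.get("type") == "General":
        print("Format: {}".format(track.findtext("Format")))
        print("Duration: {}".format(track.findtext("Duration")))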

Lambda calls save_record to store the technical metadata in DynamoDB. Storing the data in DynamoDB makes future queries and searches on the data much easier.
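Retrieving the stored metadata for a given asset later takes a single call. A sketch using boto3:

# Look up the technical metadata for an asset by its S3 key
import boto3

table = boto3.resource("dynamodb").Table("TechnicalMetadata")
response = table.get_item(Key={'keyName': 'test-video2.mov'})
print(response['Item']['technicalMetadata'])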

Step 4: Deploy the Lambda function

Save the code above as lambda_function.py. Then zip that file AND the mediainfo executable into Lambda.zip. On OS X, the command is the following:

zip Lambda lambda_function.py mediainfo

The command on your operating system may be different.
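One caveat: the mediainfo binary must remain executable inside the zip, or Lambda cannot run it. The zip command above preserves file permissions, but if you build the archive programmatically, you need to set them yourself. Here is a sketch using Python's zipfile module:

# Build Lambda.zip, preserving the executable bit on the mediainfo binary
import zipfile

def add_file(archive, filename, mode):
    with open(filename, 'rb') as f:
        info = zipfile.ZipInfo(filename)
        info.external_attr = mode << 16  # store unix permissions in the entry
        archive.writestr(info, f.read())

with zipfile.ZipFile('Lambda.zip', 'w', zipfile.ZIP_DEFLATED) as archive:
    add_file(archive, 'lambda_function.py', 0o644)
    add_file(archive, 'mediainfo', 0o755)  # must be executable in Lambda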

Next, create the execution role that Lambda uses when it runs. First, create the actual role:

aws iam create-role \
    --role-name LambdaMediaInfoExecutionRole \
    --assume-role-policy-document file://lambda_trust_policy.json

This command references a trust policy document that allows Lambda to assume the role. Before running the command, put the following trust policy document in a file named lambda_trust_policy.json.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Next, create an access policy and attach it to the role. In the attach-role-policy command, change the policy-arn value to the ARN of LambdaMediaInfoExecutionRolePolicy, which the create-policy command outputs.

aws iam create-policy \
    --policy-name LambdaMediaInfoExecutionRolePolicy \
    --policy-document file://lambda_access_policy.json

aws iam attach-role-policy \
    --role-name LambdaMediaInfoExecutionRole \
    --policy-arn XXXXXXXXXXXX

This access policy gives the Lambda function the minimum rights required: storing Lambda logs to CloudWatch Logs, reading files from a specific S3 bucket, and storing the technical metadata in the DynamoDB table created in Step 2. Before running the create-policy command, put the following access policy document in a file named lambda_access_policy.json.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Action": [
              "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::YOUR_BUCKET/*"
        },
        {
            "Sid": "PutUpdateDeleteOnCrossAccountAuditing",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:DeleteItem"
            ],
            "Resource": "DYNAMODB_ARN"
        }
    ]
}

In this document, change “YOUR_BUCKET” to the bucket where you store the files for MediaInfo to process, and change “DYNAMODB_ARN” to the ARN of the DynamoDB table created in Step 2.

Finally, you can create the Lambda function. Change the “role” parameter to the ARN of the LambdaMediaInfoExecutionRole role.

aws lambda create-function \
    --function-name LambdaMediaInfo \
    --runtime python2.7 \
    --role XXXXXXXXXXXX \
    --handler lambda_function.lambda_handler \
    --description "Lambda MediaInfo Function" \
    --timeout 60 \
    --memory-size 128 \
    --zip-file fileb://Lambda.zip

Step 5: Configure S3 to trigger the Lambda function

You need to configure S3 to trigger the Lambda function whenever a new media asset is uploaded to the bucket.

The first step is to add permission to Lambda so that S3 can invoke the function.

aws lambda add-permission \
    --function-name LambdaMediaInfo \
    --statement-id Id-1 \
    --action "lambda:InvokeFunction" \
    --principal s3.amazonaws.com \
    --source-arn arn:aws:s3:::YOUR_BUCKET \
    --source-account YOUR_ACCOUNT_NUMBER

Replace “YOUR_BUCKET” with the bucket that holds your media assets and “YOUR_ACCOUNT_NUMBER” with your twelve-digit AWS account number.

Next, add a notification to the S3 bucket that triggers the Lambda function. Again, replace “YOUR_BUCKET” with the bucket that holds your media assets.

aws s3api put-bucket-notification \
    --bucket YOUR_BUCKET \
    --notification-configuration file://notification.json

Put the following notification configuration document in a file named notification.json.

{
  "CloudFunctionConfiguration": {
    "Id": "ObjectCreatedEvents",
    "Events": [ "s3:ObjectCreated:*" ],
    "CloudFunction": "LAMBDA_ARN"
  }
}

You need to replace “LAMBDA_ARN” with the ARN of the LambdaMediaInfo function. This notification configuration triggers the Lambda function whenever a file is created in the S3 bucket.

Step 6: Test

To test, upload a video file to the S3 bucket. S3 then triggers the Lambda function and MediaInfo extracts the technical metadata.
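You can upload through the AWS Management Console, the AWS CLI, or a few lines of boto3, as in the following sketch (the bucket and file names are placeholders):

# Upload a test file to trigger the workflow
import boto3

s3 = boto3.client("s3")
s3.upload_file("test-video2.mov", "YOUR_BUCKET", "test-video2.mov")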

To confirm that the function is working as expected, you can view the CloudWatch logs for the Lambda function.
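You can view them in the CloudWatch console, or pull recent events programmatically. Here is a sketch using boto3, assuming the default log group name that Lambda creates for the function:

# Fetch recent log events for the LambdaMediaInfo function
import boto3

logs = boto3.client("logs")
events = logs.filter_log_events(
    logGroupName="/aws/lambda/LambdaMediaInfo", limit=20)
for event in events["events"]:
    print(event["message"])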

The following is an excerpt of the logs for a 1.65 GB movie file called test-video2.mov, which includes the technical metadata in XML format extracted by MediaInfo. For brevity, much of the metadata is truncated. However, you can see that the video file is in MPEG-4/AVC format, is 57 minutes and 41 seconds long, and has a 16:10 aspect ratio.

Also, notice that the Lambda function took only 2.5 seconds and 46 MB of RAM to process the entire 1.65 GB file. MediaInfo achieves this fast processing time and low memory utilization by downloading only parts of the video file rather than the entire file. The reduction in time and memory translates directly into a reduction in Lambda costs.

[INFO] 2016-04-18T20:12:13.74Z  Working on new s3_record...
[INFO] 2016-04-18T20:12:13.74Z  Bucket: My.Bucket Key: test-video2.mov
[INFO] 2016-04-18T20:12:13.165Z Signed URL: https://s3.amazonaws.com/My.Bucket/test-video2.mov?AWSAccessKeyId=ASIAIS5OGWVK2HTNCS6Q&Expires=1461010633&x-amz-security-token=SAMPLE&Signature=SAMPLE%3D
[INFO] 2016-04-18T20:12:15.524Z Output:
<?xml version="1.0" encoding="UTF-8"?>
<Mediainfo version="0.1"
           ref="https://s3.amazonaws.com/My.Bucket/test-video2.mov?AWSAccessKeyId=ASIAIS5OGWVK2HTNCS6Q&amp;Expires=1461010633&amp;x-amz-security-token=SAMPLE&amp;Signature=SAMPLE%3D">
    <File>
        <track type="General">
            …Truncated… 
            <Format>MPEG-4</Format>
            …Truncated…        
            <Duration>57mn 41s</Duration>
            …Truncated… 
            <Display_aspect_ratio>16:10</Display_aspect_ratio>
            <Rotation>0.000</Rotation>
            <Frame_rate_mode>VFR</Frame_rate_mode>
            <Frame_rate_mode>Variable</Frame_rate_mode>
            <Frame_rate>35.139</Frame_rate>
            <Frame_rate>35.139 fps</Frame_rate>
            <Minimum_frame_rate>5.000</Minimum_frame_rate>
            <Minimum_frame_rate>5.000 fps</Minimum_frame_rate>
            <Maximum_frame_rate>60.000</Maximum_frame_rate>
            <Maximum_frame_rate>60.000 fps</Maximum_frame_rate>
            <Original_frame_rate>25.000</Original_frame_rate>
            <Original_frame_rate>25.000 fps</Original_frame_rate>
            <Frame_count>121645</Frame_count>
            <Resolution>8</Resolution>
            <Resolution>8 bits</Resolution>
            <Colorimetry>4:2:0</Colorimetry>
            <Color_space>YUV</Color_space>
            <Chroma_subsampling>4:2:0</Chroma_subsampling>
            <Chroma_subsampling>4:2:0</Chroma_subsampling>
            <Bit_depth>8</Bit_depth>
            <Bit_depth>8 bits</Bit_depth>
            <Scan_type>Progressive</Scan_type>
            <Scan_type>Progressive</Scan_type>
            <Interlacement>PPF</Interlacement>
            <Interlacement>Progressive</Interlacement>
            <Bits__Pixel_Frame_>0.028</Bits__Pixel_Frame_>
            <Stream_size>1765037794</Stream_size>
            <Stream_size>1.64 GiB (100%)</Stream_size>
            <Stream_size>2 GiB</Stream_size>
            <Stream_size>1.6 GiB</Stream_size>
            <Stream_size>1.64 GiB</Stream_size>
            <Stream_size>1.644 GiB</Stream_size>
            <Stream_size>1.64 GiB (100%)</Stream_size>
            <Proportion_of_this_stream>0.99876</Proportion_of_this_stream>
            <Title>Core Media Video</Title>
            <Encoded_date>UTC 2015-05-18 19:34:58</Encoded_date>
            <Tagged_date>UTC 2015-05-18 19:35:02</Tagged_date>
            <Buffer_size>768000</Buffer_size>
            <Color_range>Limited</Color_range>
            <colour_description_present>Yes</colour_description_present>
            <Color_primaries>BT.709</Color_primaries>
            <Transfer_characteristics>BT.709</Transfer_characteristics>
            <Matrix_coefficients>BT.709</Matrix_coefficients>
        </track>
    </File>
</Mediainfo>
END RequestId: d696f655-05a1-11e6-8fa7-976f9c7f7486
REPORT RequestId: d696f655-05a1-11e6-8fa7-976f9c7f7486 Duration: 2451.40 ms   Billed Duration: 2500 ms Memory Size: 128 MB   Max Memory Used: 46 MB

Finally, here is a screenshot of the technical metadata being saved to DynamoDB. Note the partial technical metadata XML stored in the technicalMetadata attribute.

Conclusion

In this post, I replaced a cumbersome and brittle workflow for extracting technical metadata from media files with a more elegant solution using MediaInfo. In addition, the technical metadata is stored in DynamoDB for quick retrieval and searching. You can use this metadata to make intelligent decisions in subsequent downstream workflow steps, such as asset transcoding, image resizing, and general file verification and quality control.

To learn more, see Configuring S3 Event Notifications and Using Lambda with S3. If you have comments or suggestions, please submit them below.